Tuesday, June 23, 2009

Working with ISO 639-1 language codes in ColdFusion

Recently, I've been working on a project that uses the ISO 639-1 language codes (example: en = English, es = Spanish, etc.) to filter the results of a search. Everything works fine, until someone accidentally provides an invalid language code and no results are returned.

It's a subtle bitch of an error, when you think about it. Someone believes they are searching for any item written in Spanish that mentions the movie title Fight Club, but they've accidentally typed the nonexistent language code "ed" instead of "es." No results are found because there is no "ed" language, and our user walks away with the mistaken impression that Hispanics have nothing of interest to say about Fight Club.

(I know this example seems silly, but there's a convention of using movie titles as examples in ColdFusion training. Try replacing "Fight Club" with "Election" and "Hispanics" with "Iranians." Doesn't seem nearly as silly now, does it?)

Clearly, if someone provides an invalid language code for a search, we want to notify them there is a problem. Then they can correct the language code and perform the search again. This problem actually consists of two parts: determining if the provided language code is valid, and then how to best notify the user if it is not.

Since the first part of the problem is more interesting to me, and far better coders have written fantastic articles and blog entries on error handling in ColdFusion, I'm going to focus on how best to determine whether a provided language code is valid. (For the "UI expert" who reads this and says it's a non-issue because I should just display the language codes in a select/option list: what if I don't have exclusive control over the interface, because it's a web service or a component with remotely accessible methods and third-party developers are creating their own clients for it?)

Here's my first stab at code that compares ARGUMENTS.lang (i.e. the language code provided by the user) against the official letter codes:
<!--- first we need a list of the various ISO 639-1 language letter codes --->
<!--- the pieces are joined with & so the line breaks don't end up inside the list items --->
<cfset VARIABLES.ListISOCodes = "aa,ab,ae,af,ak,am,an,ar,as,av,ay,az,ba,be,bg,bh,bi,bm,bn,bo,br,bs,ca,ce,ch,co,cr,"
& "cs,cu,cv,cy,da,de,dv,dz,ee,el,en,eo,es,et,eu,fa,ff,fi,fj,fo,fr,fy,ga,gd,gl,gn,gu,gv,ha,"
& "he,hi,ho,hr,ht,hu,hy,hz,ia,id,ie,ig,ii,ik,io,is,it,iu,ja,jv,ka,kg,ki,kj,kk,kl,km,kn,ko,kr,"
& "ks,ku,kv,kw,ky,la,lb,lg,li,ln,lo,lt,lu,lv,mg,mh,mi,mk,ml,mn,mr,ms,mt,my,na,nb,nd,"
& "ne,ng,nl,nn,no,nr,nv,ny,oc,oj,om,or,os,pa,pi,pl,ps,pt,qu,rm,rn,ro,ru,rw,sa,sc,sd,se,"
& "sg,si,sk,sl,sm,sn,so,sq,sr,ss,st,su,sv,sw,ta,te,tg,th,ti,tk,tl,tn,to,tr,ts,tt,tw,ty,ug,uk,"
& "ur,uz,ve,vi,vo,wa,wo,xh,yi,yo,za,zh,zu">

<!--- checking to see if the optional ISO language code parameter was passed --->
<cfif isDefined("ARGUMENTS.lang")>

    <!--- if so, we use ListFind to search for provided lang code in our list --->
    <!--- if ListFind returns 0 then we know it didn't find the specified code --->
    <cfif (listFind(VARIABLES.ListISOCodes,ARGUMENTS.lang) IS 0)>

        <!--- this is where we throw an error, or set our returnvariable to some warning flag/message --->

    </cfif>
</cfif>
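
For what it's worth, the placeholder comment inside the inner cfif could be filled in with a cfthrow. What follows is only a rough sketch, not the project's actual code: the function name assertValidLang and the exception type InvalidLanguageCode are names I made up for illustration, and the short list in the usage part stands in for the full list of codes.

```cfml
<!--- hypothetical helper: the function name and exception type are illustrative assumptions --->
<cffunction name="assertValidLang" returntype="void" output="false">
    <cfargument name="lang" type="string" required="true">
    <cfargument name="validCodes" type="string" required="true">

    <!--- listFind returns 0 when the code is not an item in the list --->
    <cfif listFind(ARGUMENTS.validCodes, ARGUMENTS.lang) IS 0>
        <cfthrow type="InvalidLanguageCode"
            message="Unrecognized ISO 639-1 language code: #ARGUMENTS.lang#"
            detail="Expected a two-letter code such as en or es.">
    </cfif>
</cffunction>

<!--- usage: the caller catches the custom type and decides how to report it --->
<cftry>
    <cfset assertValidLang("ed", "en,es,fr")>
    <cfcatch type="InvalidLanguageCode">
        <!--- escape the message because it echoes user-supplied input --->
        <cfoutput>#htmlEditFormat(cfcatch.message)#</cfoutput>
    </cfcatch>
</cftry>
```

Throwing a typed exception leaves the "how do we notify the user" decision to whoever calls the method, which seems to fit the web service scenario where I don't control the client.
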
Sure, it gets the job done. But I think I can do better.