notes
Arabic letters on screen
Recently I've decided to have a go at learning to read and write Arabic. I plan to use the weblog as a language learning tool, most likely as a kind of notebook. Over the last couple of days I've spent some time discovering ways of presenting Arabic script via HTML.
Wikipedia was my first port of call. I came across handy articles about the Arabic alphabet and Unicode codes. These are great but only provide presentation information about the 'standalone' forms of the letters. Most Arabic letters have four forms: initial, medial and final as well as the standalone form. These pages pointed me to the Unicode site itself, which is full of comprehensive reference material including http://www.unicode.org/charts/. There I discovered detailed lists of characters for every conceivable form, including the most obscure ligatures. Excellent. These were accompanied by hex codes and a link to JavaScript Unicode Charts. There you can enter a hex code and confirm the character (or browse around), then convert it to decimal form for inclusion in the standard HTML &#xxxx; entity format.
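As a rough sketch of the same lookup done offline, Python's standard unicodedata module can find the four forms of a letter by their chart names. The function name presentation_forms is my own invention, and it assumes the charts' naming pattern "ARABIC LETTER <name> <form> FORM", where the standalone form is called ISOLATED and shiin is spelled SHEEN.

```python
import unicodedata

# The four contextual forms, using the names that appear in the Unicode charts.
FORMS = ("ISOLATED", "FINAL", "INITIAL", "MEDIAL")


def presentation_forms(letter):
    """Map each form name to the presentation-form character for one letter.

    `letter` is the letter's Unicode name, e.g. "SHEEN". Forms that do not
    exist for a letter are simply left out of the result.
    """
    found = {}
    for form in FORMS:
        try:
            found[form] = unicodedata.lookup(f"ARABIC LETTER {letter} {form} FORM")
        except KeyError:
            pass  # some letters, e.g. ALEF, have no initial or medial form
    return found


if __name__ == "__main__":
    # Print only hex and decimal codes so the output is safe on any terminal.
    for form, char in presentation_forms("SHEEN").items():
        print(f"{form:<8} hex {ord(char):04X}  decimal {ord(char)}")
```

Running it lists the hex and decimal code for each form of the letter, which is handy for double-checking what a chart says.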
For example, one of the charts indicated that FEB8 equates to the 'medial' form of shiin, i.e. the form used in the middle of a word. Enter the hex code at the macchiato site and hit return. Then clicking the base 10 box immediately changes the number to decimal, yielding 65208. Entering that right here in the Radio WYSIWYG editing box, in source mode, as &#65208; produces:
ﺸ (if you make sure the font is a Unicode one and you pump up the point size.)
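The hex-to-decimal step can also be done without the web page. Here is a minimal sketch in Python; hex_to_entity and entity_to_hex are just names I made up for illustration, and the second one is only there as a round-trip check using the standard html module.

```python
import html


def hex_to_entity(hex_code):
    """Turn a chart hex code such as "FEB8" into a decimal &#...; reference."""
    decimal = int(hex_code, 16)  # base 16 -> base 10
    return f"&#{decimal};"


def entity_to_hex(entity):
    """Decode a single &#...; reference back to its four-digit hex code."""
    char = html.unescape(entity)
    return f"{ord(char):04X}"


if __name__ == "__main__":
    entity = hex_to_entity("FEB8")
    print(entity)                 # &#65208;
    print(entity_to_hex(entity))  # FEB8
```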
The next challenge is to get the text direction right, that is, right to left. Apparently the best way to do that is with a SPAN tag using the dir=rtl and lang=ar attributes. I'll try this out tomorrow.
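A sketch of what that markup might look like, generated from Python so the characters go out in the decimal &#xxxx; format. The helper name arabic_span is hypothetical, and I have not yet confirmed how the browser or Radio treats the result.

```python
def arabic_span(text):
    """Wrap text in a right-to-left SPAN, writing each character as &#...;."""
    entities = "".join(f"&#{ord(char)};" for char in text)
    return f'<span dir="rtl" lang="ar">{entities}</span>'


if __name__ == "__main__":
    # "\uFEB8" is the hex code FEB8 from the example above.
    print(arabic_span("\uFEB8"))  # <span dir="rtl" lang="ar">&#65208;</span>
```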
Update: another excellent resource on Unicode topics in general is Alan Wood's collection of Unicode resources.


