I am currently working on the internationalization of 24SevenOffice.com. We want to support languages like Chinese and Hungarian, and even right-to-left languages like Arabic. To do so we must use Unicode. On the web, the most common Unicode encoding is UTF-8. For a good introduction to Unicode, UTF-8 and other character sets, read ‘The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)’ by Joel Spolsky. Also check out the Unicode site and I18n Guy.
In ASP and HTML there are a couple of things we must do to serve up UTF-8:
ASP CODE:
Response.ContentType = "text/html"
Response.AddHeader "Content-Type", "text/html;charset=UTF-8"
Response.CodePage = 65001
Response.CharSet = "UTF-8"
and the following HTML META tag:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
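To see why the charset declaration matters, here is a minimal sketch in Python (not ASP; chosen only because it is easy to run). It assumes a client that ignores the declared charset and falls back to Latin-1. The helper name decode_as is made up for this illustration.

```python
# Illustration only: what a client sees when it decodes UTF-8 bytes
# with the wrong charset. decode_as is a hypothetical helper name.

def decode_as(data: bytes, charset: str) -> str:
    """Decode raw response bytes the way a client assuming `charset` would."""
    return data.decode(charset)

text = "é 中"                        # sample text, made up for this sketch
utf8_bytes = text.encode("utf-8")   # what the server sends on the wire

# Client honours "charset=UTF-8": the text survives intact.
print(decode_as(utf8_bytes, "utf-8"))

# Client wrongly assumes Latin-1: every multi-byte character is garbled.
print(decode_as(utf8_bytes, "latin-1"))
```

The bytes on the wire are identical in both cases; only the declared charset tells the client how to read them.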
We are using Microsoft SQL Server as our database. While it stores data as UCS-2 (Unicode) in Unicode columns (nchar/nvarchar/ntext), I have encountered problems with saving data in Chinese. It seems I have to put the N prefix in front of every Unicode string value, i.e. UPDATE Table SET Field = N'Unicode Value';. I am still looking into this issue. I really hope I don’t need to do this; if I do, I must say I am very disappointed with SQL Server. With Oracle this would not have been a problem.
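The likely cause is that a string literal without the N prefix is treated as non-Unicode and converted to the database's default code page, so characters outside that code page are lost. The Python sketch below only imitates that effect. It assumes a Windows-1252 code page, and both function names are hypothetical; it is not how SQL Server is implemented.

```python
# Illustration only: Python's cp1252 codec stands in for a non-Unicode
# database code page. Both function names are hypothetical.

def store_without_n_prefix(value: str, code_page: str = "cp1252") -> str:
    """Imitate a plain '...' literal being squeezed through a code page."""
    # Characters the code page cannot represent are replaced with "?".
    return value.encode(code_page, errors="replace").decode(code_page)

def store_with_n_prefix(value: str) -> str:
    """Imitate an N'...' literal, which stays in a UTF-16 style encoding."""
    return value.encode("utf-16-le").decode("utf-16-le")

chinese = "中文"
print(store_without_n_prefix(chinese))  # the Chinese characters are replaced
print(store_with_n_prefix(chinese))     # the Chinese characters survive
```

Plain ASCII text passes through both paths unchanged, which is why the problem only shows up once non-Western text is saved.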
For right-to-left languages we can use the CSS direction property (for example, direction: rtl). You can also read more about right-to-left text in markup languages here.