Web i18n
What is Internationalization?
- W3C definition:
Internationalization (I18N) is the design and development of a product, application or document content that enables easy localization for target audiences that vary in culture, region, or language.
Localization
- W3C Definition, again:
Localization (L10N) refers to the adaptation of a product, application or document content to meet the language, cultural and other requirements of a specific target market (a locale).
Unicode-Unicode-Unicode
- Actually, UTF-8
- Even if you only care about English
- Always declare your encoding: <meta charset='utf-8'>
- But avoid BOM prologs where possible
- http://www.w3.org/International/getting-started/characters
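A minimal sketch of the BOM advice, assuming the content passes through Python on the server side. The helper names `decode_utf8` and `encode_page` are made up for illustration; the codecs themselves are standard library.

```python
import codecs


def decode_utf8(raw: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading BOM if one is present."""
    # 'utf-8-sig' strips the BOM when it is there and otherwise
    # behaves exactly like 'utf-8'.
    return raw.decode("utf-8-sig")


def encode_page(text: str) -> bytes:
    """Encode as plain UTF-8; Python's 'utf-8' codec never writes a BOM."""
    return text.encode("utf-8")


# A file saved by an editor that adds a BOM prolog
with_bom = codecs.BOM_UTF8 + "<meta charset='utf-8'>".encode("utf-8")
page = decode_utf8(with_bom)
assert page == "<meta charset='utf-8'>"
assert not encode_page(page).startswith(codecs.BOM_UTF8)
```

Reading with `utf-8-sig` and writing with `utf-8` accepts input from either kind of editor while keeping BOMs out of what gets served.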
Regional Styles
- Date formats (MDY FTW?)
- Time formats
- Time zones
- Number formats and currencies €1.000.000,00
- Inches, Feet, Yards
- Numbering systems
- http://www.w3.org/International/questions/qa-date-format
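To make the point concrete, here is a rough sketch of how the same number and the same date change between regions. The `CONVENTIONS` table is hand-written for illustration and covers three locales only; a real application would more likely take these rules from CLDR data through a library than hard-code them, and this sketch ignores grouping schemes that are not three-digit.

```python
from datetime import date
from decimal import Decimal

# Hypothetical, hand-written table: illustration only, not a full locale database.
CONVENTIONS = {
    "en-US": {"group": ",", "decimal": ".", "date": "%m/%d/%Y"},
    "en-AU": {"group": ",", "decimal": ".", "date": "%d/%m/%Y"},
    "de-DE": {"group": ".", "decimal": ",", "date": "%d.%m.%Y"},
}


def format_number(value: Decimal, locale: str) -> str:
    """Format with two decimals using the locale's separators."""
    conv = CONVENTIONS[locale]
    ascii_form = f"{value:,.2f}"  # e.g. 1,000,000.00
    # translate() swaps both separators in one pass, so they cannot clobber each other.
    table = str.maketrans({",": conv["group"], ".": conv["decimal"]})
    return ascii_form.translate(table)


def format_date(value: date, locale: str) -> str:
    """Format a date in the locale's usual numeric order."""
    return value.strftime(CONVENTIONS[locale]["date"])


print(format_number(Decimal("1000000"), "de-DE"))  # 1.000.000,00
print(format_date(date(2014, 4, 3), "en-US"))      # 04/03/2014
print(format_date(date(2014, 4, 3), "en-AU"))      # 03/04/2014
```

The last two lines show why a bare numeric date is ambiguous: the same day reads as 3 April or 4 March depending on who is looking.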
Content
- <!doctype html>
- CSS (duh)
- Avoid tight designs
- Many languages are much more verbose than English
- Especially important in form design
- Use language attributes
- <span lang="xxx">
- Use appropriate tags
- <strong> not <b>
- <em> not <i>
- http://www.w3.org/International/questions/qa-css-lang
Languages - Middle East
- Middle East: right-to-left content
- Arabic العربية
- Hebrew עִבְרִית
- Mixing RTL and LTR text is challenging
- <p dir='rtl'>
- Learn about Unicode Bi-Di Algorithm
- http://www.w3.org/International/questions/qa-scripts
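One way to pick the `dir` value for user-supplied text is to look at its first strongly directional character, which is roughly how the Unicode Bi-Di Algorithm chooses a paragraph's base direction. The sketch below is a simplification (it ignores directional isolates and embeddings), and the helper names `base_direction` and `paragraph` are invented for this example.

```python
import unicodedata
from html import escape


def base_direction(text: str) -> str:
    """Guess 'rtl' or 'ltr' from the first strong character in the text."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-type or Arabic-type letter
            return "rtl"
        if bidi_class == "L":          # Latin, CJK, Thai and so on
            return "ltr"
    return "ltr"  # no strong character found (digits, punctuation only)


def paragraph(text: str) -> str:
    """Wrap text in a <p> with an explicit dir attribute."""
    return f"<p dir='{base_direction(text)}'>{escape(text)}</p>"


assert base_direction("العربية") == "rtl"
assert base_direction("Hello עברית") == "ltr"
assert base_direction("123 עברית") == "rtl"  # digits are not strong
```

The second assertion shows the limit of this heuristic: a mostly Hebrew sentence that happens to start with a Latin word is classed as LTR, so known-RTL content is better marked up explicitly.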
Languages - Far East
- Ideographic scripts - CJK
- Traditional vs Simplified Chinese
- 国际化活动、万维网联盟 Simplified
- 國際化活動、萬維網聯盟 Traditional
- http://rishida.net/scripts/chinese/
- http://www.w3.org/International/questions/qa-css-lang
汉语 日本語 한국어
Languages - Asia
- "Complex" scripts
- Hindi, Tamil, Thai, Burmese, Khmer and many more...
- Gotchas
- Missing font support = square boxes
- Poor rendering support = munged text
- http://www.w3.org/International/questions/qa-scripts
অসমীয়া বাংলা हिन्दी ພາສາລາວ മലയാളം मराठी नेपाली ଓଡ଼ିଆ سنڌي සිංහල தமிழ் ภาษาไทย བོད་སྐད
Content Again
- Where possible, avoid text in images
- Learn about culturally-inappropriate content!
- Support Internet Explorer. Even older versions. You are not Google
- http://www.w3.org/2007/Talks/0706-atmedia/slides/Slide0350.html
Constructing Strings
- Use format strings such as “Page %1$d of %2$d”
- Don’t make assumptions such as use of ordinals “1st 2nd 3rd”
- Avoid constructing strings that rely on English grammar rules
- Use full strings instead #phpdotisevil
- http://www.w3.org/International/techniques/authoring-html#strings
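A small sketch of the full-string approach, using Python's named placeholders in place of printf-style positional ones. The `MESSAGES` catalog and the `translate` helper are hypothetical, and the Japanese string is an illustrative translation that has not been reviewed by a translator.

```python
# Hypothetical message catalog: one complete string per locale,
# with named placeholders the translator is free to reorder.
MESSAGES = {
    "en": {"page_of": "Page {page} of {total}"},
    "ja": {"page_of": "{total}ページ中{page}ページ目"},
}


def translate(locale: str, key: str, **params: object) -> str:
    """Look up the whole sentence, then fill in the placeholders."""
    return MESSAGES[locale][key].format(**params)


# Fragile alternative: "Page " + str(page) + " of " + str(total)
# bakes English word order into the code.
print(translate("en", "page_of", page=2, total=10))  # Page 2 of 10
print(translate("ja", "page_of", page=2, total=10))  # 10ページ中2ページ目
```

Because the placeholders are named rather than concatenated in code, the Japanese entry can put the total first without any change to the calling code.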
Language Selection
- Embedded fonts useful for less-supported scripts
- Or images if lazy
- Don’t use flags for language selection – e.g. India has 20-odd official languages
- Language selection links should be in the target language
- Can you read “ภาษาอังกฤษ”?
- http://bittersmann.de/articles/no-flags/, http://www.w3.org/International/questions/qa-navigation-select
No | No | Yes |
---|---|---|
English | ภาษาอังกฤษ | English |
French | ภาษาฝรั่งเศส | français (French) |
Thai | ภาษาไทย | ภาษาไทย (Thai) |
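The "Yes" pattern can be generated from a small table of languages. This is only a sketch: the `LANGUAGES` list, the `language_link` helper and the `/{tag}/` URL scheme are assumptions made for the example.

```python
from html import escape

# (language tag, name in the language itself, English name)
LANGUAGES = [
    ("en", "English", "English"),
    ("fr", "français", "French"),
    ("th", "ภาษาไทย", "Thai"),
]


def language_link(tag: str, autonym: str, english: str) -> str:
    """Build one link: the language's own name first, English in brackets if different."""
    label = f'<span lang="{tag}">{escape(autonym)}</span>'
    if autonym != english:
        label += f" ({escape(english)})"
    # The /{tag}/ URL layout is hypothetical.
    return f'<a href="/{tag}/" hreflang="{tag}">{label}</a>'


menu = "\n".join(language_link(*entry) for entry in LANGUAGES)
print(menu)
```

Only the native name carries the `lang` attribute, so the bracketed English gloss is not mislabelled as French or Thai.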
Names
- Avoid Given/Surname, First/Last Name, because those designations don’t make sense in many parts of the world
- Suggest using Name and Preferred Name
- Make your database fields big
- http://www.w3.org/International/questions/qa-personal-names
Björk Guðmundsdóttir ... Björk
Isa bin Osman ... Mr Isa
毛泽东 ... Mao Ze Dong
María-Jose Carreño Quiñones ... Señorita Carreño
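As a rough sketch of the Name / Preferred Name suggestion, a record can store both strings exactly as entered and never try to split them. The `Person` class and its field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str            # full name, exactly as the person writes it
    preferred_name: str  # how the person wants to be addressed

    def greeting(self) -> str:
        # Never derived by splitting `name` on spaces
        return f"Hello, {self.preferred_name}"


people = [
    Person("Björk Guðmundsdóttir", "Björk"),
    Person("Isa bin Osman", "Mr Isa"),
    Person("毛泽东", "Mao Ze Dong"),
]

for person in people:
    print(person.greeting())
```

No code here needs to know which part of a name is a family name, a patronymic or a title, which is the point of the two-field design.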
Addresses
Australian addresses are so neat and tidy
Not true elsewhere!
3 The Danes
Park St Lane
Park St
St Albans
Herts AL2 2AY
UNITED KINGDOM
90/3 Soi Ari Samphan 1
Phaholyothin Rd (Soi 5)
Phayathai
Bangkok 10400
THAILAND
Enough
- That’s scratched the surface
- Questions?
- Some Resources
- http://www.w3.org/International
- http://rishida.net/
- https://slid.es/marcdurdin/web-i18n (This presentation)
@MarcDurdin - marc.durdin.net
- keyman.com
Web i18n
By Marc Durdin