UTF-8 Encoding
DokuWiki now stores all its data in UTF-8. To avoid problems, the filenames of the data files themselves are URL-encoded when saved. DokuWiki versions older than release 2005-02-06 used different encodings, so the data files need to be re-encoded when the software is updated. Switching to a character set other than UTF-8 is not supported.
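To make the idea of URL-encoded filenames concrete, here is a minimal Python sketch that percent-encodes the UTF-8 bytes of a name. DokuWiki itself is written in PHP, so the function name `encode_page_filename` is hypothetical and the exact set of characters DokuWiki leaves unescaped may differ; this only illustrates the principle.

```python
from urllib.parse import quote, unquote


def encode_page_filename(name: str) -> str:
    """Percent-encode the UTF-8 bytes of a name (hypothetical helper).

    Every byte outside the unreserved ASCII set becomes %XX, so the
    result is plain ASCII and safe on filesystems with other encodings.
    """
    return quote(name, safe="", encoding="utf-8")


if __name__ == "__main__":
    encoded = encode_page_filename("über")
    print(encoded)           # %C3%BCber
    print(unquote(encoded))  # über
```

The encoding is reversible, so the original name can always be recovered from the name on disk.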
Browser Setup for UTF-8
All modern browsers do handle UTF-8 encoded web pages - it's one of the few things that actually work as expected in most browsers. If your browser doesn't display some characters correctly, you are probably missing the correct Unicode fonts.
Windows users should install the Arialuni.TTF font from Microsoft. It is included in Microsoft's Office suite.
Debian users can read my page on fonts to learn how to install Unicode fonts correctly.
Editing Files
If you intend to edit the data files directly or want to create a translation, you need to use a UTF-8 aware editor. There are a lot of capable editors out there; I just want to recommend two small, simple, and free ones here if you still need one 1) :
Please note: DokuWiki does not use a Byte Order Mark and you should make sure your software doesn't, either (especially when editing the PHP and config files).
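If you are unsure whether your editor wrote a Byte Order Mark, you can check the first three bytes of the file. The following Python sketch shows one way to detect and remove a UTF-8 BOM; the helper names `has_utf8_bom` and `strip_utf8_bom` are made up for this example and are not part of DokuWiki.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + b"<?php echo 'hi';"
    print(has_utf8_bom(sample))    # True
    print(strip_utf8_bom(sample))  # b"<?php echo 'hi';"
```

To clean a file, read it in binary mode, pass the bytes through `strip_utf8_bom`, and write the result back in binary mode.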
Batch Encoding Files
- On Unix use iconv: http://www.gnu.org/software/libiconv/
- On Windows use recode, a conversion tool similar to iconv: http://recode.progiciels-bpi.ca/archives
- Example of a simple conversion on a computer with a French locale:
recode lat1..u8 test.txt
where lat1 is the source charset and u8 is the target charset (UTF-8).
- To batch the conversion on Windows, use this in a batch file (it converts all the .txt files under the given path, including subdirectories):
FOR /F "tokens=*" %%G IN ('dir/b/S/X ^"C:\yourpath\*.txt^"') DO recode -v lat1..u8 %%~sG
- More explanation there: the link
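If neither iconv nor recode is available, the same batch conversion can be sketched in Python. This is only an illustration under some assumptions: the function name `convert_tree` is hypothetical, the source encoding defaults to Latin-1 as in the recode example, and files that already decode as valid UTF-8 are skipped so that running it twice does not double-encode anything. Back up your data directory before trying anything like this.

```python
from pathlib import Path


def convert_tree(root, source_encoding="latin-1", pattern="*.txt"):
    """Re-encode matching files under root to UTF-8, in place.

    Returns the list of paths that were actually rewritten.
    """
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        try:
            # Plain ASCII and already converted files pass this check.
            raw.decode("utf-8")
            continue
        except UnicodeDecodeError:
            pass
        # Interpret the bytes in the old charset, then write UTF-8 (no BOM).
        text = raw.decode(source_encoding)
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted


if __name__ == "__main__":
    import sys

    for changed in convert_tree(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("converted", changed)
```

Note that the validity check is a heuristic: a Latin-1 file whose bytes happen to form valid UTF-8 would be skipped, which is rare for real text but not impossible.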
Examples
Below are some examples of UTF-8 characters to check your browser 2).
Zodiac Signs: ♈ ♉ ♊ ♋ ♌ ♍ ♎ ♏ ♐ ♑ ♒ ♓
A chessboard:
  | A | B | C | D | E | F | G | H |
---|---|---|---|---|---|---|---|---|
8 | ♜ | ♞ | ♝ | ♛ | ♚ | ♝ | ♞ | ♜ |
7 | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ |
6 |  |  |  |  |  |  |  |  |
5 |  |  |  |  |  |  |  |  |
4 |  |  |  |  |  |  |  |  |
3 |  |  |  |  |  |  |  |  |
2 | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ |
1 | ♖ | ♘ | ♗ | ♕ | ♔ | ♗ | ♘ | ♖ |
Russian (по-русски):
По оживлённым берегам Громады стройные теснятся Дворцов и башен; корабли Толпой со всех концов земли К богатым пристаням стремятся;
Ancient Greek:
Αρχαίο Πνεύμα Αθάνατον! Ἰοὺ ἰού· τὰ πάντʼ ἂν ἐξήκοι σαφῆ.
Ὦ φῶς, τελευταῖόν σε προσϐλέψαιμι νῦν, ὅστις πέφασμαι φύς τʼ ἀφʼ ὧν οὐ χρῆν, ξὺν οἷς τʼ οὐ χρῆν ὁμιλῶν, οὕς τέ μʼ οὐκ ἔδει κτανών.
Modern Greek:
Η σύγχρονη Ελλάδα, έχει να παρουσιάσει δυναμικό έργο στον τομέα του πολιτισμού, των τεχνών και των γραμμάτων. Αντίστοιχα δυναμική είναι η παρουσία των Ελλήνων επιχειρηματιών στην διεθνή οικονομική και βιομηχανική σκηνή.
Sanskrit:
पशुपतिरपि तान्यहानि कृच्छ्राद् अगमयदद्रिसुतासमागमोत्कः । कमपरमवशं न विप्रकुर्युर् विभुमपि तं यदमी स्पृशन्ति भावाः ॥
Hindi:
गूगल समाचार हिन्दी में
Korean:
한글은 아름다운 우리글입니다. 곱고 아름답게 사용하는 것이 우리의 의무입니다.
Traditional Chinese:
子曰:「學而時習之,不亦說乎?有朋自遠方來,不亦樂乎? 人不知而不慍,不亦君子乎?」 有子曰:「其為人也孝弟,而好犯上者,鮮矣; 不好犯上,而好作亂者,未之有也。君子務本,本立而道生。 孝弟也者,其為仁之本與!」
Simplified Chinese:
子曰:「学而时习之,不亦说乎?有朋自远方来,不亦乐乎? 人不知而不慍,不亦君子乎?」 有子曰:「其为人也孝弟,而好犯上者,鲜矣; 不好犯上,而好作乱者,未之有也。君子务本,本立而道生。 孝弟也者,其为仁之本与!」
Japanese:
「秋の田の かりほの庵の 苫をあらみ わが衣手は 露にぬれつつ」 天智天皇 「春すぎて 夏来にけらし 白妙の 衣ほすてふ 天の香具山」 持統天皇 「あしびきの 山鳥の尾の しだり尾の ながながし夜を ひとりかも寝む」 柿本人麻呂
Latvian:
Iedomu jaukie ideāli, Vecākie principi, tikla, mīla - Dienas allažības priekšā Šķīst kā graudi akmeņstarpā.
Glāžšķūņa rūķīši jautri dziedādami čiepj koncertflīģeļa vāku.
Simplified Chinese with Pinyin:
这是简体字汉语: zhè shì jiǎn tǐ zì hàn yǔ
Armenian:
Հարգանքներիս հավաստիքը Հայ Ժողովրդին: Ամենալավ օրենքները չեն օգնի, եթե մարդիկ բանի պետք չեն:
Persian:
بنیآدم اعضای یکدیگرند / که در آفرینش ز یک گوهرند
Hebrew:
המשפט עם הזכוכית שאפשר לאכול בלי שזה מפריע, לא זוכר איך הוא הולך