
utf-8

Differences

This shows you the differences between two versions of the page.

utf-8 [2014-02-25 09:05] – removed 125.62.111.226
utf-8 [2017-11-13 02:49] (current) 114.84.118.11
 +====== UTF-8 Encoding ======
  
 +[[DokuWiki]] stores all its data in UTF-8. To avoid problems, the filenames of the data files themselves are [[phpfn>urlencode|URL-encoded]] when saved. DokuWiki versions older than release 2005-02-06 used different encodings, so the data files need to be [[tips:utf8update|re-encoded]] when the software is updated. Switching to a charset other than UTF-8 is **not** supported.
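
To illustrate what URL-encoding a UTF-8 filename looks like, here is a minimal Python sketch. It is not DokuWiki's own code: the helper names are hypothetical, and ''quote_plus'' is only roughly comparable to PHP's ''urlencode'' (the two differ on a few characters such as ''~'').

```python
from urllib.parse import quote_plus, unquote_plus


def encode_page_name(name: str) -> str:
    """Percent-encode the UTF-8 bytes of a page name (hypothetical helper)."""
    return quote_plus(name, encoding="utf-8")


def decode_page_name(encoded: str) -> str:
    """Reverse encode_page_name, interpreting the decoded bytes as UTF-8."""
    return unquote_plus(encoded, encoding="utf-8")


if __name__ == "__main__":
    stored = encode_page_name("übung")
    print(stored)                    # %C3%BCbung
    print(decode_page_name(stored))  # übung
```

The non-ASCII letter becomes two percent-escaped bytes, so the resulting name contains only ASCII characters and survives on filesystems that do not handle UTF-8 names well.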
 +
 +===== Browser Setup for UTF-8 =====
 +
 +All modern browsers handle UTF-8 encoded web pages; it is one of the few things that actually works as expected in most browsers. If your browser doesn't display some characters correctly, you are probably missing the necessary Unicode fonts.
 +
 +Windows users should install the ''Arialuni.TTF'' font from Microsoft. It is included in Microsoft's Office Suite.
 +
 +[[http://debian.org|Debian]] users can read my [[notes>debianfonts|page on fonts]] to learn how to install Unicode fonts correctly.
 +
 +  * [[wp>Unicode and HTML]]
 +  * [[http://tlt.its.psu.edu/suggestions/international/web/unicode.html|Configuring browsers for Unicode]]
 +
 +
 +
 +===== Editing Files =====
 +
 +{{ wiki:np2-bom.png|Save without a BOM in Notepad 2}}
 +
 +If you intend to edit the data files directly or want to create a [[Localization|translation]], you need to use a UTF-8 aware editor. There are many capable editors out there; I just want to recommend two small, simple, and free ones here in case you still need one ((This is intended neither as a complete list of Unicode editors nor as a selection of the best available choices. These are just two small editors I liked. Please do **not** add more editors.)):
 +
 +  * [[http://tea.linux.kiev.ua/|TEA]] -- a GTK2 based editor for GNU/Linux
 +  * [[http://www.flos-freeware.ch/notepad2.html|Notepad2]] -- a very good notepad replacement for Windows
 +
 +Please note: DokuWiki does __not__ use a [[wp>Byte Order Mark]] and you should make sure your software doesn't, either (especially when editing the PHP and config files).
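
If you are unsure whether an editor has added a BOM or saved the file in another encoding, a small script can check. The following Python sketch uses hypothetical helper names and works on the raw bytes of a file; it is one possible check, not a tool shipped with DokuWiki.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A typical use would be to read a file with ''open(path, "rb").read()'', pass the bytes to these functions, and write the stripped bytes back only if a BOM was found.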
 +
 +
 +===== Batch Encoding Files =====
 +
 +  * On Unix, use iconv from [[http://www.gnu.org/software/libiconv/|GNU libiconv]].
 +  * On Windows, use recode, a charset conversion tool similar to iconv: http://recode.progiciels-bpi.ca/archives
 +    * Example of a simple conversion on a computer with a French locale:<code>recode lat1..u8 test.txt</code>where ''lat1'' is the source charset (Latin-1) and ''u8'' is the target charset (UTF-8).
 +    * To convert many files at once on Windows, use this in a batch file (it converts all ''.txt'' files in the given directory and its subdirectories):<code>FOR /F "tokens=*" %%G IN ('dir/b/S/X ^"C:\yourpath\*.txt^"') DO recode -v lat1..u8 %%~sG</code>
 +  * More explanation is available [[http://jeanmarcmassou.free.fr/dokuwiki/doku.php?id=blogw:batch_on_win_for_utf_conversion|here]].
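
If neither iconv nor recode is available, the same batch conversion can be sketched in Python. This is only an illustration under assumptions: the function name ''convert_tree'' is hypothetical, the source charset is assumed to be Latin-1 as in the recode example, and files that already decode as UTF-8 are skipped so that running it twice does no harm. Back up your data first, because the files are rewritten in place.

```python
from pathlib import Path
from typing import List


def convert_tree(root: str, pattern: str = "*.txt",
                 source_encoding: str = "latin-1") -> List[Path]:
    """Re-encode matching files under root to UTF-8 and return the changed paths."""
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        raw = path.read_bytes()
        try:
            raw.decode("utf-8")
            continue  # already valid UTF-8 (this includes plain ASCII files)
        except UnicodeDecodeError:
            pass
        text = raw.decode(source_encoding)
        # Write bytes directly so line endings are left untouched.
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted
```

Skipping files that are already valid UTF-8 is a heuristic: a Latin-1 file could in rare cases also be valid UTF-8, so check the returned list against what you expected to be converted.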
 +
 +
 +
 +===== Examples =====
 +
 +Below are some examples of UTF-8 characters to check your browser((copied from http://www.eleves.ens.fr:8080/home/madore/misc/unitest/)).
 +
 +Zodiac Signs: ♈ ♉ ♊ ♋ ♌ ♍ ♎ ♏ ♐ ♑ ♒ ♓
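
To see what these characters look like at the byte level, the following Python sketch lists each zodiac sign with its code point and UTF-8 bytes. The function name ''utf8_table'' is hypothetical and exists only for this illustration.

```python
ZODIAC = "♈♉♊♋♌♍♎♏♐♑♒♓"


def utf8_table(chars: str):
    """Return (character, code point, UTF-8 bytes in hex) for each character."""
    rows = []
    for ch in chars:
        encoded = ch.encode("utf-8")
        hex_bytes = " ".join(f"{b:02X}" for b in encoded)
        rows.append((ch, f"U+{ord(ch):04X}", hex_bytes))
    return rows


if __name__ == "__main__":
    for ch, code_point, hex_bytes in utf8_table(ZODIAC):
        print(ch, code_point, hex_bytes)
```

Each of these signs takes three bytes in UTF-8, which is why a file viewed with the wrong charset shows three garbage characters per symbol.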
 +
 +A chessboard:
 +
 +^   ^ A ^ B ^ C ^ D ^ E ^ F ^ G ^ H ^
 +^ 8 | ♜ | ♞ | ♝ | ♛ | ♚ | ♝ | ♞ | ♜ |
 +^ 7 | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ | ♟ |
 +^ 6 |   |   |   |   |   |   |   |   |
 +^ 5 |   |   |   |   |   |   |   |   |
 +^ 4 |   |   |   |   |   |   |   |   |
 +^ 3 |   |   |   |   |   |   |   |   |
 +^ 2 | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ | ♙ |
 +^ 1 | ♖ | ♘ | ♗ | ♕ | ♔ | ♗ | ♘ | ♖ |
 +
 +Russian (по-русски):
 +
 +  По оживлённым берегам
 +  Громады стройные теснятся
 +  Дворцов и башен; корабли
 +  Толпой со всех концов земли
 +  К богатым пристаням стремятся;
 +
 +Ancient Greek:
 +
 +  Αρχαίο Πνεύμα Αθάνατον!
 +  Ἰοὺ ἰού· τὰ πάντʼ ἂν ἐξήκοι σαφῆ.
 +  Ὦ φῶς, τελευταῖόν σε προσϐλέψαιμι νῦν,
 +  ὅστις πέφασμαι φύς τʼ ἀφʼ ὧν οὐ χρῆν, ξὺν οἷς τʼ
 +  οὐ χρῆν ὁμιλῶν, οὕς τέ μʼ οὐκ ἔδει κτανών.
 +
 +Modern Greek:
 +  Η σύγχρονη Ελλάδα, έχει να παρουσιάσει δυναμικό
 +  έργο στον τομέα του πολιτισμού, των τεχνών και
 +  των γραμμάτων. Αντίστοιχα δυναμική είναι η παρουσία
 +  των Ελλήνων επιχειρηματιών στην διεθνή οικονομική
 +  και βιομηχανική σκηνή.
 +
 +Sanskrit:
 +
 +  पशुपतिरपि तान्यहानि कृच्छ्राद्
 +  अगमयदद्रिसुतासमागमोत्कः । 
 +  कमपरमवशं न विप्रकुर्युर्
 +  विभुमपि तं यदमी स्पृशन्ति भावाः ॥
 +
 +Hindi:
 +
 +गूगल समाचार हिन्दी में 
 +
 +Korean:
 +  한글은 아름다운 우리글입니다.
 +  곱고 아름답게 사용하는 것이 우리의 의무입니다.
 +
 +Traditional Chinese:
 +
 +  子曰:「學而時習之,不亦說乎?有朋自遠方來,不亦樂乎?
 +  人不知而不慍,不亦君子乎?」
 +  
 +  有子曰:「其為人也孝弟,而好犯上者,鮮矣;
 +  不好犯上,而好作亂者,未之有也。君子務本,本立而道生。
 +  孝弟也者,其為仁之本與!」
 +  
 +Simplified Chinese:
 +
 +  子曰:「学而时习之,不亦说乎?有朋自远方来,不亦乐乎?
 +  人不知而不愠,不亦君子乎?」
 +  
 +  有子曰:「其为人也孝弟,而好犯上者,鲜矣;
 +  不好犯上,而好作乱者,未之有也。君子务本,本立而道生。
 +  孝弟也者,其为仁之本与!」
 +
 +Japanese:
 +
 +  「秋の田の かりほの庵の 苫をあらみ わが衣手は 露にぬれつつ」 天智天皇
 +  「春すぎて 夏来にけらし 白妙の 衣ほすてふ 天の香具山」 持統天皇
 +  「あしびきの 山鳥の尾の しだり尾の ながながし夜を ひとりかも寝む」 柿本人麻呂 
 +
 +Latvian:
 +  
 +  Iedomu jaukie ideāli,
 +  Vecākie principi, tikla, mīla - 
 +  Dienas allažības priekšā
 +  Šķīst kā graudi akmeņstarpā.
 +
 +  Glāžšķūņa rūķīši jautri dziedādami čiepj koncertflīģeļa vāku. 
 +
 +Simplified Chinese with pinyin:
 +
 +  这是简体字汉语: zhè shì jiǎn tǐ zì hàn yǔ 
 +
 +Armenian:
 +
 +  Հարգանքներիս հավաստիքը Հայ Ժողովրդին:
 +  Ամենալավ օրենքները չեն օգնի, եթե մարդիկ բանի պետք չեն:
 +
 +Persian:
 +
 +  بنی‌آدم اعضای یک‌دیگرند / که در آفرینش ز یک گوهرند
 +  
 +Hebrew:
 +
 +  המשפט עם הזכוכית שאפשר לאכול בלי שזה מפריע, לא זוכר איך הוא הולך
utf-8.1393315526.txt.gz · Last modified: 2014-02-25 09:05 by 125.62.111.226

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International