Transliteration of non-Latin texts

If you would like to convert UTF-8 Cyrillic text to readable Latin, you can dynamically transliterate the content of the <div class="dokuwiki"> in the browser.

To do so, make the following changes:

  • Create a JavaScript file with the following content and save it as /lib/scripts/translit.js
/* original javascript by Eugene Spearance (http://www.spearance.ru) */

// Replace the content of the element with id "dokuwiki" by its transliteration.
function translit()
{
	var el = document.getElementById('dokuwiki');
	el.innerHTML = convert(el.innerHTML);
}

// Cyrillic characters and their Latin replacements.
// Both arrays must stay index-aligned: rusChars[n] is replaced by transChars[n].
var rusChars = ['Щ','Ё','Ж','Ч','Ш','Э','Ю','Я','А','Б','В','Г','Д','Е','З','И','Й','К','Л','М','Н','О','П','Р','С','Т','У','Ф','Х','Ц','Ъ','Ы','Ь','щ','ё','ж','ч','ш','э','ю','я','а','б','в','г','д','е','з','и','й','к','л','м','н','о','п','р','с','т','у','ф','х','ц','ъ','ы','ь'];
var transChars = ['SHH','JO','ZH','CH','SH','JE','JU','JA','A','B','V','G','D','E','Z','I','J','K','L','M','N','O','P','R','S','T','U','F','X','C','#','Y','‘','shh','jo','zh','ch','sh','je','ju','ja','a','b','v','g','d','e','z','i','j','k','l','m','n','o','p','r','s','t','u','f','x','c','#','y','‘'];

// Return a copy of `from` in which every character listed in rusChars
// is replaced by its counterpart in transChars; other characters are kept.
function convert(from){
	var to = '';
	var len = from.length;
	var i, j, character, isRus;
	for(i = 0; i < len; i++){
		character = from.charAt(i);
		isRus = false;
		for(j = 0; j < rusChars.length; j++){
			if(character == rusChars[j]){
				isRus = true;
				break;
			}
		}
		to += isRus ? transChars[j] : character;
	}
	return to;
}


  • Change the <div class="dokuwiki"> in /lib/tpl/default/main.php to <div class="dokuwiki" id="dokuwiki">
  • Add a link such as <a onclick="translit()">Translit</a> to the template, wherever you would like it to appear.

  • Add the following lines to /inc/template.php, in the section where the default JavaScript files are loaded:
  ptln('<script language="javascript" type="text/javascript" charset="utf-8" src="'.
       DOKU_BASE.'lib/scripts/translit.js"></script>',$it);

Please note that this is client-side JavaScript: the browser has to convert the whole page content character by character and then re-parse it, which can take noticeable time and resources on large pages. Server-side transliteration would of course be better.
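
One way to reduce that cost is to convert only the text nodes instead of rewriting the whole innerHTML. This is a sketch under assumptions, not part of the original tip. The helper name translitTextNodes is hypothetical. It expects a conversion function like convert() from translit.js and relies only on the standard DOM properties nodeType, nodeValue and childNodes.

```javascript
// Sketch: transliterate text nodes in place instead of rewriting innerHTML.
// `translitTextNodes` is a hypothetical helper, not part of DokuWiki.
function translitTextNodes(node, convertFn) {
    var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE in the DOM standard
    if (node.nodeType === TEXT_NODE) {
        node.nodeValue = convertFn(node.nodeValue);
        return;
    }
    // Recurse into child nodes; elements without children are skipped.
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        translitTextNodes(children[i], convertFn);
    }
}
```

In the browser this could be called as translitTextNodes(document.getElementById('dokuwiki'), convert). Because the markup is never re-parsed, attributes and event handlers on the page are left untouched.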

If you would like to use this to transliterate other, non-Cyrillic UTF-8 characters, it is easy to get the strings in UTF-8: put them in a wiki page and then look at the raw file in the /data/pages/ directory.
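
As a rough sketch of what such an adaptation could look like, the mapping can be written as pairs and turned into a lookup object. The names buildTable and transliterate, and the three Greek sample pairs, are illustrative assumptions and not part of the original script. Building the table from pairs also makes it easy to reject a character that is accidentally listed twice.

```javascript
// Sketch: build a lookup table from [source, replacement] pairs.
// `buildTable`, `transliterate` and the Greek sample pairs are hypothetical.
function buildTable(pairs) {
    var table = {};
    for (var i = 0; i < pairs.length; i++) {
        var from = pairs[i][0];
        if (Object.prototype.hasOwnProperty.call(table, from)) {
            throw new Error('Duplicate source character: ' + from);
        }
        table[from] = pairs[i][1];
    }
    return table;
}

// Replace every character that has an entry in `table`; keep all others.
function transliterate(text, table) {
    var out = '';
    for (var i = 0; i < text.length; i++) {
        var ch = text.charAt(i);
        out += Object.prototype.hasOwnProperty.call(table, ch) ? table[ch] : ch;
    }
    return out;
}

// Illustrative sample only; a real table needs the full alphabet.
var greekSample = buildTable([['α', 'a'], ['β', 'b'], ['γ', 'g']]);
```

With this sample, transliterate('αβγ!', greekSample) returns 'abg!'. Note that charAt() works on UTF-16 code units, so characters outside the Basic Multilingual Plane would need different handling.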

Alternatively, you may use an online UTF-8 encoder such as this online translit converter.

Discussion

People that use this tip, could you please provide feedback? Thanks, Riny


tips/transliteration.txt · Last modified: 2007-12-12 10:50 by 87.234.151.229

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International