
xhtmlruby plugin

Compatible with DokuWiki

2009-02-14b (not tested on earlier versions)

Converts Japanese furigana written as 漢字(ふり) into XHTML 1.1 <ruby><rb>漢字</rb><rp>(</rp><rt>ふり</rt><rp>)</rp></ruby> markup

Last updated on: 2010-05-01
Provides: Action

The missing download url means that this extension cannot be installed via the Extension Manager. Please see Publishing a Plugin on dokuwiki.org. Recommended are public repository hosts like GitHub, GitLab or Bitbucket.

This extension has not been updated in over 2 years. It may no longer be maintained or supported and may have compatibility issues.

Similar to ruby

Tagged with furigana, ruby

By Mike "Pomax" Kamermans

Download and Installation

Download and install the plugin through the Plugin Manager, using the following URL:

tar.gz format (3k)

To install the plugin manually, download the source to your plugin folder, lib/plugins, and extract its contents. That will create a new plugin folder, lib/plugins/xhtmlruby, containing four files:

style.css - CSS styling for the ruby markup
conf.ini - an ini file that sets whether wiki text is parsed, and whether TOC text is parsed
script.js - a script that ensures the CSS styling is correct for the browser that's loading the page
action.php - the plugin

The plugin is now installed.

Important installation note

DokuWiki itself is XHTML 1.0 compliant, but the ruby element was not admitted to XHTML until 1.1. This means that if you want your DokuWiki to pass W3C validation, you will need to change the doctype declaration that DokuWiki generates in the inc/actions.php file.

Change lines 485 and 486 from:

$pre .= '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"' . DOKU_LF;
$pre .= ' "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . DOKU_LF;

to:

$pre .= '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"' . DOKU_LF;
$pre .= ' "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">' . DOKU_LF;

and W3C validation should now pass all ruby elements without complaint, rather than generating an error and a warning for each and every one of them.

Getting your guide text marked up as ruby

This plugin will convert Japanese kanji with furigana in parentheses into XHTML 1.1 ruby element markup, meaning that

漢字(ふり)

in your wiki text (including headings) will be automatically converted to:

<ruby><rb>漢字</rb><rp>(</rp><rt>ふり</rt><rp>)</rp></ruby>

when the page is rendered. The plugin does not modify your wiki text in any way; it only performs the substitution when the page is rendered.
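
The substitution can be tried outside DokuWiki with a small sketch. The snippet below is a JavaScript port of the search and replace pattern found in the plugin's action.php, not the plugin's own PHP code; the function name `rubify` and the constant names are made up for illustration.

```javascript
// Hypothetical helper: a JavaScript port of the plugin's search/replace pattern.
// Kanji: CJK Unified Ideographs plus 々 (U+3005) and ヶ (U+30F6).
const KANJI = "[\\u{4E00}-\\u{9FFF}\\u{3005}\\u{30F6}]+";
// Kana: the Hiragana and Katakana blocks (the reading may be empty).
const KANA = "[\\u{3040}-\\u{30FF}]*";
// The "u" flag enables \u{...} escapes, "g" replaces every occurrence.
const SEARCH = new RegExp("(" + KANJI + ")\\((" + KANA + ")\\)", "gu");
const REPLACE = "<ruby><rb>$1</rb><rp>(</rp><rt>$2</rt><rp>)</rp></ruby>";

// Returns a new string; the input (the wiki text) is left untouched.
function rubify(html) {
  return html.replace(SEARCH, REPLACE);
}

console.log(rubify("漢字(ふり)"));
console.log(rubify("<h1>漢字(かんじ)</h1>"));
```

Text without a kanji run directly in front of the parentheses, such as "plain (text)", passes through unchanged.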

The plugin also allows for 'empty' guide text, to force guide text to be over a specific subset of characters:

駅()前(まえ)

will generate:

駅<ruby><rb>前</rb><rp>(</rp><rt>まえ</rt><rp>)</rp></ruby>

rather than:

<ruby><rb>駅前</rb><rp>(</rp><rt>まえ</rt><rp>)</rp></ruby>
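
One way to reproduce the two outputs above is sketched below in JavaScript. This is an assumption about how such behaviour could be written, not a description of the plugin's internals: the plugin's own replacement is a plain pattern string, whereas this sketch uses a replacer callback, and the name `rubifyWithEmptyGuide` is hypothetical.

```javascript
// Hypothetical sketch: a replacer callback that leaves the kanji bare when the
// reading between the parentheses is empty. Not the plugin's PHP code.
const RUBY_PATTERN =
  /([\u{4E00}-\u{9FFF}\u{3005}\u{30F6}]+)\(([\u{3040}-\u{30FF}]*)\)/gu;

function rubifyWithEmptyGuide(text) {
  return text.replace(RUBY_PATTERN, (match, kanji, kana) =>
    kana === ""
      ? kanji // empty guide text: drop the "()" and emit the kanji as-is
      : `<ruby><rb>${kanji}</rb><rp>(</rp><rt>${kana}</rt><rp>)</rp></ruby>`
  );
}

console.log(rubifyWithEmptyGuide("駅()前(まえ)")); // guide text over 前 only
console.log(rubifyWithEmptyGuide("駅前(まえ)"));   // guide text over 駅前
```

The empty pair of parentheses works as a separator: it ends the kanji run early, so the following reading attaches only to the characters after it.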

Details

Because ruby markup must be set up not just for the page text, but also the text in headings and the TOC, this is an XHTML post-processing action plugin, rather than a syntax plugin.
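
As a rough illustration of what post-processing means here, the sketch below runs one substitution over HTML that has already been rendered, so headings and body text are handled alike. It is a JavaScript stand-in, not DokuWiki's event API: the shape of the `event` object and the name `postProcess` are assumptions, modelled loosely on the plugin's `_rubify` handler.

```javascript
// Hypothetical stand-in for the post-processing step: the event carries a
// [format, content] pair and only XHTML output is rewritten.
const RUBY_SEARCH =
  /([\u{4E00}-\u{9FFF}\u{3005}\u{30F6}]+)\(([\u{3040}-\u{30FF}]*)\)/gu;
const RUBY_REPLACE = "<ruby><rb>$1</rb><rp>(</rp><rt>$2</rt><rp>)</rp></ruby>";

function postProcess(event) {
  const [format, content] = event.data;
  if (format !== "xhtml") { return; } // leave other output formats alone
  // Rewrite the rendered output in place; the wiki source is never touched.
  event.data[1] = content.replace(RUBY_SEARCH, RUBY_REPLACE);
}

// Usage: a rendered page with ruby candidates in a heading and a paragraph.
const rendered = { data: ["xhtml", "<h2>日本語(にほんご)</h2><p>漢字(かんじ)</p>"] };
postProcess(rendered);
console.log(rendered.data[1]);
```

A syntax plugin would only see the wiki text of the page body, which is why a single pass over the finished HTML is the simpler design for this job.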

The replacement is based on a regular expression search and replace:

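// one or more kanji, followed by a reading in parentheses
$kanji = "[\x{4E00}-\x{9FFF}\x{3005}\x{30F6}]+";
// the reading may be empty, which is what allows 'empty' guide text
$kana = "[\x{3040}-\x{30FF}]*";
$search = "/(".$kanji.")\((".$kana.")\)/u";
$replace = "<ruby><rb>$1</rb><rp>(</rp><rt>$2</rt><rp>)</rp></ruby>";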

“kanji” covers the “CJK Unified Ideographs” Unicode block, plus the Unicode characters 々 (kanji repetition mark, U+3005) and ヶ (simplified form of 箇, U+30F6). “kana” covers the “Hiragana” and “Katakana” Unicode blocks.
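
The coverage of the two character classes can be checked with a few test strings. The snippet below is a hypothetical JavaScript check, not part of the plugin; the names `kanjiOnly`, `kanaOnly` and `codePoint` are made up for illustration.

```javascript
// Hypothetical checks, not part of the plugin: the two character classes
// rewritten as JavaScript regular expressions (the "u" flag enables \u{...}).
const kanjiOnly = /^[\u{4E00}-\u{9FFF}\u{3005}\u{30F6}]+$/u;
const kanaOnly = /^[\u{3040}-\u{30FF}]+$/u;

// Format the first code point of a string as U+XXXX.
function codePoint(ch) {
  return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
}

console.log(codePoint("々"), kanjiOnly.test("時々"));   // repetition mark inside a kanji run
console.log(codePoint("ヶ"), kanjiOnly.test("一ヶ月")); // small ke inside a kanji run
console.log(kanaOnly.test("ふりがな"), kanaOnly.test("フリガナ"));
console.log(kanjiOnly.test("かな"), kanaOnly.test("漢字"));
```

Note that ヶ sits inside the Katakana block, so it is accepted by both classes.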

Todo

Extend the plugin so that Chinese “bopomofo” phonetic guide text and Korean Hangul are parsed as readings, and so that the CJK Unified Ideographs Extension A and B blocks are also accepted as legal kanji forms.
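
A possible starting point for this extension is sketched below in JavaScript. Nothing here is implemented by the plugin: the block ranges are taken from the Unicode block definitions, while the structure and the names `IDEOGRAPHS`, `READINGS` and `rubifyExtended` are assumptions made for illustration.

```javascript
// Hypothetical extension, not implemented by the plugin. Ranges follow the
// Unicode block definitions; everything else is an assumption.
const IDEOGRAPHS =
  "\\u{4E00}-\\u{9FFF}" +    // CJK Unified Ideographs
  "\\u{3400}-\\u{4DBF}" +    // CJK Unified Ideographs Extension A
  "\\u{20000}-\\u{2A6DF}" +  // CJK Unified Ideographs Extension B
  "\\u{3005}\\u{30F6}";      // 々 and ヶ, as in the current pattern
const READINGS =
  "\\u{3040}-\\u{30FF}" +    // Hiragana and Katakana
  "\\u{3100}-\\u{312F}" +    // Bopomofo
  "\\u{AC00}-\\u{D7AF}";     // Hangul Syllables
const EXTENDED_SEARCH =
  new RegExp("([" + IDEOGRAPHS + "]+)\\(([" + READINGS + "]*)\\)", "gu");
const EXTENDED_REPLACE = "<ruby><rb>$1</rb><rp>(</rp><rt>$2</rt><rp>)</rp></ruby>";

function rubifyExtended(text) {
  return text.replace(EXTENDED_SEARCH, EXTENDED_REPLACE);
}

console.log(rubifyExtended("漢(ㄏㄢ)"));     // bopomofo reading
console.log(rubifyExtended("韓國(한국)"));   // hangul reading
```

Bopomofo tone marks such as ˊ and ˋ lie outside the Bopomofo block, so readings that include them would need further ranges.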

Bugs

None known at the time of writing.

Source

This plugin consists of four files:

style.css - CSS styling for the ruby markup
conf.ini - an ini file that sets whether wiki text is parsed, and whether TOC text is parsed
script.js - a script that ensures the CSS styling is correct for the browser that's loading the page
action.php - the plugin

style.css

/* ----------------  Ruby markup  ----------------- */
 
ruby {
	display: inline-table;
	text-align: center;
	vertical-align: bottom;
}
 
rb {
	display: table-row-group;
}
 
rt {
	display: table-header-group;
	font-size: 60%;
}
 
rp {
	display: none;
}

conf.ini

;
; This is the configuration file for the xhtmlruby action plugin
;
[config]

; if 'true', wiki text will be rubified. if 'false', it won't be (includes headers)
parse_wiki_text = true

; if 'true', text in the TOC will be rubified. if 'false', it won't be.
parse_toc_text = false

script.js

/**
 *
 * XHTML 1.1 Ruby markup suffers from the "browsers don't always bother to obey CSS" problem.
 * The standard way to visualise ruby is by making the ruby code an inline table, and bottom
 * aligning it. However, not all browsers understand "bottom" or "baseline". This javascript
 * will try to make sure the ruby placement is correct for all major browsers by detecting the
 * browser, and modifying the ruby CSS rules accordingly.
 *
 * - Mike "Pomax" Kamermans
 */
 
// ----------------------------------------------------------------------------------------------------------------------
// CSS MANIPULATION
//
// based on http://www.hunlock.com/blogs/Totally_Pwn_CSS_with_Javascript
// ----------------------------------------------------------------------------------------------------------------------
 
function getCSSRule(ruleName, deleteFlag) {
	ruleName=ruleName.toLowerCase();
	if (document.styleSheets) {
		for (var i=0; i<document.styleSheets.length; i++) {
			var styleSheet=document.styleSheets[i];
			var ii=0;
			var cssRule=false;
			do {
				if (styleSheet.cssRules) { cssRule = styleSheet.cssRules[ii]; }
				else { cssRule = styleSheet.rules[ii]; }
				if (cssRule)  {
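					// rules without a selector (e.g. @media or @import) have no selectorText, so check before comparing
					if (cssRule.selectorText && cssRule.selectorText.toLowerCase()==ruleName) {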
						if (deleteFlag=='delete') {
							if (styleSheet.cssRules) { styleSheet.deleteRule(ii); }
							else { styleSheet.removeRule(ii); }
							return true; }
						else { return cssRule; }}}
				ii++;
			}
			while (cssRule) }}
	return false;}
 
function killCSSRule(ruleName) { return getCSSRule(ruleName,'delete'); }
 
function addCSSRule(ruleName) {
	if (document.styleSheets) {
		if (!getCSSRule(ruleName)) {
			if (document.styleSheets[0].addRule) { document.styleSheets[0].addRule(ruleName, null,0); }
			else { document.styleSheets[0].insertRule(ruleName+' { }', 0); }}}
	return getCSSRule(ruleName); }
 
// ----------------------------------------------------------------------------------------------------------------------
// Ruby alignment code
// ----------------------------------------------------------------------------------------------------------------------
 
/**
 * Check which browser render engine we're dealing with
 */
function getBrowser()
{
	var opera="opera"; var ie="ie"; var gecko="gecko"; var chrome = "chrome"; var browser="unknown";
	if (window.opera) { browser = opera; }
	else if (Array.every) { browser = gecko; }
	else if (document.all) { browser = ie; }
	else if (window.chrome) { browser = chrome; }
	return browser;
}
 
/**
 * Different browsers (of fucking course) do different things when using named vertical alignments.
 * So, even though it's highly undesirable, browser-dependent corrections.
 */
function fixRubyAlignment()
{
	// Webkit and Gecko browsers align properly on "bottom", but other browsers do not...
	var browser = getBrowser();
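	var rubyrule = getCSSRule("ruby");
	// without a "ruby" rule there is nothing to adjust (the Chrome branch only deletes rules, so it may continue)
	if (!rubyrule && browser != "chrome") { return; }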
 
	// if we're rendering for IE, then annoyingly 'bottom' doesn't align properly. However, we can use 'baseline' instead, and all is well
	if(browser=="ie") { rubyrule.style.verticalAlign = "baseline";	}	
	// Opera (9.5x) is even more annoying. Neither "bottom" nor "baseline" does what it's supposed to do, so we're left with value (em) manipulation instead.
	else if(browser=="opera") { rubyrule.style.verticalAlign = "1.3em"; }
 
	 // Chrome 4.x doesn't support ruby unless IT gets to call the CSS shots
	else if(browser=="chrome") {
		killCSSRule('rp');
		killCSSRule('rt');
		killCSSRule('rb');
		killCSSRule('ruby'); }
	// if we don't know what browser this is, assume "bottom" works. If it doesn't, their fault.
	else { rubyrule.style.verticalAlign = "bottom"; }
}
 
 
// ----------------------------------------------------------------------------------------------------------------------
// DokuWiki code
// ----------------------------------------------------------------------------------------------------------------------
 
/**
 * lets dokuwiki schedule the javascript call
 */
addInitEvent(function(){ fixRubyAlignment(); });

action.php

<?php
/**
 * action plugin for simplified XHTML Ruby notation
 * (full page post processing required due to ruby in headers and ToC)
 *
 * @license	GPLv3 (http://www.gnu.org/licenses/gpl.html)
 * @link	   http://www.dokuwiki.org/plugin:xhtmlruby
 * @author	 Mike "Pomax" Kamermans <pomax@nihongoresources.com>
 */
 
if(!defined('DOKU_INC')) die();
if(!defined('DOKU_PLUGIN')) define('DOKU_PLUGIN',DOKU_INC.'lib/plugins/');
require_once(DOKU_PLUGIN.'action.php');
 
class action_plugin_xhtmlruby extends DokuWiki_Action_Plugin {
 
	var $version = '2009-10-26';
 
	// configurable options
	var $parse_wiki_text = false;
	var $parse_toc_text = false;
 
	// for now, we operate on Japanese only - TODO: bopomofo and hangul rubification
	var $re_kanji = "[\x{4E00}-\x{9FFF}\x{3005}\x{30F6}]+";
	var $re_kana = "[\x{3040}-\x{30FF}]*";
 
	// s/// patterns
	var $re_search = "";
	var $re_replace = "";
 
	function getInfo() {
	  return array(
		'author' => 'Mike "Pomax" Kamermans',
		'email'  => 'pomax@nihongoresources.com',
		'date'   => $this->version,
		'name'   => 'xhtmlruby',
		'desc'   => 'Converts Japanese 漢字(ふり) into xhtml 1.1 <ruby><rb>漢字</rb><rp>(</rp><rt>ふり</rt><rp>)</rp></ruby> markup',
		'url'	=> 'n/a');
	}
 
	/**
	 * Registers the hooks that postprocess the rendered HTML, to rubify kanji that have associated furigana.
	 */
	function register(&$controller) 
	{
		// initialise variables
		$this->re_search = "/(".$this->re_kanji.")\((".$this->re_kana.")\)/u";
		$this->re_replace = "<ruby><rb>$1</rb><rp>(</rp><rt>$2</rt><rp>)</rp></ruby>";
 
		// initialise ini variables
		$inivars = parse_ini_file(DOKU_INC.'lib/plugins/xhtmlruby/conf.ini');
		if($inivars['parse_toc_text']==true) { $this->parse_toc_text = true; }
		if($inivars['parse_wiki_text']==true) { $this->parse_wiki_text = true; }
 
		// uses a custom hook that needs to be added in html.php, see documentation
		if($this->parse_toc_text===true) { 
			$controller->register_hook('HTML_TOC_ITEM', 'AFTER', $this, '_rubify_tocitem'); }
 
		if($this->parse_wiki_text===true) {
			$controller->register_hook('RENDERER_CONTENT_POSTPROCESS', 'AFTER', $this, '_rubify'); }
	}
 
	/**
	 * rubify for ToC items
	 */
	function _rubify_tocitem(&$event, $param)
	{
		$item = &$event->data;
		$item = preg_replace($this->re_search,$this->re_replace,$item);
	}
 
	/**
	 * rubify for wiki text
	 */
	function _rubify(&$event, $param)
	{
		// reference to data and associated data type
		$data = &$event->data[1];
		$datatype = &$event->data[0];
 
		// do nothing if the data is not XHTML (this only generates XHTML ruby markup)
		if ($datatype != 'xhtml')  { return; }
 
		// and finally, perform the postprocessing in place
		$data = preg_replace($this->re_search,$this->re_replace,$data);
	}
}
?>

Discussion

Thank you thank you so much for this fantastic plugin! I searched for a DokuWiki “ruby” plugin thinking it would be so obscure a need I'd never find anything, but was delighted to find something that works EXACTLY like I need it to (in Internet Explorer 6.0 no less!).

I do have one question though, and it's regarding support in Chrome and possibly other browsers. These adhere to HTML5 standards for ruby tags, and therefore the ruby texts, as well as chunks of the page, don't display correctly. Would it be very complicated to add a conditional <ruby>Text<rt>テキスト</rt></ruby> HTML5 style tag in the case of browsers that don't support the XHTML standard? Once again, thanks for this awesome plug-in! — kououken 2010/02/19 16:08

Actually, the reason it malrenders in Chrome is that, for a while now, it has been using a version of WebKit that messes up ruby code. If there is no stylesheet CSS rule for the ruby element, things look fine, but if there is one, there is a good chance the ruby markup magically disappears (see here for a demonstrator of this behaviour). This has been filed as a bug for WebKit, and has been patched, but Chrome has to date (being at public version 4.1.249.1064) not been updated to include this WebKit patch.

I've updated the javascript responsible for massaging the CSS based on the browser, so that it dynamically removes all CSS ruby rules when it sees Chrome is being used. This seems to be the only working fix at the moment (hopefully the damn Chrome team reads my @#%! bug reports. The WebKit people responded immediately and fixed the issue in a few hours; the Chrome team hasn't reacted even once yet to my weeks and weeks of filing bug reports). — Pomax 2010/05/01 00:15
