
This is an old revision of the document!


Syntax Plugins

FIXME(this needs to be refactored)

Syntax Plugins are plugins to extend DokuWiki's syntax. To be able to understand what is needed to register new Syntax within DokuWiki you should read how the Parser works.

Synopsis

A syntax plugin named example needs to define a class named syntax_plugin_example which extends DokuWiki_Syntax_Plugin 1). The class needs to be stored in a file called lib/plugins/example/syntax.php. For full details of plugins and their files refer to plugin file structure.

The class needs to implement at least the following functions:

  • getInfo() Return a Hash with plugin info [author, email, date, name, desc, url]
  • getType() Should return the type of syntax this plugin defines (see below)
  • getSort() Returns a number used to determine in which order modes are added, also see parser, order of adding modes and getSort list.
  • connectTo($mode) This function is inherited from Doku_Parser_Mode 2). Here is the place to register the regular expressions needed to match your syntax.
  • handle($match, $state, $pos, &$handler) to prepare the matched syntax for use in the renderer
  • render($mode, &$renderer, $data) to render the content


The following additional methods can be overridden when required:

  • getPType() Defines how this syntax is handled regarding paragraphs 3). Return:
    • normal — (default value, will be used if the method is not overridden) The plugin can be used inside paragraphs,
    • block — Open paragraphs need to be closed before plugin output or
    • stack — Special case. Plugin wraps other paragraphs
  • getAllowedTypes() (default value: array()) Should return an array of mode types that may be nested within the plugin's own markup.
  • accepts($mode) This function is used to tell the parser if the plugin accepts syntax mode $mode within its own markup. The default behaviour is to test $mode against the array of modes held by the inherited property allowedModes.

Additional functions can be defined as needed. It is recommended to prepend an underscore to self-defined functions to avoid possible name clashes with future plugin specification enhancements.


Inherited Properties

  • allowedModes: the initial value is an empty array, inherited from Doku_Parser_Mode 4). It contains a list of other syntax modes which are allowed to occur within the plugin's own syntax mode (i.e. the modes which belong to any other DokuWiki markup that can be nested inside the plugin's own markup). Normally, it is automatically populated by the accepts() function using the results of getAllowedTypes().

Syntax Types

DokuWiki uses different syntax types to determine which syntax may be nested. E.g. you can have text formatting inside of tables. To integrate your plugin into this system it needs to specify which type it is and which types can be nested within it. The following types are currently available:

Type | Used in … | Description
container | listblock, table, quote, hr | containers are complex modes that can contain many other modes; hr breaks the principle, but it shouldn't be used in tables or lists, so it is put here
baseonly | header | some modes are allowed inside the base mode only
formatting | strong, emphasis, underline, monospace, subscript, superscript, deleted, footnote | modes for styling text; footnote behaves similarly to styling
substition 5) | acronym, smiley, wordblock, entity, camelcaselink, internallink, media, externallink, linebreak, emaillink, windowssharelink, filelink, notoc, nocache, multiplyentity, quotes, rss | modes where the token is simply replaced; they cannot contain any other modes
protected | preformatted, code, file, php, html | modes which have a start and end token but inside which no other modes should be applied
disabled | unformatted | inside this mode no wiki markup should be applied, but line endings and whitespace are not preserved
paragraphs | eol | used to mark paragraph boundaries

For a description of what each type means and which other formatting classes are registered in them, read the comments in inc/parser/parser.php.

Tutorial: Syntax Plugins Explained

The goal of this tutorial is to explain the concepts involved in a DokuWiki syntax plugin and to go through the steps involved in writing your own plugin.

For those who are really impatient to get started, grab a copy of the syntax plugin skeleton. It's a bare bones plugin which outputs “Hello World!” when it encounters “<TEST>” on a wiki page.

Quick Summary

modes

  • each individual piece of DokuWiki syntax, including your plugin, has its own mode.
  • similar modes are grouped together into mode types.
  • a mode's “allowedTypes” govern which other DokuWiki syntax is recognised when nested within the mode's own syntax. All the modes which belong to the allowedTypes will be permitted.
  • a mode's “type” lets other modes know if they can permit this mode within their syntax.

handle

  • the handle() method is called when the parser encounters wiki page content that it decides belongs to your syntax mode.
  • the $state parameter says which type of pattern registered to your mode was triggered. If it's just ordinary text the state parameter will be set to DOKU_LEXER_UNMATCHED
  • do as much processing and decision making as possible here, leaving as little as possible to be carried out in the render() method, because the output of handle() is cached. This also means that anything which must not be cached should not be done here.

render

  • The render() method processes the renderer instructions that apply to the plugin's syntax mode - and which were created by the plugin's handle() method.
  • add content to the output document with $renderer->doc .= 'content';
  • ensure any content output by the plugin is safe - run raw wiki data through an entity conversion function.
  • do the minimum possible processing and decision making here, it should all have been done in the handle() method.

:!: There is no guarantee the render() method will be called at the same time as the handle() method. The instructions generated by the handler are cached and can be used by the renderer at a future time. The only sure way to pass data from handle() to render() is using the array it returns - which is passed to render() as the $data parameter.
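The split between the two methods can be pictured with a minimal sketch. It assumes nothing about DokuWiki's internals: example_handle() and example_render() are hypothetical stand-ins for a plugin's handle() and render() methods, the “<<…>>” markup is made up, and serialize() merely stands in for the instruction cache.

```php
<?php
// Sketch only: plain functions stand in for a plugin's handle() and render()
// methods, and serialize() stands in for the instruction cache.

// Hypothetical stand-in for handle(): do the parsing work once.
function example_handle($match) {
    // Strip the surrounding markers, e.g. "<<hello>>" becomes "hello".
    $text = substr($match, 2, -2);
    return array('text' => $text, 'length' => strlen($text));
}

// Hypothetical stand-in for render(): it only sees what is in $data.
function example_render($data) {
    return htmlspecialchars($data['text']) . ' (' . $data['length'] . ')';
}

$cached = serialize(example_handle('<<a&b>>'));   // stored at parse time
$output = example_render(unserialize($cached));   // used at some later time
echo $output, "\n"; // a&amp;b (3)
```

Anything render() needs has to travel inside the returned array, because nothing else survives between the two calls.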

Key Concepts

modes

Modes (or more properly syntax modes) are the foundation on which the DokuWiki parser is based. Every different bit of DokuWiki markup has its own syntax mode. E.g. there is a strong mode for handling strong, a superscript mode for handling superscript, a table mode for processing tables and many more.

When the parser encounters some markup it enters the syntax mode for that markup. The properties and methods of that particular syntax mode govern how the parser behaves while it is within that mode, including:

  • what other syntax modes are allowed to occur
  • what instructions to prepare for the renderer

Your plugin will add its own syntax mode to the parser. DokuWiki handles that automatically when the plugin is first loaded. The name assigned to the mode is plugin_ followed by the name of the plugin's directory (which must also be the plugin's class name without the prefix “syntax_”). Then, when the parser encounters the markup used for your plugin, it will enter that syntax mode. While it is in that mode your plugin controls what the parser can do.
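As a small illustration of the naming rule, the sketch below derives a mode name from a class name. The helper example_mode_name() is hypothetical and is not part of DokuWiki; it only restates the rule described above.

```php
<?php
// Sketch only: derive the syntax mode name from a plugin class name by
// dropping the "syntax_" prefix. example_mode_name() is a made-up helper.
function example_mode_name($class) {
    $prefix = 'syntax_';
    if (strpos($class, $prefix) !== 0) {
        return null; // not a syntax plugin class name
    }
    return substr($class, strlen($prefix));
}

echo example_mode_name('syntax_plugin_now'), "\n"; // plugin_now
```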

mode types

To simplify things, syntax modes which behave in a similar manner have been grouped together into several mode types - a complete list can be found on the syntax plugin page.

Each mode type corresponds to a key in the $PARSER_MODES array. The entry for each mode type is itself an array which holds all the syntax modes which belong to that type. E.g. in vanilla DokuWiki with no plugins installed, $PARSER_MODES['formatting'] holds an array containing: 'strong', 'emphasis', 'underline', 'superscript', 'subscript', 'monospace', 'deleted' & 'footnote'.

When each plugin is loaded into the parser it is queried, via getType(), to discover which mode type it will belong to. The syntax mode associated with the plugin is then added to the appropriate $PARSER_MODES array.

:!: The mode type your plugin reports governs where in a DokuWiki page the parser will recognise your plugin's markup. Other DokuWiki (and plugin) syntax modes won't know about your plugin, but they do know about the different mode types. If they allow a particular mode type, they will allow all the modes which belong to that type, including any plugins that have returned that mode type.

Select the mode type for your plugin by comparing the behaviour of your plugin to that of the standard DokuWiki syntax modes. Choose the type that the most similar modes belong to.

allowed modes

These are the other modes that can occur nested within the current mode's own markup.

Each syntax mode has its own array of allowed modes which tells the parser what other syntax modes will be recognised whilst it is processing the mode. That is, if you want your plugin to be able to occur nested within “**strong**” markup, then the strong mode must include your plugin's mode in its allowedModes array. And if you want to allow strong markup nested within your plugin's markup then your plugin must have 'strong' in its allowedModes array.

:!: Your plugin gets in the allowedModes array of other syntax modes through the mode type it reports using the getType() method.

:!: Your plugin tells the parser which other syntax modes it permits by reporting the mode types it allows via the getAllowedTypes() method.
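The relationship between mode types and allowed modes can be sketched as follows. This is a simplified model, not DokuWiki's own code: the array is a cut-down copy of $PARSER_MODES, and both 'plugin_example' and example_allowed_modes() are hypothetical names.

```php
<?php
// Sketch only: a simplified model of $PARSER_MODES, not DokuWiki's own code.
$PARSER_MODES = array(
    'formatting' => array('strong', 'emphasis', 'underline'),
    'substition' => array('acronym', 'smiley'),
    'disabled'   => array('unformatted'),
);

// A plugin reporting getType() == 'formatting' is added to that type's list.
// 'plugin_example' is a hypothetical mode name.
$PARSER_MODES['formatting'][] = 'plugin_example';

// Expand a list of allowed mode types into the list of allowed modes.
function example_allowed_modes(array $modes_by_type, array $allowed_types) {
    $allowed = array();
    foreach ($allowed_types as $type) {
        if (isset($modes_by_type[$type])) {
            $allowed = array_merge($allowed, $modes_by_type[$type]);
        }
    }
    return $allowed;
}

// A mode that allows 'formatting' now also allows the plugin's mode.
$allowed = example_allowed_modes($PARSER_MODES, array('formatting', 'disabled'));
var_dump(in_array('plugin_example', $allowed)); // bool(true)
var_dump(in_array('smiley', $allowed));         // bool(false)
```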

PType

PType governs how the parser handles html <p> elements when dealing with your syntax mode.

Generally, when the parser encounters some markup, there will be a currently open HTML paragraph tag. The parser needs to know if it should close that tag before entering your syntax mode and then open another paragraph when exiting, that is PType='block', or whether it should leave the paragraphs alone, PType='normal'. There is a third option, PType='stack', which I don't fully understand so I'll leave that for now (FIXME).

For those that know CSS, returning PType='block' means the HTML generated by your plugin will be similar to display:block and returning PType='normal' means the HTML generated will be similar to display:inline.

There is one gotcha with PType='block'. If your plugin allows other syntax modes, the parser will generate </p> & <p> tags when entering and exiting any nested syntax modes. If that causes problems, choose PType='normal' and start the HTML your render method generates with a </p> and finish it with a <p>. — [corrected in DW2006-11-06 version (and preceding release candidates)]

Sort Number

This number is used by the lexer 6) to control the order in which it tests the syntax mode patterns against raw wiki data. It is only important if the patterns belonging to two or more modes match the same raw data. In that case the pattern belonging to the mode with the lowest sort number will win out.

You can make use of this behaviour to write a plugin which will replace or extend a native DokuWiki handler for the same syntax. An example is the code plugin.

Details of existing sort numbers are available for the parser (sort list).
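The effect of the sort number can be modelled with a short sketch. It does not reproduce the lexer; the two mode names, their sort numbers and example_first_match() are all made up for illustration.

```php
<?php
// Sketch only: models the effect of sort numbers, not the lexer itself.
// Both hypothetical modes register a pattern that matches "<code>".
$modes = array(
    array('name' => 'native_code', 'sort' => 200, 'pattern' => '<code>'),
    array('name' => 'plugin_code', 'sort' => 195, 'pattern' => '<code>'),
);

// Modes are tried in ascending order of their sort number.
usort($modes, function ($a, $b) {
    return $a['sort'] - $b['sort'];
});

// The first mode whose pattern matches wins.
function example_first_match(array $modes, $text) {
    foreach ($modes as $mode) {
        if (preg_match('/' . preg_quote($mode['pattern'], '/') . '/', $text)) {
            return $mode['name'];
        }
    }
    return null;
}

echo example_first_match($modes, 'some <code> here'), "\n"; // plugin_code
```

Giving a plugin a slightly lower number than the native mode it wants to override is the idea behind replacing a native handler.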

Patterns

The parser uses PHP's preg7) compatible functions. A detailed explanation of regular expressions and their syntax is beyond the scope of this tutorial. There are many good sources on the web.

The complete preg syntax is not available for use in constructing syntax plugin patterns. Below is a list of the known differences:

  • don't surround the pattern with delimiters
  • to use a pipe “|” for multiple alternatives, make them a non-captured group, e.g. “(?:cat|dog)”
  • be very wary of look behind assertions. The parser only attempts to match patterns on the next piece of “not yet matched” data. If you need to look behind to characters that have been involved in a previous pattern match, those characters will never be there.
  • option flags can only be included as inline options, e.g. (?i), (?-i)
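When developing a pattern it can help to try it outside DokuWiki first. The helper below is a hypothetical sketch for that purpose: it adds the delimiters that plugin patterns must not contain and runs the pattern with preg_match(). It assumes the pattern does not already contain an escaped slash.

```php
<?php
// Sketch only: a hypothetical helper for trying a plugin pattern outside
// DokuWiki. Plugin patterns are written without delimiters, so the helper
// adds them and escapes any "/" inside the pattern.
// Assumption: the pattern contains no already-escaped slash.
function example_try_pattern($pattern, $text) {
    $regex = '/' . str_replace('/', '\/', $pattern) . '/s';
    return preg_match($regex, $text, $m) ? $m[0] : null;
}

// Alternatives go in a non-captured group; flags are given inline.
var_dump(example_try_pattern('(?i)<(?:cat|dog)>', 'a <DOG> barks'));
// string(5) "<DOG>"

// An entry pattern with a look-ahead for its exit pattern.
var_dump(example_try_pattern('<color.*?>(?=.*?</color>)', '<color red>hi</color>'));
// string(11) "<color red>"
```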


The parser provides four functions for a plugin to register the patterns it needs. Each function corresponds to a pattern with a different meaning.

  • special patterns, addSpecialPattern(): these are the patterns used when one pattern is all that is required. In the parser's terms, these patterns represent entry into the plugin's syntax mode and exit from that syntax mode all in the one match. Typically these are used by substition plugins.
  • entry patterns, addEntryPattern(): the pattern which indicates the start of data to be handled by the plugin. Typically these patterns should include a look-ahead to ensure there is also an exit pattern. Any plugin which registers an entry pattern should also register an exit pattern.
  • exit patterns, addExitPattern(): the pattern which indicates the end of the data to be handled by the plugin. This pattern can only be matched if text matching the entry pattern has been found.
  • internal patterns, addPattern(): these represent special syntax applicable to the plugin that may occur between the entry and exit patterns. Generally these are only required by the more complex structures, e.g. lists and tables.


One plugin may add several patterns to the parser, including more than one pattern of the same type.
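A rough sketch of how the registration calls fit together is given below. ExampleLexer is a hypothetical stand-in that only records calls, so that the sketch runs on its own; in a real plugin $this->Lexer is supplied by DokuWiki. The “<box>” syntax and the plugin_box mode name are made up, and the two-argument form of addPattern() is assumed here to mirror addExitPattern().

```php
<?php
// Sketch only: ExampleLexer is a hypothetical stand-in that records calls.
// In a real plugin, $this->Lexer is supplied by DokuWiki.
class ExampleLexer {
    public $calls = array();
    function addSpecialPattern($pattern, $mode, $special) {
        $this->calls[] = array('special', $pattern, $mode, $special);
    }
    function addEntryPattern($pattern, $mode, $new_mode) {
        $this->calls[] = array('entry', $pattern, $mode, $new_mode);
    }
    function addExitPattern($pattern, $mode) {
        $this->calls[] = array('exit', $pattern, $mode);
    }
    function addPattern($pattern, $mode) {
        $this->calls[] = array('internal', $pattern, $mode);
    }
}

// Hypothetical plugin fragment: "<box>" ... "</box>" with "|" as an
// internal separator between the entry and exit patterns.
class ExampleBoxMode {
    public $Lexer;

    function connectTo($mode) {
        $this->Lexer->addEntryPattern('<box>(?=.*?</box>)', $mode, 'plugin_box');
    }

    function postConnect() {
        $this->Lexer->addPattern('\|', 'plugin_box');
        $this->Lexer->addExitPattern('</box>', 'plugin_box');
    }
}

$plugin = new ExampleBoxMode();
$plugin->Lexer = new ExampleLexer();
$plugin->connectTo('base');
$plugin->postConnect();

foreach ($plugin->Lexer->calls as $call) {
    echo implode('  ', $call), "\n";
}
```

The entry pattern is registered against the surrounding mode passed to connectTo(), while the internal and exit patterns are registered against the plugin's own mode.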

Tips

  • use non-greedy quantifiers, e.g. +? or *? instead of + or *.
  • be wary of using multiple exit patterns. The first exit pattern encountered will most likely trigger the parser to exit your syntax mode - even if that wasn't the pattern the entry pattern looked ahead for. Needing multiple exit patterns probably indicates a need for multiple plugins.
  • early versions of the DokuWiki lexer had a bug which prevented use of “<” or “>” in look ahead patterns. This bug has been fixed and angle brackets can now be used. Some plugins will still contain the hex codes for angle brackets (“\x3C”, “\x3E”) which was the workaround to overcome the effects of this bug.
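The first tip is easy to see with plain preg_match(), outside of any plugin. The sample text is made up; delimiters are added here only because preg_match() requires them.

```php
<?php
// Sketch only: compares a greedy and a non-greedy quantifier on sample text.
$text = '<color red>one</color> and <color blue>two</color>';

preg_match('/<color.*>/', $text, $greedy);  // greedy: runs to the last ">"
preg_match('/<color.*?>/', $text, $lazy);   // non-greedy: stops at the first ">"

echo $greedy[0], "\n"; // the whole sample text
echo $lazy[0], "\n";   // <color red>
```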

handle() method

This is the part of your plugin which should do all the work. Before DokuWiki renders the wiki page it creates a list of instructions for the renderer. The plugin's handle() method generates the render instructions for the plugin's own syntax mode. At some later time, these will be interpreted by the plugin's render() method. The instruction list is cached and can be used many times, making it sensible to maximize the work done once by this function and minimize the work done many times by render().

$match parameter — The text matched by the patterns, or in the case of DOKU_LEXER_UNMATCHED the contiguous piece of ordinary text which didn't match any pattern.

$state parameter — The lexer state for the match, representing the type of pattern which triggered this call to handle():

  • DOKU_LEXER_ENTER — a pattern set by addEntryPattern()
  • DOKU_LEXER_MATCHED — a pattern set by addPattern()
  • DOKU_LEXER_EXIT — a pattern set by addExitPattern()
  • DOKU_LEXER_SPECIAL — a pattern set by addSpecialPattern()
  • DOKU_LEXER_UNMATCHED — ordinary text encountered within the plugin's syntax mode which doesn't match any pattern.

$pos parameter — The character position of the matched text.

&$handler parameter — Object Reference to the Doku_Handler object.

render() method

The part of the plugin that provides the output for the final web page - or whatever other output format is supported. It is here that the plugin adds its output to that already generated by other parts of the renderer - by concatenating its output to the renderer's doc property. e.g.

$renderer->doc .= "some plugin output...";

:!: Any raw wiki data that passes through render() should have all special characters converted to HTML entities. You can use the PHP functions htmlspecialchars() and htmlentities(), or the renderer's own _xmlEntities() method. e.g.

$renderer->doc .= $renderer->_xmlEntities($text);

$mode parameter — Name for the format mode of the final output produced by the renderer. At present DokuWiki only supports one output format - XHTML 8). New modes can be introduced by renderer plugins. The plugin should only produce output for those formats which it supports - which means this function should be structured …

if ($mode == 'xhtml') {  // supported mode
  // code to generate XHTML output from instruction $data
}

$data parameter — An array containing the instructions previously prepared by the plugin's own handle() method. This function must interpret the instruction and generate the appropriate output.
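The structure described above can be tried in isolation with a small sketch. ExampleRenderer is a hypothetical stand-in that models only the doc property, and example_render() is a plain function playing the role of a plugin's render() method.

```php
<?php
// Sketch only: ExampleRenderer is a hypothetical stand-in for a renderer;
// only the doc property is modelled.
class ExampleRenderer {
    public $doc = '';
}

// Plays the role of a plugin's render() method.
function example_render($mode, $renderer, $data) {
    if ($mode == 'xhtml') {  // supported mode
        $renderer->doc .= '<b>' . htmlspecialchars($data[0]) . '</b>';
        return true;
    }
    return false;            // unsupported format: add nothing
}

$renderer = new ExampleRenderer();
var_dump(example_render('xhtml', $renderer, array('a < b')));    // bool(true)
var_dump(example_render('metadata', $renderer, array('a < b'))); // bool(false)
echo $renderer->doc, "\n"; // <b>a &lt; b</b>
```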

Safety & Security

Raw wiki page data which reaches your plugin has not been processed at all. No further processing is done on the output after it leaves your plugin. At an absolute minimum the plugin should ensure any raw data output has all HTML special characters converted to HTML entities. Also any wiki data extracted and used internally should be treated with suspicion. See also security.
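As a minimal illustration of entity conversion, the sketch below runs a made-up piece of raw input through PHP's htmlspecialchars(). Inside a plugin the renderer's _xmlEntities() method serves the same purpose; the flags chosen here are only one reasonable option.

```php
<?php
// Sketch only: converts HTML special characters in untrusted input.
$raw  = '<script>alert("x")</script> & more';
$safe = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; &amp; more
```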

Localization

FIXME

For now refer to localisation & plugin file structure

Configuration

Please refer to configuration.

Using Styles and JavaScript

FIXME

For now refer to plugin file structure

Adding a Toolbar Button

To make it easy on the users of wikis which install your plugin, you should add a button for its syntax to the editor toolbar.

See the Action plugin page, sample_action_plugin_2. Also refer to toolbar.

Writing Your Own Plugin

Ok, so you have decided you want to extend DokuWiki's syntax with your own plugin. You have worked out what that syntax will be and how it should be rendered on the user's browser. Now you need to write the plugin.

  1. Decide on a name for the plugin. You may want to check the list of available plugins to make sure you aren't choosing a name that is already in use.
  2. In your own DokuWiki installation, create a new sub directory in the lib/plugins/ directory. That directory will have the same name as your plugin.
  3. Create the file syntax.php in the new directory. As a starting point, use a copy of the skeleton plugin.
  4. Edit that file to make it yours.
    • change the class name to be syntax_plugin_<your plugin name> 9).
    • change the getInfo() to report information about your plugin.
    • change the getType() method to report the mode type your plugin will belong to.
    • add a getAllowedTypes() method to report any mode types your plugin will allow to be nested within its own syntax. If your plugin won't allow any other mode then this can be left out.
    • change the getPType() method to report the PType that will apply for your plugin. If it's 'normal' you can remove this method.
    • change the getSort() method to report a unique number after checking the list of plugins
    • alter the connectTo() method to register the pattern to match your syntax.
    • add a postConnect() method if your syntax has a second pattern to say when the parser is leaving your syntax mode.
  5. That's the easy part done, you now have a plugin that will say “Hello World!” when it encounters your syntax pattern. Time to test it and make sure the pattern works as expected - visit your wiki and make up a page with the syntax for your plugin, save it and make sure “Hello World!” shows up.
  6. Write your handle() & render() methods.
    • if you have entry and exit patterns remember to handle the unmatched data.
    • treat raw wiki data with suspicion and remember to ensure all special characters go to an entity converter.
  7. Test and post your completed plugin on the DokuWiki plugin page.

Sample Plugin 1 - Now

When its syntax, [NOW], is encountered in a wiki page the current date and time will be inserted in RFC2822 format.

  • type is 'substition'. We are substituting a time stamp for the [NOW] token, similar to the way smileys and acronyms are handled. They belong to the mode type 'substition' so we will too.
  • allowedTypes are not required, no other DokuWiki syntax can occur within our [NOW] syntax. Therefore we don't need the getAllowedTypes() method.
  • PType is normal, that's the default value, so we don't need the getPType() method.
  • there is no need for an entry and exit pattern, just a special pattern to detect [NOW]. The only thing we need to be careful of is “[” and “]” have special meanings in regular expressions, so we will need to escape them, making our pattern - '\[NOW\]'.
  • in this case the handle() method doesn't need to do anything. We have no special states to take care of or extra parameters in our syntax. We just return an array to ensure a render instruction for our plugin is stored; render() does not use its contents.
  • all the render() method needs to do is add the time stamp to the current wiki page — $renderer->doc .= date('r');

And that's our plugin finished.

syntax.php
<?php
/**
 * Plugin Now: Inserts a timestamp.
 * 
 * @license    GPL 2 (http://www.gnu.org/licenses/gpl.html)
 * @author     Christopher Smith <chris@jalakai.co.uk>
 */
 
// must be run within DokuWiki
if(!defined('DOKU_INC')) die();
 
if(!defined('DOKU_PLUGIN')) define('DOKU_PLUGIN',DOKU_INC.'lib/plugins/');
require_once DOKU_PLUGIN.'syntax.php';
 
/**
 * All DokuWiki plugins to extend the parser/rendering mechanism
 * need to inherit from this class
 */
class syntax_plugin_now extends DokuWiki_Syntax_Plugin {
 
    function getInfo() {
        return array('author' => 'me',
                     'email'  => 'me@someplace.com',
                     'date'   => '2005-07-28',
                     'name'   => 'Now Plugin',
                     'desc'   => 'Include the current date and time',
                     'url'    => 'http://www.dokuwiki.org/plugin:tutorial');
    }
 
    function getType() { return 'substition'; }
    function getSort() { return 32; }
 
    function connectTo($mode) {
        $this->Lexer->addSpecialPattern('\[NOW\]',$mode,'plugin_now');
    }
 
    function handle($match, $state, $pos, &$handler) {
        return array($match, $state, $pos);
    }
 
    function render($mode, &$renderer, $data) {
        if($mode == 'xhtml'){
            $renderer->doc .= date('r');
            return true;
        }
        return false;
    }
}

Note: due to the way DokuWiki caches pages this plugin will report the date/time at which the cached version was created. You would need to add ~~NOCACHE~~ to the page to ensure the date was current every time the page was requested.
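To see what the 'r' format produces, the sketch below formats a fixed timestamp. The timestamp 0 and the UTC time zone are chosen only so that the result is repeatable; the plugin itself formats the current time.

```php
<?php
// Sketch only: a fixed timestamp and UTC are used so the result is repeatable.
date_default_timezone_set('UTC');

$stamp = date('r', 0);
echo $stamp, "\n"; // Thu, 01 Jan 1970 00:00:00 +0000
```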

Sample Plugin 2 - Color

When its syntax, <color somecolour/somebackgroundcolour>, is encountered in a wiki page the text colour will be changed to somecolour, the background will be changed to somebackgroundcolour and both will remain that way until </color> is encountered.

  • what we are doing is similar to the strong mode, its type is 'formatting' so we should use that type too.
  • allowedTypes should be the inline modes - substition, formatting & disabled.
  • PType is normal, that's the default value, so again we don't need a getPType() method.
  • we need to use an entry and exit pattern. The entry pattern should check to make sure there is an exit pattern, which means '<color.*?>(?=.*?</color>)'. Note the non-greedy .*?, which stops the match at the first “>”. The exit pattern is simpler, </color>.
  • the handle() method will need to deal with three states: enter and exit, matching our entry and exit patterns, and unmatched, for the text which occurs between them.
    • DOKU_LEXER_ENTER state requires some processing to extract the colour and background colour values, they make up our render instruction.
    • DOKU_LEXER_UNMATCHED state doesn't require any processing, but we have to pass the unmatched text (in $match) to render() so that goes into our render instruction.
    • DOKU_LEXER_EXIT state doesn't require any processing or have any special data, we simply need to generate an exit instruction for render().
  • the render() method will need to deal with the same three states as handle().
    • DOKU_LEXER_ENTER, open a span with a style using the colour and/or background colour values.
    • DOKU_LEXER_UNMATCHED, add the unmatched text to the output document.
    • DOKU_LEXER_EXIT, close the span
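The processing needed for the DOKU_LEXER_ENTER state can be sketched on its own. The function example_parse_color_tag() is hypothetical and only illustrates the splitting step; it is not the code the plugin uses, and validation of the colour values is left out.

```php
<?php
// Sketch only: extracts the colour and background colour from the text
// matched by the entry pattern. No validation is done here.
function example_parse_color_tag($match) {
    // Drop the leading "<color" (6 characters) and the trailing ">".
    $inner = trim(substr($match, 6, -1));
    // Split into at most two parts; pad so a missing background becomes ''.
    $parts = array_pad(explode('/', $inner, 2), 2, '');
    return array(trim($parts[0]), trim($parts[1]));
}

print_r(example_parse_color_tag('<color red/#ff0>')); // red and #ff0
print_r(example_parse_color_tag('<color blue>'));     // blue and ''
```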

Again, all fairly straightforward - and here it is.

<?php
/**
 * Plugin Color: Sets new colors for text and background.
 * 
 * @license    GPL 2 (http://www.gnu.org/licenses/gpl.html)
 * @author     Christopher Smith <chris@jalakai.co.uk>
 */
 
// must be run within Dokuwiki
if(!defined('DOKU_INC')) die();
 
if(!defined('DOKU_PLUGIN')) define('DOKU_PLUGIN',DOKU_INC.'lib/plugins/');
require_once(DOKU_PLUGIN.'syntax.php');
 
/**
 * All DokuWiki plugins to extend the parser/rendering mechanism
 * need to inherit from this class
 */
class syntax_plugin_color extends DokuWiki_Syntax_Plugin {
 
    /**
     * return some info
     */
    function getInfo(){
        return array(
            'author' => 'Christopher Smith',
            'email'  => 'chris@jalakai.co.uk',
            'date'   => '2008-02-06',
            'name'   => 'Color Plugin',
            'desc'   => 'Changes text colour and background',
            'url'    => 'http://www.dokuwiki.org/plugin:tutorial',
        );
    }
 
    function getType(){ return 'formatting'; }
    function getAllowedTypes() { return array('formatting', 'substition', 'disabled'); }   
    function getSort(){ return 158; }
    function connectTo($mode) { $this->Lexer->addEntryPattern('<color.*?>(?=.*?</color>)',$mode,'plugin_color'); }
    function postConnect() { $this->Lexer->addExitPattern('</color>','plugin_color'); }
 
 
    /**
     * Handle the match
     */
    function handle($match, $state, $pos, &$handler){
        switch ($state) {
          case DOKU_LEXER_ENTER :
                list($color, $background) = array_pad(preg_split("/\//u", substr($match, 6, -1), 2), 2, '');
                if ($color = $this->_isValid($color)) $color = "color:$color;";
                if ($background = $this->_isValid($background)) $background = "background-color:$background;";
                return array($state, array($color, $background));
 
          case DOKU_LEXER_UNMATCHED :  return array($state, $match);
          case DOKU_LEXER_EXIT :       return array($state, '');
        }
        return array();
    }
 
    /**
     * Create output
     */
    function render($mode, &$renderer, $data) {
        if($mode == 'xhtml'){
            list($state,$match) = $data;
            switch ($state) {
              case DOKU_LEXER_ENTER :      
                list($color, $background) = $match;
                $renderer->doc .= "<span style='$color $background'>"; 
                break;
 
              case DOKU_LEXER_UNMATCHED :  $renderer->doc .= $renderer->_xmlEntities($match); break;
              case DOKU_LEXER_EXIT :       $renderer->doc .= "</span>"; break;
            }
            return true;
        }
        return false;
    }
 
    // validate color value $c
    // this is cut price validation - only to ensure the basic format is correct and there is nothing harmful
    // three basic formats  "colorname", "#fff[fff]", "rgb(255[%],255[%],255[%])"
    function _isValid($c) {
        $c = trim($c);
 
        $pattern = "/^\s*(
            ([a-zA-Z]+)|                                #colorname - not verified
            (\#([0-9a-fA-F]{3}|[0-9a-fA-F]{6}))|        #colorvalue
            (rgb\(([0-9]{1,3}%?,){2}[0-9]{1,3}%?\))     #rgb triplet
            )\s*$/x";
 
        if (preg_match($pattern, $c)) return trim($c);
 
        return "";
    }
}
?>

Note: No checking is done to ensure colour names are valid or RGB values are within correct ranges.
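If range checking were wanted, one possible approach is sketched below. This is not part of the plugin above: example_rgb_in_range() is a hypothetical helper which assumes the limits are 255 for plain values and 100 for percentages.

```php
<?php
// Sketch only: a hypothetical range check for "rgb(r,g,b)" values.
// Assumed limits: 0-255 for plain numbers, 0-100 for percentages.
function example_rgb_in_range($c) {
    $pattern = '/^rgb\((\d{1,3})(%?),(\d{1,3})(%?),(\d{1,3})(%?)\)$/';
    if (!preg_match($pattern, $c, $m)) {
        return false; // not an rgb() triplet at all
    }
    // Groups 1, 3, 5 hold the numbers; 2, 4, 6 hold the optional "%".
    foreach (array(1, 3, 5) as $i) {
        $max = ($m[$i + 1] === '%') ? 100 : 255;
        if ((int) $m[$i] > $max) {
            return false;
        }
    }
    return true;
}

var_dump(example_rgb_in_range('rgb(255,0,0)'));    // bool(true)
var_dump(example_rgb_in_range('rgb(300,0,0)'));    // bool(false)
var_dump(example_rgb_in_range('rgb(100%,0%,50%)')); // bool(true)
```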

1)
defined in lib/plugins/syntax.php
2) , 4)
defined in inc/parser/parser.php
3)
See Doku_Handler_Block
5)
Yes this is spelled wrong, but we won't change it to avoid breaking existing plugins. Sometimes a typo becomes a standard - see the HTTP “referer” header for an example
6)
the part of the parser which analyses the raw wiki page
7)
perl compatible regular expressions
ref: www.php.net/manual/en/ref.pcre.php
8)
There is also the special mode metadata that doesn't output anything but collects metadata for the page. Use it to insert values into the metadata array. See the translation plugin for an example.
9)
The name may not contain underscores and needs to match your class name
devel/syntax_plugins.1259175914.txt.gz · Last modified: 2009-11-25 20:05 by 204.58.246.49

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International