This event is signalled by tokenizer() in inc/indexer.php when a page or a search term is about to be split into words. Handlers can use it to change how words are detected. The default action uses a regular expression to separate Asian characters into single words.
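The idea behind the default action can be illustrated with a short, self-contained sketch. The character ranges below are a simplified stand-in for DokuWiki's actual pattern, chosen only for illustration:

```php
<?php
// Separate CJK/kana characters into single space-delimited words, the way
// the default action does. (Assumption: the character ranges are a
// simplified stand-in for DokuWiki's internal regular expression.)
$text = '日本語のテキスト and some latin';
$out  = preg_replace('/([\x{4E00}-\x{9FFF}\x{3040}-\x{30FF}])/u', ' $1 ', $text);

// Every CJK/kana character is now a standalone token; Latin words are
// left untouched. Collapse the doubled spaces for display:
echo trim(preg_replace('/\s+/', ' ', $out)), "\n";
// prints: 日 本 語 の テ キ ス ト and some latin
```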
If you intercept this event, you should also register your plugin's version through the INDEXER_VERSION_GET event, so that existing indexes are rebuilt with your changes.
$data contains the string before it is split into words. The source of the string is either the text of a page or an individual term of a search query. Your plugin should modify the text so that words are separated by spaces or newlines.
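Putting the pieces together, a handler might look like the sketch below. The plugin name, method names, and the underscore-splitting example are illustrative assumptions, not part of this page; the wiring via register_hook() and $event->data is the standard action-plugin mechanism:

```php
<?php
// Sketch of an action plugin intercepting INDEXER_TEXT_PREPARE.
// (Assumptions: plugin name "myplugin" and the underscore example are
// hypothetical; only the event names and $event->data usage come from
// the documentation.)
//
//   class action_plugin_myplugin extends DokuWiki_Action_Plugin {
//       public function register(Doku_Event_Handler $controller) {
//           $controller->register_hook('INDEXER_TEXT_PREPARE', 'BEFORE',
//                                      $this, 'handleTextPrepare');
//           // also announce a version so existing indexes are rebuilt
//           $controller->register_hook('INDEXER_VERSION_GET', 'BEFORE',
//                                      $this, 'handleVersionGet');
//       }
//       public function handleTextPrepare(Doku_Event $event, $param) {
//           $event->data = prepare_text($event->data);
//       }
//       public function handleVersionGet(Doku_Event $event, $param) {
//           $event->data['plugin_myplugin'] = '1';
//       }
//   }

// The actual transformation: index the parts of snake_case identifiers
// as separate words by turning underscores into spaces.
function prepare_text($text) {
    return str_replace('_', ' ', $text);
}

echo prepare_text("foo_bar baz"), "\n"; // prints: foo bar baz
```

The handler only rewrites the string in $event->data; the actual splitting into words still happens afterwards in tokenizer(), which is why separating tokens with spaces or newlines is sufficient.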
- Code related to this event as used in DokuWiki's files, plugins and templates
devel/event/indexer_text_prepare.txt · Last modified: 2018-12-08 15:41 by torpedo