
SearchPattern Plugin

Compatible with DokuWiki

  • 2013-12-08 "Binky" unknown
  • 2013-05-10 "Weatherwax" yes
  • 2012-10-13 "Adora Belle" yes
  • 2012-01-25 "Angua" unknown

Search a specified pattern inside wiki pages (standard or regex) and display the results as a table inside a wiki page

Provides
Syntax, Admin

Tagged with pattern, search, searchpattern

Information

Version 2013-06-16c of the SearchPattern plugin has been released. Regex mode now has an option to output all matches or only the specified ones, and the plugin can call a handler method of another plugin to output the matches. A new admin option 'dispheadl' lets you hide the table headline that shows the search options (e.g. the regex code) in the result table. The French translation and the bad-option settings have been fixed.
The handler mechanism is used by version 2013-04-13 of the Todo Plugin.
Update instructions: version 2013-06-16b is fully compatible with previous versions, so updating to the newest version should not affect existing functionality and needs no additional configuration steps.

Ver. 0.2 of SearchPattern plugin has been released. You are now able to restrict the search area to selectable pages/namespaces. All feedback is very welcome in the dedicated section.

Ver. 0.2a of SearchPattern plugin has been released. It is a bug fix release based on ver. 0.2. It doesn't include Leo's changes.

A bug in all versions up to 2013-04-15 leaves the “admin” menu empty if you are using the French translation.
To fix it, in lang/fr/lang.php, line 15, replace:
$lang['ndqerr_admin_title'] = 'En cas d\'erreur sur des apostrophes non doublécute;es' :;
with:
$lang['ndqerr_admin_title'] = 'En cas d\'erreur sur des apostrophes non doublécute;es :';

Thanks to Frédéric for reporting it. Fixed in version 2013-06-16. (leibler, 2013/06/16 10:27)

Download and Installation

Download and install the plugin with the Plugin Manager, using the following URL. Refer to Plugins for instructions on installing plugins manually.

Latest release: 2013-06-16c
Bug-fixed former release: dw_searchpattern_latest.tar.gz

Release history

Version | Date | Download link | Comment
0.1 | 2010/01/07 | dw_searchpattern_0_1.tar.gz | Initial release
0.2 | 2010/07/05 | dw_searchpattern_0_2.tar.gz | Add capability to restrict search area; backward compatible with ver 0.1
0.2a | 2013/04/15 | dw_searchpattern_0_2a.tar.gz | Fix a bug preventing users with special ACLs from viewing results; prevent PHP warnings through more extensive checks on variables
0.2b | 2013/06/17 | dw_searchpattern_0_2b.tar.gz | Fix a bug where $INFO was NULL, which prevented use of the short '-' and '-:' options (seems to have appeared with the latest DokuWiki release)
2013-04-11 | 2013/04/11 | dokuwiki-searchpattern_20130411.zip | Use the newest version; add regex match output with new option $ (dollar); add callback handler method to call another plugin for formatting the output with new option _ (underscore); remove getInfo() call because it is handled by plugin.info.txt (since DokuWiki 2009-12-25 “Lemming”); reformat inline documentation of syntax.php file; check compatibility with DokuWiki releases 2012-10-13 “Adora Belle” and 2013-03-06 “Weatherwax RC1”; add description / comments and syntax howto about integration with DokuWiki plugin 'todo'; bugfix: encode HTML code (security risk <script>alert('hi')</script>) when using regex and the match output $ (dollar) option; change description / comments and syntax howto about integration with DokuWiki plugin 'todo'
2013-04-12 | 2013/04/12 | dokuwiki-searchpattern_20130412.zip | Bugfix: incorrect if statement using $ syntax; example using FIXME and $3,1; bugfix: use parameter call_plugin_handler and fall back to lowercase if plugin not found
2013-04-15 | 2013/04/15 | dokuwiki-searchpattern_20130415.zip | Bugfix: incorrect XHTML or PHP warnings in log; bugfix: quickaclcheck only handles pages (not namespaces); translation of language file to German
2013-06-16 | 2013/06/16 | dokuwiki-searchpattern_20130616.zip | Bugfix: fr language (French translation) file
2013-06-16b | 2013/06/16 | dokuwiki-searchpattern_20130616b.zip | Bugfix: admin bad-options setting when using the call plugin handler option; add new admin setting 'dispheadl' to hide search options/regex headline in result table
2013-06-16c | 2013/06/16 | dokuwiki-searchpattern_20130616c.zip | Additional French translations, thanks to Frédéric

Usage Summary

All single quotes used inside the search pattern must be doubled (i.e. ' {single quote} becomes '' {2 successive single quotes}).

~~SEARCHPATTERN:'my_string'~~ (search "my_string" occurrences in a case sensitive way)
~~SEARCHPATTERN;'my_string'~~ (search "my_string" occurrences in a case insensitive way)
~~SEARCHPATTERN#'my_regex'~~ (search "my_regex" occurrences - regex shall be a PHP one, i.e. PCRE regex with opening and closing slashes)

Since version 0.2, options can be passed by placing them between two pairs of question marks (??) at the end of the search pattern.

~~SEARCHPATTERN(:|;|#)'my_search'??option1 option2 option3??~~ (search "my_search" taking into account "option1", "option2" and "option3")

Please see the usage guide below in this document to obtain deeper explanations about available options.
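
The quoting rule and the call syntax above can be combined in a small helper. The following is only a minimal sketch in PHP: the function name build_searchpattern_call() is hypothetical and not part of the plugin, and the namespace "doc1:" is just an example value.

```php
<?php
/**
 * Hypothetical helper (NOT part of the SearchPattern plugin): builds a
 * ~~SEARCHPATTERN...~~ call from a raw, unescaped pattern.
 *
 * @param string $pattern raw search string or PCRE regex
 * @param string $mode    ':' (case sensitive), ';' (case insensitive) or '#' (regex)
 * @param array  $options option strings such as '+namespace:' or '-page'
 * @return string wiki markup to paste into a page
 */
function build_searchpattern_call($pattern, $mode = ':', array $options = array()) {
    if (!in_array($mode, array(':', ';', '#'), true)) {
        throw new InvalidArgumentException("Unknown mode: $mode");
    }
    // Every single quote inside the pattern must be doubled.
    $escaped = str_replace("'", "''", $pattern);
    $call = '~~SEARCHPATTERN' . $mode . "'" . $escaped . "'";
    // Options go between two pairs of question marks, right after the closing quote.
    if (count($options) > 0) {
        $call .= '?? ' . implode(' ', $options) . ' ??';
    }
    return $call . '~~';
}

echo build_searchpattern_call("don't"), "\n";
// prints: ~~SEARCHPATTERN:'don''t'~~
echo build_searchpattern_call('FIXME', ';', array('+doc1:')), "\n";
// prints: ~~SEARCHPATTERN;'FIXME'?? +doc1: ??~~
```
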

Since version 2013-04-11, integration with the new version of the Todo Plugin is available.
Use the following searchpattern expressions to display all todos on a single page.

Detailed documentation with screenshots can be found at http://www.eibler.at/dokuwiki-todo/index_en.php#searchpatternintegration

Use this searchpattern expression for open todos:

 ~~SEARCHPATTERN#'/<todo[^#>]*>.*?<\/todo[\W]*?>/'?? _ToDo ??~~

Use this searchpattern expression for completed todos:

 ~~SEARCHPATTERN#'/<todo[^#>]*#[^>]*>.*?<\/todo[\W]*?>/'?? _ToDo ??~~

Use this searchpattern expression to display all assigned todos:

 ~~SEARCHPATTERN#'/<todo[^@>]*@([^\W]+)[^#>]*(#)?[^>]*>(.*?)<\/todo[\W]*?>/'?? $2,1,3 ??~~
Ver. 2013-04-11 overview over all tasks on single page using searchpattern plugin

DON'T FORGET TO TURN OFF CACHING

 ~~NOCACHE~~

List of available options

Option | Plugin version | Plugin behavior | Comment
+page | 0.2+ | Search will be restricted to page “page” |
+namespace: | 0.2+ | Search will be restricted to namespace “namespace” | Don't forget the ':' char at the end of the name
+ | 0.2+ | Search will be restricted to current page |
+: | 0.2+ | Search will be limited to current namespace |
-page | 0.2+ | Page “page” will be excluded from the search |
-namespace: | 0.2+ | Namespace “namespace” will be excluded from the search | Don't forget the ':' char at the end of the name
- | 0.2+ | Current page will be excluded from the search |
-: | 0.2+ | Current namespace will be excluded from the search |
$<List> (dollar) | 2013-04-11+ | Display the listed matches of the regular expression, e.g. $1,3,2 will display the first, third and second match | Only works in regex mode (#)
_<PluginName> (underscore) | 2013-04-11+ | Call the method _searchpatternHandler of the specified plugin to display the regex search results |

Screenshots

These screenshots show all possible warning/error messages; you can choose whether or not to display them.

Ver 2013-04-12: Target plugin does not support _searchpatternHandler() method

Ver. 2013-04-12 screenshot

Ver 0.2

Ver. 0.2 screenshot

Ver 0.1

Ver. 0.1 screenshot

Usage Guide

The plugin is called by inserting a special pattern in the page at the position where the result table should appear. SearchPattern supports 2 types of search: standard and regex.
All single quotes inside the search pattern must be doubled for the plugin to work properly.
Also note that the search is performed on the raw text of pages (not on the rendered output).

In standard mode, the plugin simply looks for all occurrences of the raw pattern string.
The characters '*' and '?' are treated as a literal asterisk and a literal question mark, NOT as wildcards.
Standard mode allows 2 types of search: case sensitive and case insensitive.
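
To illustrate what literal matching means here, the following PHP sketch counts occurrences with a plain substring search. It is an illustration only: count_literal() is a hypothetical name, and the plugin's own implementation may work differently.

```php
<?php
// Illustration only: a plain substring count, which treats '*' and '?' as
// ordinary characters. The plugin's real implementation may differ.
function count_literal($text, $needle, $caseSensitive = true) {
    if (!$caseSensitive) {
        // Lowercasing both sides is enough for the ASCII text in this sketch.
        $text   = strtolower($text);
        $needle = strtolower($needle);
    }
    return substr_count($text, $needle);
}

$raw = "FIXME: a*b is wrong. fixme: axxb is fine.";
echo count_literal($raw, 'FIXME'), "\n";        // case sensitive: 1
echo count_literal($raw, 'FIXME', false), "\n"; // case insensitive: 2
echo count_literal($raw, 'a*b'), "\n";          // '*' is literal, "axxb" is not counted: 1
```
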

For a case-sensitive search, the plugin is called with the following syntax:

~~SEARCHPATTERN:'my_string'~~

Note that the pattern is enclosed in single quotes.

Examples of use:

~~SEARCHPATTERN:'FIXME'~~ (pages where "FIXME" is used)
~~SEARCHPATTERN:'don''t'~~ (pages where "don't" is used - single quote is doubled)

For a case-insensitive search, the plugin is called with the following syntax:

~~SEARCHPATTERN;'my_string'~~

Note that the pattern is enclosed in single quotes.

Examples of use:

~~SEARCHPATTERN;'FIXME'~~ (pages where "FIXME","fixme","Fixme","FixMe","fIxMe",... are used)
~~SEARCHPATTERN;'don''t'~~ (pages where "don't","Don't","Don'T","dON't",... are used - single quote is doubled)

In regex mode, the plugin looks for all occurrences matching the regex pattern.
The regex format must be compatible with the “preg_match” PHP function (i.e. a PCRE regex with opening and closing slashes).
Only the modifiers 'm', 's', 'i', 'x' and 'g' can be added at the end of the regex.

The plugin is called with the following syntax:

~~SEARCHPATTERN#'my_regex'~~

Note that the pattern is enclosed in single quotes.

Examples of use:

~~SEARCHPATTERN#'/[dw]on\''t/i'~~ (pages where "don't" or "won't" are used, case insensitive - single quote is doubled)

In regex mode, the $ (dollar) option outputs the regex matches instead of the number of matches on each page. Using $<comma-separated list of matches> outputs only the selected matches.
Example of use:

~~SEARCHPATTERN#'/FIXME[ \t]+([^ \t\n\r]+)([ \t]*)?([^ \t\n\r]+)?/i'?? $ ??~~
   will output all matches (including the matches in brackets)

www.eibler.at_dokuwiki-todo_14-en_searchpattern_dollar-list_option.jpg

   
~~SEARCHPATTERN#'/FIXME[ \t]+([^ \t\n\r]+)([ \t]*)?([^ \t\n\r]+)?/i'?? $3,1 ??~~
   will output only the matches from the 3rd and 1st brackets (in that order: third, first)

www.eibler.at_dokuwiki-todo_15-en_searchpattern_dollar-listmatch_option.jpg
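
The selection performed by a list such as $3,1 can be sketched in PHP as follows. This is a minimal illustration, assuming that the numbers refer to the regex capture groups (the "brackets" mentioned above); select_groups() is a hypothetical function, not the plugin's code.

```php
<?php
// Sketch only: pick capture groups in the order given by a list such as "3,1".
function select_groups($regex, $text, $list) {
    $wanted = array_map('intval', explode(',', $list));
    $rows = array();
    // PREG_SET_ORDER yields one array per match: index 0 is the whole match,
    // index n is the n-th capture group.
    preg_match_all($regex, $text, $found, PREG_SET_ORDER);
    foreach ($found as $match) {
        $row = array();
        foreach ($wanted as $n) {
            // Trailing optional groups that did not participate are absent.
            $row[] = isset($match[$n]) ? $match[$n] : '';
        }
        $rows[] = $row;
    }
    return $rows;
}

$regex = '/FIXME[ \t]+([^ \t\n\r]+)([ \t]*)?([^ \t\n\r]+)?/i';
print_r(select_groups($regex, 'FIXME broken link', '3,1'));
// The single match yields array('link', 'broken'): third group first, then the first group.
```
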

Integration with other Modules _searchpatternHandler()

The _ (underscore) option outputs the regex matches through a handler method implemented in another syntax plugin. The handler has the following form:

/*
** @brief This function can be called by the DokuWiki plugin searchpattern to process the todos found by searchpattern.
** Use this searchpattern expression for open todos: ~~SEARCHPATTERN#'/<todo[^#>]*>.*?<\/todo[\W]*?>/'?? _ToDo ??~~
** Use this searchpattern expression for completed todos: ~~SEARCHPATTERN#'/<todo[^#>]*#[^>]*>.*?<\/todo[\W]*?>/'?? _ToDo ??~~
** This handler method uses the table and layout with CSS classes from the searchpattern plugin.
** @param $type	string type of the request from searchpattern plugin (wholeoutput, intable:whole, intable:prefix, intable:match, intable:count, intable:suffix)
**             	wholeoutput     = all output is done by THIS plugin (no output will be done by searchpattern)
**             	intable:whole   = the left side of table (page name) is done by searchpattern, the right side of the table will be done by THIS plugin
**             	intable:prefix  = on the right side of table - THIS plugin will output a prefix header and searchpattern will continue its default output
**             	intable:match   = if regex, right side of table - THIS plugin will format the current output value ($value) and output it instead of searchpattern
**             	intable:count   = if normal, right side of table - THIS plugin will format the current output value ($value) and output it instead of searchpattern
**             	intable:suffix  = on the right side of table - THIS plugin will output a suffix footer and searchpattern will continue its default output
** @param $renderer	object current rendering object (use $renderer->doc .= 'text' to output text)
** @param $data	array whole data, multidimensional array( array( $page => $countOfMatches ), ... )
** @param $matches	array whole regex matches, multidimensional array( array( 0 => '1st Match', 1 => '2nd Match', ... ), ... )
** @param $params	array the parameters set by searchpattern (see searchpattern documentation)
** @param $page	string id of current page
** @param $value	string value which should be output by searchpattern
** @return bool true if THIS method is responsible for the output (using $renderer->doc) OR false if searchpattern should output its default
*/
function _searchpatternHandler( $type, &$renderer, $data, $matches, $params=array(), $page=null, $value=null ) {
	// ... plugin-specific code goes here ...
	// output something using $renderer->doc
	$type = strtolower( $type );
	switch( $type ) {
		case 'wholeoutput':
			// matches should hold an array with all <todo>matches</todo> or <todo #>matches</todo>
			if( !is_array($matches) ) {
				return false;
			}
			$renderer->doc .= 'matches='.print_r( $matches, true );
			// true means that this handler method does the output (searchpattern plugin has nothing to do)
			return true;
		case 'intable:whole':
			break;
		case 'intable:prefix':
			//$renderer->doc .= '<b>Start on Page '.$page.'</b>';
			break;
		case 'intable:match':
			//$renderer->doc .= 'regex match on page '.$page.': <pre>'.$value.'</pre>';
			break;
		case 'intable:count':
			//$renderer->doc .= 'normal count on page '.$page.': <pre>'.$value.'</pre>';
			break;
		case 'intable:suffix':
			//$renderer->doc .= '<b>End on Page '.$page.'</b>';
			break;
		default:
			break;
	}
	// false means that this handler method does not output anything; everything is done by the searchpattern plugin
	return false;
}


For an example, see the file syntax.php of the Todo Plugin, which has implemented the handler output method since version 2013-04-11.

Here is a simple example which outputs all todos from all pages using the todo plugin (option _todo):

      ~~SEARCHPATTERN#'/<todo[^>]*>.*?<\/todo[\W]*?>/'?? _todo ??~~

Options

Plugin behavior can be modified by passing it options. Options are given between two pairs of question marks (??) placed at the end of the pattern, just after the last single quote.

General use :

~~SEARCHPATTERN(:|;|#)'my_search'??option1 option2 option3??~~

Restricting search area

With the appropriate options, it is possible to limit the search area in two different ways :

  • Restrict the search area by giving the list of the pages/namespaces where the search shall be performed (called restrictive). This is obtained by adding as option a '+' sign followed by the namespace/page to be added to the restriction list.
  • Restrict the search area by giving a list of pages/namespaces that shall be excluded from the search (called exclusive). This is obtained by adding as option a '-' sign followed by the namespace/page to be added to the exclusion list.

Obviously, it is not possible to use both ways at the same time. If you try to do so, the restrictive way will be the only one to be considered, and the exclusive parameters will be ignored.

Option | Plugin version | Plugin behavior | Comment
+page | 0.2+ | Search will be restricted to page “page” |
+namespace: | 0.2+ | Search will be restricted to namespace “namespace” | Don't forget the ':' char at the end of the name
+ | 0.2+ | Search will be restricted to current page |
+: | 0.2+ | Search will be limited to current namespace |
-page | 0.2+ | Page “page” will be excluded from the search |
-namespace: | 0.2+ | Namespace “namespace” will be excluded from the search | Don't forget the ':' char at the end of the name
- | 0.2+ | Current page will be excluded from the search |
-: | 0.2+ | Current namespace will be excluded from the search |

Example of use :

~~SEARCHPATTERN(:|;|#)'my_search'?? +namespace: +page ??~~ (Search will be limited to namespace "namespace" and to page "page")
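
The precedence rule described above (restrictive options win, exclusive ones are then ignored) can be sketched in PHP. This is a sketch of the documented rule, not the plugin's code: the function names are hypothetical, the bare forms '+', '+:', '-' and '-:' are left out for brevity, and matching a namespace by ID prefix (which also covers sub-namespaces) is an assumption.

```php
<?php
// Hypothetical helper: true if $pageId equals a listed page or lies in a listed namespace.
function matches_any($pageId, array $targets) {
    foreach ($targets as $t) {
        if (substr($t, -1) === ':') {
            // Namespace entry: the page ID starts with "namespace:" (assumption).
            if (strncmp($pageId, $t, strlen($t)) === 0) {
                return true;
            }
        } elseif ($pageId === $t) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: decides whether a page belongs to the search area.
function in_search_area($pageId, array $options) {
    $include = array();
    $exclude = array();
    foreach ($options as $opt) {
        if ($opt === '') {
            continue;
        }
        $target = (string) substr($opt, 1);
        if ($target === '' || $target === ':') {
            continue; // bare forms (current page/namespace) are not handled in this sketch
        }
        if ($opt[0] === '+') {
            $include[] = $target;
        } elseif ($opt[0] === '-') {
            $exclude[] = $target;
        }
    }
    // If any '+' option is present, the '-' options are ignored.
    if (count($include) > 0) {
        return matches_any($pageId, $include);
    }
    return !matches_any($pageId, $exclude);
}

var_dump(in_search_area('doc1:intro', array('+doc1:')));                // bool(true)
var_dump(in_search_area('doc2:intro', array('+doc1:')));                // bool(false)
var_dump(in_search_area('doc1:intro', array('+doc1:', '-doc1:intro'))); // bool(true)
```
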

Administration

Through Admin Menu

The SearchPattern plugin adds an entry to the “admin” menu that lets you configure its behavior in special cases.
Currently there are 6 cases in which the plugin behavior can be changed, using the settings described below:

  1. When single quote hasn't been doubled inside a pattern (both standard and regex search)
    1. The request can be simply ignored and then the original text is displayed
    2. The request can be caught and an error message is displayed (request is not processed and no result is displayed)
    3. The request can be caught and a warning message is displayed (request is processed anyway and results are displayed)
    4. The request can be caught and processed normally (fault is ignored)
  2. When a regex is not valid (regex search only)
    1. The request can be simply ignored and then the original text is displayed
    2. The request can be caught and an error message is displayed (request is not processed and no result is displayed)
  3. When options are used
    1. Used options are displayed in the result table
    2. Used options are not displayed in the result table
  4. When a bad (unknown) option is used
    1. The request can be simply ignored and then the original text is displayed
    2. The request can be caught and an error message is displayed (request is not processed and no result is displayed)
    3. The request can be caught and a warning message is displayed (request is processed anyway and results are displayed)
    4. The request can be caught and processed normally (fault is ignored)
  5. When an option shall be ignored (mainly because another option is incompatible)
    1. The request can be caught and a warning message is displayed (request is processed anyway and results are displayed)
    2. The request can be caught and processed normally (fault is ignored)
  6. Configure if the headline with the search options should be displayed or not
    1. The search options (e.g. regex code) is displayed as table headline
    2. The table headline is hidden (no search options/query/regex code is displayed)

Through Configuration File

In the file conf/default.php of the plugin installation folder, the settings are made with 6 variables:

$conf['ndqerr'] represents case 1 of previous paragraph.
'nocatch', 'error', 'warning', 'nowarn' respectively represent behavior 1.I, 1.II, 1.III and 1.IV of previous paragraph.

$conf['regerr'] represents case 2 of previous paragraph.
'nocatch' and 'error' respectively represent behavior 2.I and 2.II of previous paragraph.

$conf['option'] represents case 3 of previous paragraph.
'disp' and 'nodisp' respectively represent behavior 3.I and 3.II of previous paragraph.

$conf['badopt'] represents case 4 of previous paragraph.
'nocatch', 'error', 'warning', 'nowarn' respectively represent behavior 4.I, 4.II, 4.III and 4.IV of previous paragraph.

$conf['ignopt'] represents case 5 of previous paragraph.
'warning' and 'nowarn' respectively represent behavior 5.I and 5.II of previous paragraph.

$conf['dispheadl'] represents case 6 of previous paragraph.
'disp' and 'nodisp' respectively represent behavior 6.I and 6.II of previous paragraph.
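
Put together, a configuration file using these variables might look like the fragment below. The values are illustrative choices from the lists above, not necessarily the defaults shipped with the plugin.

```php
<?php
// conf/default.php of the plugin folder (illustrative values, assumed for this example)
$conf['ndqerr']    = 'warning'; // case 1: undoubled single quote, warn but process the request
$conf['regerr']    = 'error';   // case 2: invalid regex, show an error message
$conf['option']    = 'disp';    // case 3: display used options in the result table
$conf['badopt']    = 'warning'; // case 4: unknown option, warn but process the request
$conf['ignopt']    = 'nowarn';  // case 5: ignored option, process without a warning
$conf['dispheadl'] = 'disp';    // case 6: show the headline with the search options
```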

Other Things

SearchPattern plugin CSS style is fully customizable through the style.css file, but be aware that standard DokuWiki table style is applied first (for simple integration).

Translations

The SearchPattern plugin is currently translated into English, French and German. Feel free to post other languages here.

Improvement Ideas / Todo

  1. Search restriction by namespaces/pages exclusion
  2. Search restriction by namespaces/pages limitation
  3. Macro way to exclude calling page and namespace (is it really useful)?
  4. Some result caching system
  5. Contextual output (what are user expectations?): implemented using the $ (dollar) option syntax by leibler, v2013-04-15
  6. Implement custom handler method mechanism to use other plugins for output: by leibler, v2013-04-11
  7. Integration with Todo plugin: by leibler, v2013-04-11
  8. Documentation: Hi leibler, you mention “A new option allows you to hide the table headline with search options (e.g. regex code) in result table”. Please state the name of the option, I'd like to use it in a general regex (no '_' plugin). -tw_bert-
     Reply: please use the admin menu configuration setting “dispheadl”. A specific handler can overrule this setting because it has full control over the output. I modified the description to clarify. (leibler, v2014-02-02)

Discussions

You can post bug reports, your own patches, feature requests and any other information or comments related to this plugin here.

That's a nice plugin. I use it to simulate Structured data plugin, but instead of requiring to have one data definition by page, it allows me to tag multiple sections of a page with a keyword (QUESTION, TODO…) and easily present them in a table.

Namespace support

Is it possible to restrict the search to some namespaces only? For instance, I'm working on two different documentations, doc1 and doc2, where I have left some TODO in the pages. I want to be able to see the TODOs by documentation, not on the whole wiki.

Actually it is not currently feasible. I thought to that but it made the plugin “handling” pattern hard to detect and format. But maybe in a future release. ;-)


Detailed output

Instead of having a count of occurrences by page, it would be nice to be able to display the text of occurrences with context, in the result table.

If I take the TODO example above, I would like to know for each occurrence in a page what I need to do, so displaying 50 characters before and after the search term would allow me to jump to the page with the most critical TODO.


Same thing as above. :-) I also think it should be something useful, but only if it is a “live” option (you can en/dis-able it each time you use the plugin). So the problem is the same as for namespace exclusion. Find a simple way to declare it in the plugin calling expression. I'll try to implement it in the next release also.

Notice that if you or somebody else implements it or other functionality, I'll be happy to integrate it in the plugin. ;-)
Implemented in v2013-04-11: the new $ (dollar) option outputs all or only the listed regex matches. ?? $ ?? will output the whole match(es), ?? $1,4 ?? will output only the 1st and 4th match. (leibler, v2013-04-11)

Exclude page with search

It would be useful if the page that is holding the search is excluded from the search.

Actually, it is. ;-)
Not completely to be fully correct, but the text that triggers the search is excluded (in its plain text format). ;-)
You can have strange behavior in “preview” mode (when the page isn't yet saved). But once saved everything should be ok.

Anyway thanks for your feedback. And if your problem is to really ignore the whole page, then it will maybe come with the “namespace restriction” feature in a potential future release. ;-)

MR.
:?: How is this different than pagequery?
First thing is that SearchPattern is older, so “pagequery” wasn't existing when I start programming. Then I wrote this plugin because I need it, and just want to share it. :-)
I have a quick look at “pagequery” plugin and below are the main differences I can see :
* SearchPattern is done to perform the basic task of finding a pattern inside a wiki page text, not to sort wiki pages in any way.
* SearchPattern seems to me simpler to use (but it's my opinion).
* SearchPattern can search any (really any) pattern. For example, in “pagequery”, searching the string ”@root: *” will result in a strange result.
* Finally, I think the most difference is the concept : “pagequery” is done to do everything (making it complex to use and more probably bugged) when SearchPattern is done to perform very well a basic task.
That's a quick summary of my opinion after spending some minutes on the “pagequery” web page. Maybe I'm wrong. ;-)
Anyway, I think there are enough differences so that you can easily choose between the 2, according to your needs.
Thanks for that question, hope my answer satisfy you.

MR.
Well, its good to hear that your plugin has less bugs! ;-) I think that you can do a full regular expression search in pagequery by using the “fullregex” option, which is similar to what your plugin does. Just my 2c…
I can't be sure my plugin has less bug. But it is less complex and also less “all-in-one”. And I carefully test everything before releasing. So I do my best to deliver a bug-free software. MR

leibler, 2013/04/15 19:58

Hi - the plugin is very useful for me.
I'm experiencing some issues with the exclusion option - page_exclude. If I use just '-' as requested I get an error on line 159 in Syntax.php. If I specify the full page path, then it works fine. I think I'm using the options correctly. Thanks anton

Could you post your exact call line to searchpattern as well as error message (either the one displayed in the browser or the one displayed on PHP log) ? - MR -
Hi Anton, I fixed this bug for former release branch in version 0.2b. I'll tell Leo about the bug and he will probably do the same for newer version. Bye. -MR-

Hi How to exclude some result in a regex request ? for exemple I want with the todo pluging to have all todo exept assigned todos ? Thanks for this usefull pluging :-D Doumé

plugin/searchpattern.txt · Last modified: 2014/02/02 17:26 by leibler