solr Plugin

Compatible with DokuWiki: Anteater, Rincewind, Angua

plugin Index and search DokuWiki pages with an external Solr server

Last updated on: 2012-01-30
Provides: Helper, Action
Repository: Source

This extension has not been updated in over 2 years. It may no longer be maintained or supported and may have compatibility issues.

Tagged with index, search

Installation

Installation has three stages: basic installation, Solr configuration, and integration with your DokuWiki template.

For basic installation, search for the plugin in the Extension Manager and install it from there. Refer to Plugins for instructions on installing plugins manually.

For Solr configuration, install and configure a Solr server by following the instructions on the Solr Wiki (http://wiki.apache.org/solr/FrontPage). For testing purposes you can use the example environment in the `example` folder of the Solr distribution. On the command line, change to the example folder and type:

  java -jar start.jar

The `schema.xml` file that comes with this plugin can be used as your starting point for creating a Solr search schema. Consider adding stop words to the stop word list. Also consider using language-specific stemmers and domain-specific synonyms.

If your Solr server is not located at the URL http://localhost:8983/solr, open the DokuWiki configuration page and set the Solr URL in the input field for the Solr plugin.
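
DokuWiki normally stores plugin settings in `conf/local.php`, so the same value can also be set there by hand. The following is only a sketch: the key name `url` is assumed from the `plugin»solr»url` label that the configuration manager displays, and the host name is a placeholder.

```php
<?php
// conf/local.php (sketch; the key name 'url' is an assumption based on the
// "plugin»solr»url" label, and solr.example.org is a placeholder host)
$conf['plugin']['solr']['url'] = 'http://solr.example.org:8983/solr';
```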

The last step of the installation is the integration with your DokuWiki template, i.e. replacing the standard wiki search field with the Solr search field. Open the `main.php` file of your template and look for the code:

   <?php tpl_searchform(); ?>

Replace it with the following code:

  <?php
    // Load the helper component of the Solr plugin
    $solr = plugin_load("helper", "solr");
    $solr->tpl_searchform(true, false); // Search field with ajax and no autocomplete
  ?>

This creates a search form that searches the content of your pages. Search terms that occur in the first headline (the page title) or the first paragraph (the page abstract) give the result document a higher ranking. A wildcard is automatically added at the end of each search term.
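
To illustrate the automatic wildcards, here is a minimal sketch of how a trailing `*` could be appended to every search term. The function `addTrailingWildcards` is hypothetical and not taken from the plugin source; the plugin's real query building may differ.

```php
<?php
// Hypothetical helper, not part of the plugin: append a trailing wildcard
// to every whitespace-separated search term.
function addTrailingWildcards(string $query): string
{
    // Split on runs of whitespace and drop empty pieces.
    $terms = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);

    $wild = array_map(function ($term) {
        // Leave terms alone that already end in a wildcard.
        return substr($term, -1) === '*' ? $term : $term . '*';
    }, $terms);

    return implode(' ', $wild);
}

echo addTrailingWildcards('solr  index'), "\n"; // prints "solr* index*"
```

With trailing wildcards, a search for `index` also matches words such as `indexing` or `indexed`.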

The plugin provides a form for advanced search where users can search for exact phrases, exclude words, search inside page titles, abstracts and namespaces and search for specific authors. To create a button to the advanced search form, place the following code at the appropriate place in your `main.php`:

  <?php
    // Load the helper component of the Solr plugin
    $solr = plugin_load("helper", "solr");
    $solr->htmlAdvancedSearchBtn();
  ?>

Indexing your wiki

Indexing all pages

You can run the command line script `index_all.php` that comes with this plugin to index all of your wiki pages. How long the script runs depends greatly on your server speed, so make sure there is no execution time limit for PHP scripts started on the command line.

Indexing individual pages

Each page is also indexed when it is visited by a user. See the next section on how the indexing mechanism works.

The indexing mechanism

Once installed, the plugin indexes pages through the standard DokuWiki indexing mechanism: each page contains an invisible graphic that calls the file `lib/exe/taskrunner.php`. `taskrunner.php` issues an event, which the Solr plugin handles if the page was modified since it was last indexed. After the plugin has indexed a page, it creates a file with the suffix `.solr_indexed` in the page's meta directory. If the modification date of this file is later than the page's modification date, the plugin does nothing and the other indexing actions specified in `taskrunner.php` are carried out.
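
The freshness check described above can be sketched as a comparison of two file modification times. The function name `needsSolrIndexing` and its parameters are assumptions for illustration, not the plugin's own API. Inside DokuWiki the two paths would presumably come from `wikiFN()` and `metaFN()`.

```php
<?php
// Sketch of the check described above; names are illustrative only.
function needsSolrIndexing(string $pageFile, string $markerFile): bool
{
    // PHP caches stat results, so clear the cache before comparing times.
    clearstatcache();

    // No marker file yet: the page has never been indexed.
    if (!file_exists($markerFile)) {
        return true;
    }

    // Skip the page only if the marker is newer than the page itself.
    return filemtime($pageFile) >= filemtime($markerFile);
}
```

Deleting a page's `.solr_indexed` file would therefore cause that page to be indexed again the next time the check runs.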

ToDo/Wish List

  • Searching by date: At the moment the creation and modification dates are indexed but there is no way to search by them. This is because DokuWiki has no API for creating localized date entry fields and I don't want to create such a field myself.
  • Raw Search: A separate search field that doesn't add wildcards and where you can use every Lucene search operator available (fuzzy and proximity search, OR operator, etc.).
  • Searching PDF, DOC etc would be great.
    • According to the SOLR website, this already is one of the features of SOLR: “Rich Document Parsing and Indexing (PDF, Word, HTML, etc) using Apache Tika”. Is it just a question of telling the indexer where the mediafiles are located? Maybe a look at docsearch plugin could be of help. Docsearch works but does not return previews of the search results (just the link to the file) and no backlinks to the page where the file is used.
    • It would be fantastic if DokuWiki could index media files for the search functions, as this would greatly enhance its value as a tool for knowledge management.

FAQ

Q: Does this plugin use the PECL solr extension?
A: No, this plugin has no dependency on the PECL extension. Instead, it communicates via HTTP requests with the Solr server.
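
As a rough illustration of what plain HTTP communication with Solr looks like, the sketch below builds a query URL for Solr's standard `/select` handler. The function `buildSolrSelectUrl` is hypothetical and does not show the plugin's actual request code.

```php
<?php
// Sketch only: build the URL for a query against Solr's /select handler.
// The function name and the example query are illustrative.
function buildSolrSelectUrl(string $baseUrl, string $query, int $rows = 10): string
{
    $params = array(
        'q'    => $query, // the Lucene query string
        'rows' => $rows,  // maximum number of results to return
        'wt'   => 'json', // ask Solr for a JSON response
    );

    // Pass '&' explicitly so the result does not depend on php.ini settings.
    return rtrim($baseUrl, '/') . '/select?' . http_build_query($params, '', '&');
}

$url = buildSolrSelectUrl('http://localhost:8983/solr', 'title:wiki*');
echo $url, "\n";
// $json = file_get_contents($url); // would send the actual HTTP request
```

Because only standard PHP functions are involved, no additional PHP extension has to be installed on the wiki server.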

Q: I am using Solr 6.6.0. Running index_all.php gives me no indexed files:

sudo -u apache php /opt/dokuwiki/lib/plugins/solr/index_all.php
PHP Warning:  Declaration of action_plugin_solr::register(&$controller) should be compatible with DokuWiki_Action_Plugin::register(Doku_Event_Handler $controller) in /opt/dokuwiki/lib/plugins/solr/action.php on line 0
PHP Warning:  require_once(/opt/dokuwiki/inc/cliopts.php): failed to open stream: No such file or directory in /opt/dokuwiki/lib/plugins/solr/index_all.php on line 30
PHP Fatal error:  require_once(): Failed opening required '/opt/dokuwiki/inc/cliopts.php' (include_path='.:/usr/share/pear:/usr/share/php') in /opt/dokuwiki/lib/plugins/solr/index_all.php on line 30
  • Plugin Configuration

plugin»solr»url = http://127.0.0.1:8983/solr/dokuwiki

The Solr Web-Interface shows:

Last Modified:    -
Num Docs:    0
Max Doc:    0
Heap Memory Usage:    0
Deleted Docs:    0
Version:    2
Segment Count:    0

Is some configuration missing on the plugin side?

You see the warnings and the error text. a) you should change the code to action_plugin_solr::register($controller) (remove the ampersand) - however, the code should work as is; b) please check if the file /opt/dokuwiki/inc/cliopts.php exists (and is readable for the user the web server is running as). The error results from this missing file. And because of the error, nothing is indexed. — Werner Flamme 2017-09-07 20:45

I found a /opt/dokuwiki/inc/cliopts.php file. The script now finds my pages but is not able to import them:

Imported 7 pages in 0.065 seconds

The following pages encountered an error while importing:

playground:playground
wiki:dokuwiki
wiki:start
wiki:syntax
wiki:welcome
start
start_test

Is it possible to switch to debug mode and get some more output?

plugin/solr.txt · Last modified: 2022-10-20 18:05 by Klap-in

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International