
docimporter Plugin

Compatible with DokuWiki: Weatherwax

Import your Word documents (.doc and .docx) into DokuWiki

Last updated on: 2013-11-17
Provides: Action
Repository: Source

This extension has not been updated in over 2 years. It may no longer be maintained or supported and may have compatibility issues.

Tagged with docx, import, word

This plugin imports your Microsoft Word documents (.doc or .docx) into DokuWiki and preserves the following properties of the Word document:

  • Basic formatting: italic, bold and underlined text.
  • Bulleted and numbered lists with sublevels.
  • The table of contents is generated correctly if the Word document uses the proper heading styles.
  • Pictures are imported at their native size but displayed at the height and width specified in the Word document.
  • Tables are imported.
  • Footnotes are imported.

Installation

Install this plugin at your own risk: there is absolutely no guarantee that it will work correctly or that it will not make gremlins eat you alive.

Please note that this plugin has been developed on Linux (Ubuntu/Debian) and is not tested on MS Windows or any other OS. I have absolutely no plans to support any other OS than Linux.

External requirements: this plugin requires the following additional components, which must be installed separately (listed by their Ubuntu package names):

  • php-pear
  • libreoffice-writer
  • libreoffice-wiki-publisher

Search and install the plugin using the Extension Manager. Refer to Plugins on how to install plugins manually.

You will also need to create the file:

 /usr/bin/convert_to_mediawiki 

It must contain the following:

 #!/bin/bash
 # $1 is the path of the Word document; it is quoted so that paths containing spaces work
 soffice --nofirststartwizard --headless --convert-to html:"HTML" "$1"
 soffice --nofirststartwizard --headless --convert-to txt:"MediaWiki" "$1"

You must also make it executable:

 sudo chmod +x /usr/bin/convert_to_mediawiki
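
Before going further, it can help to confirm that the external tools are actually reachable. The following is a minimal sketch and not part of the plugin: `missing_commands` is a hypothetical helper, and it assumes that `soffice` is provided by the LibreOffice packages and `pear` by php-pear. libreoffice-wiki-publisher installs no command of its own, so this check does not cover it.

```shell
#!/bin/sh
# Print each of the given command names that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Command names are assumptions: soffice (LibreOffice), pear (php-pear),
# convert_to_mediawiki (the wrapper script created above).
missing=$(missing_commands soffice pear convert_to_mediawiki)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing"
else
    echo "All required commands were found."
fi
```

Run it as the user the web server runs as if possible, since that user's PATH is the one that matters to the plugin.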
 

Finally, the plugin uses the XML-RPC remote API, therefore you need to enable it and specify a username and a password. For more information on how to do that, please refer to https://www.dokuwiki.org/devel:xmlrpc.

Once the remote API is enabled for a given username and password, enter them in the plugin's options in the DokuWiki configuration settings.
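
To check that the remote API accepts the credentials, you can send a request by hand. This is only a sketch under several assumptions: `xmlrpc_payload` and `xmlrpc_call` are hypothetical helpers, curl must be installed, the wiki is assumed to accept HTTP basic authentication for XML-RPC, and `dokuwiki.getVersion` is used as a simple method that takes no parameters. Opening xmlrpc.php in a browser sends a GET request, so the message "XML-RPC server accepts POST requests only." is typically expected there and is not in itself a sign of misconfiguration.

```shell
#!/bin/sh
# Build an XML-RPC request body for a method that takes no parameters.
# $1: method name
xmlrpc_payload() {
    printf '<?xml version="1.0"?>\n<methodCall><methodName>%s</methodName><params></params></methodCall>\n' "$1"
}

# POST the request to the XML-RPC endpoint using HTTP basic authentication.
# $1: endpoint URL, $2: user name, $3: password, $4: method name
xmlrpc_call() {
    xmlrpc_payload "$4" | curl --silent --show-error --user "$2:$3" \
        --header 'Content-Type: text/xml' --data-binary @- "$1"
}

# Show the request body that would be sent (no network access needed).
xmlrpc_payload dokuwiki.getVersion
```

A real call would look like `xmlrpc_call http://localhost/dokuwiki/lib/exe/xmlrpc.php USER PASSWORD dokuwiki.getVersion`, with the URL adjusted to your installation. An XML response containing a version string suggests the API and the credentials work.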

Now you can add a button and a form to your template to be able to upload new documents. Here is an example for the “dokuwiki” template.

Edit the file

 lib/tpl/dokuwiki/tpl_header.php
 

And add the following lines between the “<ul>” and “</ul>” tags just below the comment <!-- SITE TOOLS -->:

 <?php if ( auth_quickaclcheck( $ID ) >= AUTH_EDIT ): ?>
                <li><input type="button" class="btn" value="Import a new document" onclick="var el=document.getElementById('import_form');el.style.display=(el.style.display!='none'?'none':'' );" /></li>
 <?php endif ?>

And add the following lines just before the comment <!-- BREADCRUMBS --> :

 <?php if ( auth_quickaclcheck( $ID ) >= AUTH_EDIT ): ?>
    <div style="display:none;margin-top:50px;padding-top:20px;padding-bottom:20px;" id="import_form">
      <form method="POST" action="" enctype="multipart/form-data"><br>
         <label>Wiki page title : </label> <input type="text" name="title"><br>
         <label>Wiki name space : </label> <input type="text" name="ns"><br>
         <label>Word document : </label> <input type="file" name="doc"><br>
         <input type="hidden" name="id" value="<?php echo hsc($ID); ?>"/>
         <input type="hidden" name="do" value="doc2dw"/>
         <input type="submit" class="btn " value="Import to the wiki">
      </form>
    </div>
 <?php endif ?>
 

Usage

Users who can edit or create documents will see the button “Import a new document”. Clicking it shows a form with the following fields:

  • Wiki page title (mandatory): the title of the page once imported.
  • Wiki name space (mandatory): the namespace, e.g. “wiki”.
  • Word document (mandatory): the file you want to import.

Then click on “Import to the wiki” and wait to be redirected to the newly created page.

Change Log

This is an early release, so expect some bugs. Please check https://github.com/marginweb/dokuwiki-docimporter for current development.

ToDo/Wish List

  • Better error handling.
  • Give more feedback to the user in case of errors.

Support

This plugin is provided as is, and no personal help will be provided. Genuine bug reports are welcome, however; please post them on GitHub.

Commercial support can be provided on demand, please contact us at www.marginweb.com with your request.

Discussion

This plugin imports Microsoft Word documents, which in practice means Word on Windows. It would make more sense to develop and test the plugin in that environment, where it is actually needed, rather than for Linux users who do not have MS Word in the first place. I have tons of MS Word documents. What workflow do you advise for us Windows users? f l o r i n k o -gmail.

No Change to xmlrpc.php

I updated tpl_header.php and restarted Apache, but the page still only shows:

XML-RPC server accepts POST requests only.

no page created

After importing a docx document, I am redirected to the link for the new page, but the page is empty and has not been created.

I installed docimporter with your above hints.

For my debian vserver the following worked:

added to /etc/apt/sources.list: (http://www.debian.org/News/2011/20110623.de.html)

deb http://backports.debian.org/debian-backports squeeze-backports main

installed libreoffice modules (https://wiki.debian.org/LibreOffice)

apt-get install -t squeeze-backports uno-libs3 # removes openoffice.org-* (don't know if necessary)
apt-get install -t squeeze-backports libreoffice-writer
apt-get install -t squeeze-backports libreoffice-wiki-publisher

I did all the other stuff (/usr/bin/convert_to_mediawiki, chmod etc.)

Any idea how to find out what is going wrong? Joe, 07.12.2013

Response to Joe - no page created

In my case I only needed to create a link at /var/www/dokuwiki, because I use /var/www/wiki, and then it worked. Bruno Emanuel. 11.02.2014

Another response!

In our case it was due to the server not being able to access the postback URL from the plugin itself. Under action.php you'll find the following:

$client = new IXR_Client('http://localhost/dokuwiki/lib/exe/xmlrpc.php');

If your web server is set up to not be able to access internal PHP files via localhost, this might break. (Let's leave aside the hard-coded wiki path for the moment: you can symlink that problem away!). What you can try is setting the domain explicitly:

$client = new IXR_Client('http://yoursite.com/dokuwiki/lib/exe/xmlrpc.php');

Or just letting PHP do the work for you:

$client = new IXR_Client('http://' . $_SERVER['SERVER_NAME'] . '/dokuwiki/lib/exe/xmlrpc.php');

Try that, and see if that works. Chris. 01.15.2015

Similar trouble

Similarly, the plugin fails to create a new page and I get the following error in my server's error log:

2016/08/01 15:22:08 [error] 2990#2990: *890 FastCGI sent in stderr: "PHP message: PHP Warning:  Declaration of action_plugin_docimporter::register($controller) should be compatible with DokuWiki_Action_Plugin::register(Doku_Event_Handler $controller) in /var/www/html/dokuwiki/lib/plugins/docimporter/action.php on line 0" while reading response header from upstream, client: 192.168.200.87, server: _, request: "GET /dokuwiki/doku.php?id=start&do=admin&page=config HTTP/1.1", upstream: "fastcgi://unix:/var/run/php/php7.0-fpm.soc:", host: "192.168.200.144", referrer: "http://192.168.200.144/dokuwiki/doku.php?id=start&do=admin&page=config"

Chris's suggestion did not help. Instead, it just caused my wiki to display completely blank pages. Reverting the change to the plugin's action.php fixed that problem, fortunately. Any other ideas? Does this plugin still work for anyone? Thanks. — l3lackEyedAngels 2016-08-01 23:38

No Images shown in Imported Page

When importing .docx Word files, I got a well-formatted page added to the wiki, but with {{wiki:}} in place of the images. After digging through the process, I found that the regex responsible for matching the images was case-sensitive (my HTML img tags were lower case). I made the regex case-insensitive and everything started working:

Line 95 in /var/www/dokuwiki/lib/plugins/docimporter/ImportUtils.php from:

preg_match_all("/<IMG SRC=...>/", $myHTMLContent, $image_tags);

to:

preg_match_all("/<IMG SRC=...>/i", $myHTMLContent, $image_tags);

I have a further issue in that the HTML output of the LibreOffice conversion is putting the images inline in base64 data URIs instead of external jpg/png files and Dokuwiki doesn't support them - but at least the text is usable and has been formatted properly. Charles, Aug 20, 2014
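
A quick way to tell whether a converted HTML file is affected by the inline-image problem is to look for img tags whose source is a data URI. The following is a minimal sketch: `count_inline_images` is a hypothetical helper, the sample markup is made up for illustration, and `grep -c` counts matching lines rather than individual tags. The `-i` option plays the same role here as the `/i` modifier in the PHP pattern above, so both upper-case and lower-case tags are found.

```shell
#!/bin/sh
# Count the lines of an HTML file that contain an <img> tag whose source
# is an inline data URI. The match is case-insensitive.
# $1: path to the HTML file
count_inline_images() {
    grep -ci '<img[^>]*src="data:image/' "$1" || true
}

# Made-up sample: one embedded image (upper-case tag), one external image.
sample=$(mktemp)
printf '%s\n' \
    '<P><IMG SRC="data:image/png;base64,iVBORw0KGgo=" WIDTH=10 HEIGHT=10></P>' \
    '<p><img src="picture.png" width="10" height="10"></p>' > "$sample"
count_inline_images "$sample"
rm -f "$sample"
```

A result greater than zero means at least some images were embedded in the HTML and will not be picked up as separate files.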

Confirmed: images do not import with this plugin. I have tried LibreOffice 4.2.7 (the default for Ubuntu 14.04) as well as the latest beta, 4.4.x. Neither seems to handle the image export correctly. This is a real shame :( I have banged my head against this plugin for too long.

If anyone knows of any way to actually import a basic document with a few images properly, I would really love to know of a good way. It is rather tedious to have to manually save each image, and then upload. Oh well. At this point, I might as well just link to the original document. This is, sadly, the best option I can think of without spending the time to manually “re-invent the wheel” here.

Also, the dev has stopped maintaining this plugin. Abandon all hope, ye who enter here. -somedude - Dec.18.2014

I have had the same problem as you since October. We need a plugin like docimporter, but for the moment we have no solution. Fire24 - 19/01/2015

The plugin works very well. Thank you for it. Just a few installation notes:

  • Make sure that the XML-RPC remote API is configured correctly.
  • Make sure that the XML-RPC user has enough rights to create documents in the wiki.
  • Make sure that Libre(Open)Office is version 4.1 or older. Since 4.2, images are embedded in the HTML.
  • You may need to edit ImportUtils.php:
diff ImportUtils.phpOLD ImportUtils.php
113c113
<         array_push($image_names, "{{".$left_align."image:".$image_tags[1][$i]."?".$image_tags[3][$i]."x".$image_tags[4][$i].$righ_align."}}");
---
>         array_push($image_names, "{{".$left_align.":".$image_tags[1][$i]."?".$image_tags[3][$i]."x".$image_tags[4][$i].$righ_align."}}");

maaca - 16.4.2015
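
The version limit mentioned in the notes above can be checked from the command line. This is a sketch under assumptions: `lo_embeds_images` is a hypothetical helper, the 4.2 threshold is taken from the note above, and `soffice --version` is assumed to print a line that starts with "LibreOffice" followed by the version number.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given version, e.g. "4.2.7", is 4.2 or newer.
# Expects at least a major and a minor number separated by a dot.
lo_embeds_images() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 2 ]; }
}

# Only query LibreOffice if it is installed; the output format is an assumption.
if command -v soffice >/dev/null 2>&1; then
    version=$(soffice --version | sed -n 's/^LibreOffice \([0-9][0-9.]*\).*/\1/p')
    case $version in
        '') echo "Could not determine the LibreOffice version" ;;
        *)
            if lo_embeds_images "$version"; then
                echo "LibreOffice $version: images will probably be embedded in the HTML"
            else
                echo "LibreOffice $version: external image files expected"
            fi
            ;;
    esac
fi
```

The function can also be called directly with a known version string, for example `lo_embeds_images 4.1.6`, and its exit status inspected.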

preg_match_all

Does not work here :(
Error with preg_match_all on line 298 of ImportUtils.php:
preg_match_all(): Unknown modifier 'y'
No clue …

Bert- 11.01.2016

plugin/docimporter.txt · Last modified: 2023-10-29 13:16 by Klap-in

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International