offline-dokuwiki.sh

offline-dokuwiki.sh is a simple script that exports the content of your DokuWiki documentation into documentation you can browse offline.

It uses wget to retrieve the documents recursively, and some sed magic to make them browsable offline.

Here are the pros and cons of using this script:

The pros:

  • works locally even with the nicer URL rewrite (ns1:subns2:page)
  • runs on the client side, so ACLs are honored
  • works out of the box, even for big DokuWikis

The cons:

  • uses the external plugin indexmenu to generate a full index (not mandatory if you already have a full index somewhere)
  • the password is given on the command line, which might be insecure on a multi-user system (but the script is easy to fix if that's a problem)
  • the export may be incomplete if no page holds a full index of the wiki and you (as a user) have no way to create one
  • no way to export only part of the wiki (there are other tools to do just that)

Erf, more cons than pros; I'm starting to wonder whether this page will really be useful to anyone but me… Oh, and one more pro:

  • uses GNU sed (for the -i option) (what? that's not a pro? c'mon!)
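
Regarding the password-on-the-command-line con above, here is a minimal sketch of one possible fix: prompt for the password instead of passing it with --passwd. The function name read_password is hypothetical and not part of the script. Note that this only keeps the password out of the script's own arguments and out of your shell history; wget would still receive it through --http-password.

```shell
#!/bin/sh
# Hypothetical helper (not part of offline-dokuwiki.sh):
# read one line from stdin and print it on stdout,
# hiding the typed characters when stdin is a terminal.
read_password() {
   if [ -t 0 ]; then
      # interactive use: prompt on stderr and switch terminal echo off
      printf 'Password: ' >&2
      stty -echo
      IFS= read -r pw
      stty echo
      printf '\n' >&2
   else
      # non-interactive use (pipe or file): just read the line
      IFS= read -r pw
   fi
   printf '%s' "$pw"
}

# Possible use inside the script, in place of the --passwd value:
# PASSWORD=$(read_password)
```

With something like this in place, the script could fall back to the prompt whenever --login is given without --passwd.
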
Updated:
20110221
   * add the option --depth to change the maximum recursion level for wget (which now defaults to 2 in the script)
   * reformat the help message


20110215

   * add the switch '--ms-filenames' so generated filenames are Windows-compatible
   * add the possibility to specify extra wget options via the environment variable ''AWO'', e.g.:
      AWO="--proxy-user=USER --proxy-password=PASSWORD" offline-dokuwiki.sh --login samlt --passwd XXXXXX --hostname mydoku.wiki.lan
   * add the switch '--https' to use HTTPS instead of HTTP

installation

script

Download the following script and make it executable:

offline-dokuwiki.sh
#!/bin/sh
# author: samlt
# 20110221
 
# default values
DEF_HOSTNAME=mydoku.wiki.lan
#DEF_LOCATION=path/to/start
DEF_LOCATION=fullindex
USERNAME=
PASSWORD=
PROTO=http
DEF_DEPTH=2
ADDITIONNAL_WGET_OPTS=${AWO}
PROGNAME=${0##*/}
 
show_help() {
   cat<<EOT
 
NAME
   $PROGNAME: make an offline export of a dokuwiki documentation
 
SYNOPSIS
   $PROGNAME options
 
OPTIONS
   --login      username
   --passwd     password
   --ms-filenames
   --https
   --depth      number
   --hostname   doku.host.tld
   --location   path/to/start
 
NOTES
   if not specified on the command line
      * username and password are empty
      * hostname defaults to '$DEF_HOSTNAME'
      * location defaults to '$DEF_LOCATION'
      * depth defaults to '$DEF_DEPTH'
 
EOT
}
 
while [ $# -gt 0 ]; do
   case "$1" in
      --login)
         shift
         USERNAME=$1
         ;;
      --passwd)
         shift
         PASSWORD=$1
         ;;
      --hostname)
         shift
         HOSTNAME=$1
         ;;
      --depth)
         shift
         DEPTH=$1
         ;;
      --location)
         shift
         LOCATION=$1
         ;;
      --https)
         PROTO=https
         ;;
      --ms-filenames)
         ADDITIONNAL_WGET_OPTS="$ADDITIONNAL_WGET_OPTS --restrict-file-names=windows"
         ;;
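      --help)
         show_help
         exit
         ;;
      *)
         # report unknown options instead of silently ignoring them
         echo "$PROGNAME: unknown option '$1'" >&2
         exit 1
         ;;
   esac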
   shift
done
 
: ${DEPTH:=$DEF_DEPTH}
: ${HOSTNAME:=$DEF_HOSTNAME}
: ${LOCATION:=$DEF_LOCATION}
 
PREFIX="$(date +'%Y%m%d')-$HOSTNAME"
 
echo "[WGET] downloading: start: $PROTO://$HOSTNAME/$LOCATION (login/passwd=${USERNAME:-empty}/${PASSWORD:-empty})"
wget  --no-verbose \
      --recursive \
      --level="$DEPTH" \
      --execute robots=off \
      --no-parent \
      --page-requisites \
      --convert-links \
      --http-user="$USERNAME" \
      --http-password="$PASSWORD" \
      --auth-no-challenge \
      --adjust-extension \
      --exclude-directories=_detail,_export \
      --reject="feed.php*,*do=backlink.html,*do=edit.html,*do=index.html,*indexer.php?id=*" \
      --directory-prefix="$PREFIX" \
      --no-host-directories \
      $ADDITIONNAL_WGET_OPTS \
      "$PROTO://$HOSTNAME/$LOCATION"
 
 
echo
echo "[SED] fixing links(href...) in the HTML sources"
sed -i -e 's#href="\([^:]\+:\)#href="./\1#g' \
       -e "s#\(indexmenu_\S\+\.config\.urlbase='\)[^']\+'#\1./'#" \
       -e "s#\(indexmenu_\S\+\.add('[^']\+\)#\1.html#" \
       -e "s#\(indexmenu_\S\+\.add([^,]\+,[^,]\+,[^,]\+,[^,]\+,'\)\([^']\+\)'#\1./\2.html'#" \
       "${PREFIX}"/*.html
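
To see what the first sed expression in the script does, here is a small sketch that applies it to sample lines instead of real files. The wrapper name fix_href and the sample links are made up for illustration. The likely reason for the rewrite is that a browser would read a relative link such as ns1:subns2:page.html as a URL whose scheme is ns1; prefixing it with ./ makes it an ordinary relative path. Like the script itself, this relies on GNU sed.

```shell
#!/bin/sh
# Hypothetical wrapper around the first sed expression of the script:
# reads HTML on stdin, writes the fixed HTML on stdout.
fix_href() {
   sed -e 's#href="\([^:]\+:\)#href="./\1#g'
}

# a link containing a namespace separator gets the ./ prefix
printf '%s\n' '<a href="ns1:subns2:page.html">page</a>' | fix_href

# a line without any colon is left untouched
printf '%s\n' '<a href="start.html">start</a>' | fix_href
```

Note that the pattern only looks for the first colon after href=", so an absolute link such as href="http://example.org/" would receive the ./ prefix as well.
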

fullindex (optional)

I personally used the indexmenu plugin to generate such a page (name: fullindex) at the root of the documentation.

Here is the code

{{indexmenu>. | notoc nojs}}

note: if you name this page differently, or use a different index page, don't forget to change the --location option below

usage

Quite simple. Start with the basics:

offline-dokuwiki.sh --help

then, you can continue with something like:

offline-dokuwiki.sh --login samlt --passwd XXXX --hostname mydoku.wiki.lan --location fullindex

This will save an offline version of your wiki in a directory named YYYYMMDD-hostname, where YYYYMMDD is today's date.

note: if you omit your login/password, the export will only contain the publicly accessible pages
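
If you want to find the export directory from another script, the name can be rebuilt the same way the script builds its PREFIX variable. This is only a sketch: export_dir is a hypothetical helper, and it only gives the right answer for an export made on the same day.

```shell
#!/bin/sh
# Hypothetical helper: print the directory name the script would use
# today for the hostname given as first argument (YYYYMMDD-hostname).
export_dir() {
   printf '%s-%s\n' "$(date +'%Y%m%d')" "$1"
}

# Example (commented out; assumes an export of mydoku.wiki.lan made today):
# cd "$(export_dir mydoku.wiki.lan)" && ls start.html
```
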

To start browsing the export:

cd theRightDir
firefoxOrWhateverYouReUsing start.html

And that's it.

comments/suggestions?

  • As a bonus, you can prepare an ISO image from the dump with the following command, where $PREFIX stands for the export directory (YYYYMMDD-hostname):
    genisoimage -o dokuwiki.iso -r -iso-level 4 $PREFIX

    (fine-tune the genisoimage options to suit your needs; the above works fine with long filenames on Windows/Linux machines)

  • Be careful: if this script crawls into the “media manager” page, it can delete media files that are not linked anywhere (it emulates a click on the trash icon)
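
One way to reduce the risk described in the last point might be to keep wget away from media manager URLs altogether. The sketch below is not part of the original script and rests on two assumptions you should check first: that your wget supports the --reject-regex option, and that media manager and delete links on your wiki contain mediamanager, do=media or delete= in their URL. MEDIA_REGEX and is_rejected are hypothetical names; the helper uses grep -E only to approximate how wget would match the pattern.

```shell
#!/bin/sh
# Assumption: these URL fragments identify media manager and delete links.
# Verify them against the links of your own wiki before relying on this.
MEDIA_REGEX='mediamanager|[?&]do=media|[?&]delete='

# Hypothetical helper to try the pattern locally:
# succeeds if the URL given as first argument matches MEDIA_REGEX.
is_rejected() {
   printf '%s\n' "$1" | grep -Eq "$MEDIA_REGEX"
}

# The pattern could then be handed to wget through the AWO variable:
# AWO="--reject-regex=$MEDIA_REGEX" offline-dokuwiki.sh --hostname mydoku.wiki.lan
```

Running the export with an account that has read-only permissions is another way to avoid accidental deletions.
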
tips/offline-dokuwiki.sh.txt · Last modified: 2011/09/07 15:23 by 87.162.84.91
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 3.0 Unported