tips:moinmoin2doku

Differences

This shows you the differences between two versions of the page.

tips:moinmoin2doku [2011-02-06 18:09] – merge third python script, use wiki history to retrieve older versions glen
tips:moinmoin2doku [2023-09-01 18:31] (current) – [Discussion] glen
Line 1:
 ====== Migration from MoinMoin to DokuWiki ======
  
 +Code written in Python/PHP that can do a full conversion: all pages and their history, including the edit log. It also copies attachments.
  
-Below you will find scripts in PHP and Python to facilitate the conversion process. Before running them you must eliminate the leftmost ">" in <>/code> and <>code> in the Python convert_page functions, or remove the ">>"'s in the $replace Array(... in PHP scripts.
 +Check out the files from [[https://github.com/glensc/moin2doku]] and run:
 +  * ''./moin2doku.py -a -d **<DokuWiki installation dir>**''
  
-> FIXME Are there any parameters that need to be passed to the PHP script and how is that to be done?  According to the code there should be three parameters.  Passed through the URL?  Syntax?  Can anyone help?
 +Consult the README for details.
  
-Another document on switching appears at http://www.emilsit.net/blog/archives/migrating-from-moinmoin-to-dokuwiki/
 +**NB:** The project is no longer maintained (as I got my wiki converted), but it may still work for you (at least as a better starting point than I had). Feel free to send pull requests. --- [[user>glen|glen]] //2015-04-08 16:06//
 +===== Other and older versions =====
  
-===== PHP =====
 +You can dig up the older versions or variants from the [[https://www.dokuwiki.org/tips:moinmoin2doku?rev=1300625072|old page revision]] if really needed.
- +
-I have written a small PHP script to convert wiki pages from MoinMoin [[http://moinmoin.wikiwikiweb.de/]] to DokuWiki syntax. It does not take care of all differences, but it worked for me. +
- +
-<code php moin2doku.php> +
-#!/usr/bin/php +
-<?php +
- +
-//check command line parameters +
-if ($argc != 3 || in_array($argv[1], array('--help', '-help', '-h', '-?'))) { +
-  echo "\n  Converts all files from given directory\n"; +
-  echo "  from MoinMoin to DokuWiki syntax. NOT RECURSIV\n\n"; +
-  echo "  Usage:\n"; +
-  echo "  ".$argv[0]." <input dir> <output dir>\n\n"; +
-}  +
-else { +
-  //get input and output directories +
-  $inDir = realpath($argv[1]) or die("input dir error"); +
-  $outDir = realpath($argv[2]) or die("output dir error"); +
-  //just print information +
-  echo "\nInput Directory: ".$inDir."\n"; +
-  echo "Output Directory: ".$outDir."\n\n"; +
- +
-  //get all files from directory +
-  if (is_dir($inDir)) { +
-    $files = filesFromDir($inDir); +
-  } +
-   +
-  //migrate each file +
-  foreach ($files As $file) { +
-    //convert filename +
-    $ofile = convFileNames($file); +
-    //just print information +
-    echo "Migrating from ".$inDir."/".$file." to ".$outDir."/".$ofile."\n"; +
-     +
-    //read input file +
-    $text = readFl($inDir."/".$file); +
-     +
-    //convert content +
-    $text = moin2doku($text); +
-     +
-    //encode in utf8 +
-    $text = utf8_encode($text); +
-     +
-    //write output file +
-    writeFl($outDir."/".$ofile, $text); +
-  } +
-} +
- +
- +
-function moin2doku($text) { +
-  /* like convFileNames and more +
-  *   ToDo: [[Datestamp]] delete? +
-  *         bold and italic, what goes wrong? +
-  *         images +
-  *         Problems with newline and [[BR]] +
-  *         CamelCase in Heading: it will be converted +
-  *         Moin handles code sections without closing }}} right, DokuWiki does not      +
-  */ +
-   +
-  //line by line +
-  $lines = explode("\n", $text); +
-  foreach($lines As $line) { +
-    //start converting +
-    $find = Array(   +
-                  '/\[\[TableOfContents\]\]/',      //remove +
-                  '/\[\[BR\]\]$/',                  //newline at end of line - remove +
-                  '/\[\[BR\]\]/',                   //newline +
-                  '/#pragma section-numbers off/',  //remove +
-                  '/\["(.*)"\]/',                   //internal link +
-                  '/(\[http.*\])/',                 //web link +
-                  '/\{{3}/',                        //code open +
-                  '/\}{3}/',                        //code close +
-                  '/^\s\*/',                        //lists must have not only but 2 whitespaces before * +
-                  '/={5}(\s.*\s)={5}$/',            //heading 5 +
-                  '/={4}(\s.*\s)={4}$/',            //heading 4 +
-                  '/={3}(\s.*\s)={3}$/',            //heading 3 +
-                  '/={2}(\s.*\s)={2}$/',            //heading 2 +
-                  '/={1}(\s.*\s)={1}$/',            //heading 1 +
-                  '/\|{2}/',                        //table separator +
-                  '/\'{5}(.*)\'{5}/',               //bold and italic +
-                  '/\'{3}(.*)\'{3}/',               //bold +
-                  '/\'{2}(.*)\'{2}/',               //italic +
-                  '/(?<!\[)(\b[A-Z]+[a-z]+[A-Z][A-Za-z]*\b)/',  //CamelCase, dont change if CamelCase is in InternalLink +
-                  '/\[\[Date\(([\d]{4}-[\d]{2}-[\d]{2}T[\d]{2}:[\d]{2}:[\d]{2}Z)\)\]\]/'  //Date value  +
-                  ); +
-    $replace = Array( +
-                     '',                            //remove                                 +
-                     '',                            //newline remove                                 +
-                     '\\\\\ ',                      //newline +
-                     '',                            //remove                                 +
-                     '[[${1}]]',                    //internal link +
-                     '[${1}]',                      //web link +
-                     '<>>code>',                      //code open - remove >>, its included for viewing in DokuWiki +
-                     '<>>/code>',                     //code close - remove >>, its included for viewing in DokuWiki +
-                     '  *',                         //lists must have 2 whitespaces before * +
-                     '==${1}==',                      //heading 5                         +
-                     '===${1}===',                    //heading 4                         +
-                     '====${1}====',                  //heading 3                         +
-                     '=====${1}=====',                //heading 2                         +
-                     '======${1}======',              //heading 1                         +
-                     '|',                           //table separator                        +
-                     '**//${1}//**',                //bold and italic +
-                     '**${1}**',                    //bold                                   +
-                     '//${1}//',                    //italic +
-                     '[[${1}]]',                    //CamelCase +
-                     '${1}'                         //Date value +
-                     ); +
-    $line = preg_replace($find,$replace,$line); +
-     +
-    $ret = $ret.$line."\r\n"; +
-  } +
-  return $ret; +
-} +
- +
- +
-function convFileNames($name) { +
-  /* ö,ä,ü, ,. and more +
-  */ +
-  $find = Array('/_20/', +
-                '/_5f/', +
-                '/_2e/', +
-                '/_c4/', +
-                '/_f6/', +
-                '/_fc/', +
-                '/_26/', +
-                '/_2d/' +
-                ); +
-  $replace = Array('_', +
-                   '_', +
-                   '_', +
-                   'Ae', +
-                   'oe', +
-                   'ue', +
-                   '_', +
-                   '-' +
-                   ); +
-  $name = preg_replace($find,$replace,$name); +
-  $name = strtolower($name); +
-  return $name.".txt"; +
-} +
- +
- +
-function filesFromDir($dir) { +
-  $files = Array(); +
-  $handle=opendir($dir); +
-  while ($file = readdir ($handle)) { +
-     if ($file != "." && $file != ".." && !is_dir($dir."/".$file)) { +
-         array_push($files, $file); +
-     } +
-  } +
-  closedir($handle);  +
-  return $files; +
-} +
- +
-function readFl($file) { +
-  $fr = fopen($file,"r"); +
-  if ($fr) { +
-    while(!feof($fr)) { +
-      $text = $text.fgets($fr); +
-    } +
-    fclose($fr); +
-  } +
-  return $text; +
-} +
- +
-function writeFl($file, $text) { +
-  $fw = fopen($file, "w"); +
-  if ($fw) { +
-    fwrite($fw, $text); +
-  } +
-  fclose($fw); +
-} +
- +
-?> +
-</code> +
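
As an illustration of the heading rules in the PHP script above (not part of the original page): MoinMoin marks its top-level heading with one ''='' while DokuWiki uses six, so the marker count has to be inverted. The Python 3 sketch below assumes five heading levels, as the script's table does; ''convert_heading'' is a hypothetical name, not a function from any of the scripts.

```python
import re

# One or more '=' on each side, with the title in between.
_HEADING = re.compile(r"^\s*(=+)\s+(.*?)\s+(=+)\s*$")


def convert_heading(line):
    """Return the DokuWiki form of a MoinMoin heading line.

    Lines that are not well-formed headings (uneven markers, or a level
    outside 1..5) are returned unchanged.
    """
    match = _HEADING.match(line)
    if not match or match.group(1) != match.group(3):
        return line
    level = len(match.group(1))
    if not 1 <= level <= 5:
        return line
    # MoinMoin level 1 -> six '=' in DokuWiki, level 5 -> two '='.
    marks = "=" * (7 - level)
    return "%s %s %s" % (marks, match.group(2), marks)


if __name__ == "__main__":
    for sample in ("= Title =", "== Section ==", "not a heading"):
        print(convert_heading(sample))
```

Computing the marker count avoids the ordering problem that a list of one regular expression per level has, where a shorter pattern can match a line that a longer one has already rewritten.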
- +
-===== Python ===== +
- +
-Based on the above PHP version, this Python script automates the file renaming, copying and conversion. +
-It can move attachments, convert attachment markup, and create namespaces based on the structure of the MoinMoin wiki. +
-It also converts some codes for German umlauts. +
- +
-Remember to change '<>code>' and '<>/code>' to %%<code>%% and %%</code>%%. +
- +
-Save as ''moin2doku.py'' and run: +
-  * ./moin2doku.py <pages folder of MoinMoin-Wiki> <pages folder of DokuWiki> +
- +
-<code python moin2doku.py> +
-#!/usr/bin/python +
-# +
-# moin2doku.py +
-# +
-# A script for converting MoinMoin version 1.3+ wiki data to DokuWiki format. +
-# Call with the name of the directory containing the MoinMoin pages and that +
-# of the directory to receive the DokuWiki pages on the command line: +
-# +
-# python moin2doku.py ./moin/data/pages/ ./doku/ +
-# +
-# then move the doku pages to e.g. /var/www/MyWikiName/data/pages/, +
-# move the media files to e.g. /var/www/MyWikiName/data/media/, +
-# set ownership: chown -R www-data:www-data /var/www/MyWikiName/data/pages/ +
-# chown -R www-data:www-data /var/www/MyWikiName/data/media/ +
-# +
-# This script doesn't do all the work, and some of the work it does is +
-# wrong. For instance attachment links end up with the trailing "|}}" +
-# on the line following the link. This works, but doesn't look good. +
-# The script interprets a "/" in a pagename as a namespace delimiter and +
-# creates and fills namespace subdirectories accordingly. +
-# +
-# version 0.1  02.2010  Slim Gaillard, based on the "extended python" +
-#                       convert.py script here: +
-#                       http://www.dokuwiki.org/tips:moinmoin2doku +
-# +
-import sys, os, os.path, re, pdb +
-from os import listdir +
-from os.path import isdir, basename +
- +
-def check_dirs(moin_pages_dir, output_dir): +
-    if not isdir(moin_pages_dir): +
-        print >> sys.stderr, "MoinMoin pages directory doesn't exist!" +
-        sys.exit(1) +
- +
-    if not isdir(output_dir): +
-        print >> sys.stderr, "Output directory doesn't exist!" +
-        sys.exit(1) +
- +
-def get_path_names(moin_pages_dir): +
-    items = listdir(moin_pages_dir) +
-    pathnames = [] +
- +
-    for item in items: +
-        item = os.path.join(moin_pages_dir, item) +
-        if isdir(item): +
-            pathnames.append(item) +
- +
-    return pathnames +
- +
-def get_current_revision(page_dir): +
-    rev_dir = os.path.join(page_dir, 'revisions') +
-    if isdir(rev_dir): +
-        revisions = listdir(rev_dir) +
-        revisions.sort() +
-        return os.path.join(rev_dir, revisions[-1]) +
-    return '' +
- +
-def copy_attachments(page_dir, attachment_dir): +
-  dir = os.path.join(page_dir,'attachments') +
-  if isdir(dir): +
-    attachments = listdir(dir) +
-    #pdb.set_trace() +
-    for attachment in attachments: +
-      cmd_string = 'cp "' + dir +'/' + attachment + '" "' + attachment_dir + attachment.lower() + '"' +
-      os.system ( cmd_string ) +
- +
-def convert_page(page, file): +
-    namespace = ':' +
-    for i in range(0, len(file) - 1): +
-      namespace += file[i] + ':' +
- +
-    regexp = ( +
-        ('\[\[TableOfContents.*\]\]', ''),          # remove +
-        ('\[\[BR\]\]$', ''),                        # newline at end of line - remove +
-        ('\[\[BR\]\]', '\n'),                       # newline +
-        ('#pragma section-numbers off', ''),        # remove +
-        ('^##.*?\\n', ''),                          # remove +
-        ('\["', '[['),                              # internal link open +
-        ('"\]', ']]'),                              # internal link close +
-        #('\[:(.*):',  '[[\\1]] '),                 # original internal link expressions +
-        #('\[\[(.*)/(.*)\]\]',  '[[\\1:\\2]]'), +
-        #('(\[\[.*\]\]).*\]', '\\1'), +
-        ('\[(http.*) .*\]', '[[\\1]]'),             # web link +
-        ('\["/(.*)"\]', '[['+file[-1]+':\\1]]'), +
-        ('\{{3}', '<>code>'),                        # code open +
-        ('\}{3}', '<>/code>'),                       # code close +
-        ('^\s\s\s\s\*', '        *'), +
-        ('^\s\s\s\*', '      *'), +
-        ('^\s\s\*', '    *'), +
-        ('^\s\*', '  *'),                           # lists must have 2 whitespaces before the asterisk +
-        ('^\s\s\s\s1\.', '      -'), +
-        ('^\s\s1\.', '    -'), +
-        ('^\s1\.', '  -'), +
-        ('^\s*=====\s*(.*)\s*=====\s*$', '=-=- \\1 =-=-'),           # heading 5 +
-        ('^\s*====\s*(.*)\s*====\s*$', '=-=-=- \\1 =-=-=-'),         # heading 4 +
-        ('^\s*===\s*(.*)\s*===\s*$', '=-=-=-=- \\1 =-=-=-=-'),       # heading 3 +
-        ('^\s*==\s*(.*)\s*==\s*$', '=-=-=-=-=- \\1 =-=-=-=-=-'),     # heading 2 +
-        ('^\s*=\s*(.*)\s=\s*$', '=-=-=-=-=-=- \\1 =-=-=-=-=-=-'),    # heading 1 +
-        ('=-', '='), +
-        ('\|{2}', '|'),                             # table separator +
-        ('\'{5}(.*)\'{5}', '**//\\1//**'),          # bold and italic +
-        ('\'{3}(.*)\'{3}', '**\\1**'),              # bold +
-        ('\'{2}(.*)\'{2}', '//\\1//'),              # italic +
-        ('(?<!\[)(\b[A-Z]+[a-z]+[A-Z][A-Za-z]*\b)','[[\\1]]'),  # CamelCase, dont change if CamelCase is in InternalLink +
-        ('\[\[Date\(([\d]{4}-[\d]{2}-[\d]{2}T[\d]{2}:[\d]{2}:[\d]{2}Z)\)\]\]', '\\1'),  # Date value +
-        ('attachment:(.*)','{{'+namespace+'\\1|}}') +
-    ) +
- +
-    for i in range(len(page)): +
-        line = page[i] +
-        for item in regexp: +
-            line = re.sub(item[0], item[1], line) +
-        page[i] = line +
-    return page +
- +
-def print_help(): +
-    print "Usage: moin2doku.py <moinmoin pages directory> <output directory>" +
-    print "Convert MoinMoin pages to DokuWiki." +
-    sys.exit(0) +
- +
-def print_parameter_error(): +
-    print >> sys.stderr, 'Incorrect parameters! Use --help switch to learn more.' +
-    sys.exit(1) +
- +
-def fix_name( filename ): +
-    filename = filename.lower() +
-    filename = filename.replace('(2d)', '-')         # hyphen +
-    filename = filename.replace('(20)', '_')         # space->underscore +
-    filename = filename.replace('(2e)', '_')         # decimal point->underscore +
-    filename = filename.replace('(29)', '_')         # )->underscore +
-    filename = filename.replace('(28)', '_')         # (->underscore +
-    filename = filename.replace('.', '_')            # decimal point->underscore +
-    filename = filename.replace('(2c20)', '_')       # comma + space->underscore +
-    filename = filename.replace('(2028)', '_')       # space + (->underscore +
-    filename = filename.replace('(2920)', '_')       # ) + space->underscore +
-    filename = filename.replace('(2220)', 'inch_')   # " + space->inch + underscore +
-    filename = filename.replace('(3a20)', '_')       # : + space->underscore +
-    filename = filename.replace('(202827)', '_')     # space+(+'->underscore +
-    filename = filename.replace('(2720)', '_')       # '+ space->underscore +
-    filename = filename.replace('(c3bc)', 'ue')      # umlaut +
-    filename = filename.replace('(c384)', 'Ae')      # umlaut +
-    filename = filename.replace('(c3a4)', 'ae')      # umlaut +
-    filename = filename.replace('(c3b6)', 'oe')      # umlaut +
-    return filename +
- +
-# +
-# "main" starts here +
-# +
-if len(sys.argv) > 1: +
-    if sys.argv[1] in ('-h', '--help'): +
-        print_help() +
-    elif len(sys.argv) > 2: +
-        moin_pages_dir = sys.argv[1] +
-        output_dir = sys.argv[2] +
-    else: +
-        print_parameter_error() +
-else: +
-    print_parameter_error() +
- +
-check_dirs(moin_pages_dir, output_dir) +
- +
-print 'Input dir is: %s.' % moin_pages_dir +
-print 'Output dir is: %s.' % output_dir +
- +
-pathnames = get_path_names(moin_pages_dir) +
- +
-for pathname in pathnames: +
-    #pdb.set_trace() # start debugging here +
- +
-    curr_rev = get_current_revision( pathname ) +
-    if not os.path.exists( curr_rev ) : continue +
- +
-    page_name = basename(pathname) +
-    if page_name.count('MoinEditorBackup') > 0 : continue # don't convert backups +
- +
-    curr_rev_desc = file(curr_rev, 'r') +
-    curr_rev_content = curr_rev_desc.readlines() +
-    curr_rev_desc.close() +
- +
-    page_name = fix_name( page_name ) +
- +
-    split = page_name.split('(2f)') # namespaces +
- +
-    count = len(split) +
- +
-    dateiname = split[-1] +
- +
-    dir = output_dir +
-    # changed from attachment_dir = output_dir + '../media/': +
-    attachment_dir = output_dir + 'media/' +
-    if not isdir (attachment_dir): +
-      os.mkdir(attachment_dir) +
- +
-    if count == 1: +
-      dir += 'unsorted' +
-      if not isdir (dir): +
-        os.mkdir(dir) +
- +
-      attachment_dir += 'unsorted/' +
-      if not isdir (attachment_dir): +
-        os.mkdir(attachment_dir) +
- +
-    for i in range(0, count - 1): +
- +
-      dir += split[i] + '/' +
-      if not isdir (dir): +
-        os.mkdir(dir) +
- +
-      attachment_dir += split[i] + '/' +
-      if not isdir (attachment_dir): +
-        os.mkdir(attachment_dir) +
- +
-    if count == 1: +
-      str = 'unsorted/' + page_name +
-      split = str.split('/') +
-      curr_rev_content = convert_page(curr_rev_content, split) +
-    else: +
-      curr_rev_content = convert_page(curr_rev_content, split) +
- +
-    out_file = os.path.join(dir, dateiname + '.txt') +
-    out_desc = file(out_file, 'w') +
-    out_desc.writelines([it.rstrip() + '\n' for it in curr_rev_content if it]) +
-    out_desc.close() +
- +
-    # pdb.set_trace() # start debugging here +
-    copy_attachments(pathname, attachment_dir) +
-</code> +
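
The ''fix_name'' function above lists escape sequences one by one. A more general approach, sketched here under the assumption that MoinMoin stores every quoted run of bytes as hex digits in parentheses (as ''(2d)'' and ''(c3bc)'' in that list suggest), is to decode any such run as UTF-8. ''unquote_moin_name'' is a hypothetical helper written for Python 3; it is not part of moin2doku or of the script above.

```python
import re

# A parenthesised run of hex digit pairs, e.g. "(2d)" or "(c3bc)".
_HEX_RUN = re.compile(r"\(((?:[0-9a-fA-F]{2})+)\)")


def unquote_moin_name(name):
    """Decode '(hex)' runs in a MoinMoin page directory name to text.

    Assumption: each run holds UTF-8 encoded bytes. Bytes that do not
    decode are replaced with U+FFFD instead of raising an error.
    """
    def decode(match):
        raw = bytes.fromhex(match.group(1))
        return raw.decode("utf-8", errors="replace")

    return _HEX_RUN.sub(decode, name)


if __name__ == "__main__":
    for sample in ("Front(20)Page", "Projects(2f)Wiki(2d)Migration"):
        print(unquote_moin_name(sample))
```

The decoded name can then be lowercased and split on ''/'' to derive DokuWiki namespaces, which is what the script does when it splits on ''(2f)''.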
- +
-===== Perl ===== +
- +
-I've written a more powerful conversion script. It now correctly converts (I think ;-) ) all syntax from [[http://moinmo.in/HelpOnEditing]] except tables (it does not yet convert alignment and spans). You can get the latest version [[http://help.ubuntu.ru/wiki/moinmoin2dokuwiki|here]]; just copy all the code from the code block and replace %%<!/code>%% with %%</code>%%. +
  
 ===== Discussion =====
- 
  
 > Why did you switch from MoinMoin to DokuWiki?  Just curious, I'm debating between the two and MoinMoin's WYSIWYG editor is very nice, and big sites like fedoraproject.org and ubuntu.com are using MoinMoin.  - posted on 1/16/2006
 >> Because MoinMoin is **not as stable** as it looks like? You know the [[http://www.ubuntuusers.de/ikhaya/443/|Ubuntuusers Wiki]]-case? - posted on 04/26/2007
->>> I've added a Perl script which converts all syntax from [[http://moinmo.in/HelpOnEditing]]. Please report any errors you find to [[malamut@ubuntu.ru|me]]. +
 +Has anyone used this successfully to convert from MoinMoin 1.9.*? 
 +>> I needed to migrate, because I am upgrading my servers from Debian-buster to Debian-bookworm. 
 +>> I tried to modify moin2doku to use it with MoinMoin 1.9, but I was not able to, because there have been too many changes in MoinMoin. But I found a workaround. I have installed a migration-KVM with:
 +>>  * Debian-stretch from https://cdimage.debian.org/cdimage/archive/9.13.0/ 
 +>>I have manually added the following software:
 +>>  * python 2.5.6 from https://www.python.org/ftp/python/2.5.6/Python-2.5.6.tgz 
 +>>  * MoinMoin 1.5.9 from https://master.dl.sourceforge.net/project/moin/moin/1.5.9/moin-1.5.9.tar.gz 
 +>>  * DokuWiki 2017-02-19e from https://download.dokuwiki.org/src/dokuwiki/dokuwiki-2017-02-19e.tgz 
 +>>I was able to copy my Debian-buster MoinMoin 1.9 data to this migration-KVM, convert it with moin2doku, and then copy it to my Debian-bookworm dokuwiki 0.0.20220731.a-2. But I had to make some changes in moin2doku. I remember:
 +>>  * doku.php: deleted line ''require_once DOKU_INC.'inc/cliopts.php';'' 
 +>>  * moin2doku.py and moinformat.py: added line ''%%from __future__ import with_statement%%'' 
 +>> Thank you very much for moin2doku.
tips/moinmoin2doku.1297012166.txt.gz · Last modified: 2011-02-06 18:09 by glen

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International