Differences

This shows you the differences between two versions of the page.

--- tips:pdfexport:htmldoc [2009-10-31 16:57] – Added a recursive version of the "HTMLDOC_variant" 90.184.79.243
+++ tips:pdfexport:htmldoc [2023-03-08 08:21] (current) – 2409:4070:4387:9f60:5d3e:e9b4:db59:c826
@@ Line 150: / Line 150: @@
 header("Content-Disposition: attachment; filename=wikiexport" . str_replace(':','_',$_GET["id"]) . ".pdf");
 </code>
-To retrieve images from the wiki server (relative links, hope that it won't cause security issues) (I had problems with PNG files, so I converted them into JPEG format).
+To retrieve images from the wiki server via relative links (I hope this does not cause security issues; I had problems with PNG files, so I converted them to JPEG format):
 <code xml>
 $text = preg_replace("'<img src=\"/(.*?)/lib/exe/fetch.php(.*?)media=(.*?)\"(.*?)>'si","<img src=\"http://" . $_SERVER['SERVER_NAME'] . "/\\1/data/media/\\3\">", $text); # for uploaded images
@@ Line 520: / Line 520: @@
 header("Content-Disposition: attachment; filename=".str_replace(' ','_',$conf['title']).'-'.end(split('/',$_GET["id"])).".pdf");
 </code>
 ====== HTMLDOC recursive variant ======
-My problem was that i needed support for child page export. It therefore choose to modify / hack [[#An_HTMLDOC_variant]] found on this page. Some of the remarks / improvements to [[#An_HTMLDOC_variant]] has also been included.
+My problem was that I needed support for child page export. I therefore chose to modify / hack [[#An_HTMLDOC_variant]] found on this page. Some of the remarks / improvements to [[#An_HTMLDOC_variant]] have also been included.
-It will thus perform a recursive export of your current page. This means, that any internal links will be followed and converted to PDF too. The internal links should copied to the PDF - meaning that they are click-able like they are in dokuwiki.
+It will thus perform a recursive export of your current page. This means that any internal links will be followed and converted to PDF too. The internal links should be copied to the PDF, meaning that they are clickable like they are in DokuWiki.
-  * Follow the first 4 steps of [[#HTMLDOC]] (on this page)
+  * Follow the first steps of [[#An_HTMLDOC_variant]] (on this page)
   * Then insert this into "inc/common.php":<code php>
 function pdfmake($text)
@@ Line 536: / Line 535: @@
   $pdfmake_links = array();
-  // This controls the depth in which it will search for subpages
+  $pdfmake_recursion_level = 30;
-  $pdfmake_recursion_level = 10;
   $pdfmake_recursion_current = 0;
-  // Now search for children.
+	// Now search for children.
-  $text = pdfmake_children($text);
+	$text = pdfmake_children($text);
-  // And create the pdf
+	// And create the pdf
-  pdfmake_inner($text);
+	pdfmake_inner($text);
 }
 function pdfmake_inner($text){
@@ Line 555: / Line 553: @@
 # Convert text and toctitle to destination code-page
-  $text=iconv("utf-8",$conf['pdfcp'],$text);
+  $text=iconv("utf-8",$conf['pdfcp'].'//TRANSLIT',$text);
 # Change toctitle if needed
   if ($conf['customtoc']) {
@@ Line 652: / Line 650: @@
 #convert using htmldoc
-  $command = $conf['htmldocdir'] . "htmldoc " . $pdf . $width . $jpeg . " --charset " . "--webpage" . $pdfcp . " --no-title " . $fontparam . " --toctitle \"" . $toctitle . "\" -f " . $filenameOutput . " " . $filenameInput;
+  $command = $conf['htmldocdir'] . "htmldoc " . $pdf . $width . $jpeg . " --charset ". $pdfcp  . " --no-title " . $fontparam . " --toctitle \"" . $toctitle . "\" -f " . $filenameOutput . " " . $filenameInput;
   system($command);
   system("exit(0)");
@@ Line 661: / Line 660: @@
   header("Content-Disposition: attachment; filename=dokuwikiexport_" . str_replace(':','_',$_GET["id"]) . ".pdf");
   $fd = @fopen($filenameOutput,"r");
+  //Bail out if the output file could not be opened
+  if($fd === false)
+  {
+    print 'Output file cannot be opened';
+    exit;
+  }
   while(!feof($fd)){
     echo fread($fd,2048);
@@ Line 667: / Line 673: @@
 #clean up temporary files
-  //system("rm " . $filenameInput);
+  system("rm " . $filenameInput);
-  //system("rm " . $filenameOutput);
+  system("rm " . $filenameOutput);
 }
@@ Line 681: / Line 687: @@
   $links = array();
-  $pdfmake_recursion_current += 1;
+ 	$pdfmake_recursion_current += 1;
-  //will contain all subpages at the end.
+ 	//echo 'Current recursion level: ', $pdfmake_recursion_current, '<br>';
-  $innerText = '';
+ 	//will contain all subpages at the end.
+ 	$innerText = '';
-  //find all links on page
-  $regex_pattern = "/<a href=\"(.*)\">(.*)<\/a>/";
+	//find all links on page
-  preg_match_all($regex_pattern,$text,$matches);
+	$regex_pattern = "/<a href=\"(.*)\">(.*)<\/a>/";
+	preg_match_all($regex_pattern,$text,$matches);
   //The matching pairs will be listed in matches[1]. Sort these matches, so that subnamespaces come before their parent namespaces.
-  sort($matches[1]);
+  //sort($matches[1]);
-  for($i=0; $i< count($matches[1]); $i++) {
+	for($i=0; $i< count($matches[1]); $i++) {
     //extract the internal dokuwiki id of the subpage. This is needed to perform the rendering
     $link = substr($matches[1][$i], stripos($matches[1][$i],'title=')+7);
+   // echo $link, '<br>';
     //Dont add a page which has already been included
     if(!in_array($link, $pdfmake_links)) {
-      // Call the dokuwiki renderer, if the link does not start with http (then it is not an internal link)
+   	  // Call the dokuwiki renderer, if the link does not start with http (then it is not an internal link)
-      if(substr($link, 0, 4) != 'http') {
+   	  if(substr($link, 0, 4) != 'http') {
-      $innerText .= p_wiki_xhtml($link,'',false);
+	      $innerText .= p_wiki_xhtml($link,'',false);
-      //Add the link to the collection so it can be sanitized later.
+        //Add the link to the collection so it can be sanitized later.
-      $pdfmake_links[] = $link;
+        $pdfmake_links[] = $link;
-      $links[] = $link;
+        $links[] = $link;
-      }
+	    }
-    }
+	  }
-  }
+	}
 	//Recurse into the next level of internal links
   if($pdfmake_recursion_current < $pdfmake_recursion_level) {
+      //echo "inside recursion<br>";
     $innerText = pdfmake_children($innerText);
   }
@@ Line 730: / Line 738: @@
 Remember that I only tested this on my own server (on which it works). So expect bugs and / or strange behavior.
+===== Bug fixes =====
+The following bugs have been fixed:
+  * 2009-10-31:
+    * Fixed a bug in the command line which, on some pages, caused the PDF generation to fail.
+    * Fixed a bug where unconvertible UTF-8 characters (such as -> and <-) broke PDF generation.
  --- //[[nicklas.overgaard@gmail.com| Nicklas Overgaard]] 2009-10-31 16:45 GMT+1 //
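A note on the command-line fix listed above: the old line appended ''"--webpage"'' directly after ''" --charset "'', so the code page name was fused onto ''--webpage'' instead of being passed as the charset. A minimal sketch of a more defensive way to assemble the call follows. It assumes a Unix-like shell; the function name ''pdfmake_build_command'', its parameters and the sample values are hypothetical and not part of the variant above. Only htmldoc options that already appear on this page are used, and the ''$pdf'', ''$width'', ''$jpeg'' and ''$fontparam'' fragments are left out for brevity.

<code php>
<?php
// Hypothetical helper (not part of the original hack): assembles the htmldoc
// call and quotes every variable argument with escapeshellarg(), so spaces or
// quotes in the TOC title or file names cannot break the command line.
function pdfmake_build_command($htmldocdir, $charset, $toctitle, $output, $input)
{
    $parts = array(
        escapeshellarg($htmldocdir . 'htmldoc'),
        '--charset', escapeshellarg($charset),
        '--no-title',
        '--toctitle', escapeshellarg($toctitle),
        '-f', escapeshellarg($output),
        escapeshellarg($input),
    );
    // Joining with single spaces keeps each option and its value adjacent.
    return implode(' ', $parts);
}

// Example values only; adjust the paths and code page to your installation.
echo pdfmake_build_command('/usr/bin/', 'iso-8859-1', 'Table of contents',
                           '/tmp/out.pdf', '/tmp/in.html'), "\n";
</code>

Keeping each option next to its value in one array makes a misplaced flag (like the stray ''--webpage'') easier to spot than in a long string concatenation.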
 ====== HTMLDOC and OS X ======
@@ Line 775: / Line 790: @@
   $filenameOutput=tempnam('','pdf');
 </code>
 ====== HTMLDOC request ======
 I think it would be very useful to be able to create a page listing the wiki pages to export, and have HTMLDOC export all of these pages into one PDF file.\\
@@ Line 789: / Line 803: @@
 That way you could create pages from which a PDF file covering several wiki pages can be extracted.
+**Check the** [[#HTMLDOC_recursive_variant]]; it should support the requested feature.
 ===== Config problem with HTMLDOC variant =====
@@ Line 798: / Line 814: @@
 you have to declare all values in your ''config.metadata.php''
+===== Changes to the TOC =====
+Some recent changes in the core will break all the TOC-related code above, because [[https://github.com/dokuwiki/dokuwiki/commit/d5acc30de20298eb6ed7545e70484599c4d95867|the HTML for the TOC has been rewritten]]. The changes will be part of DokuWiki from the next release on (autumn 2012).