This is an old revision of the document!


scrape Plugin

Compatible with DokuWiki

No compatibility info given!

Plugin: includes HTML parts from other websites in the wiki

Last updated on: 2011-06-28
Provides: Syntax
Repository: Source

This extension has not been updated in over 2 years. It may no longer be maintained or supported and may have compatibility issues.

Tagged with html, include, jquery

This plugin lets you include HTML scraped from another website. The part to include is selected with a jQuery-like expression. To prevent abuse, all scraped HTML is purified of malicious code, and only whitelisted URLs can be scraped.

Installation

Install the plugin using the Plugin Manager and the download URL above, which points to the latest version of the plugin. Refer to Plugins for instructions on installing plugins manually.

Configuration

Every URL that should be scrapable must match a regular expression defined in the plugin's configuration.
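
As a rough illustration of how such a whitelist check behaves, here is a minimal Python sketch. It is not the plugin's own code (the plugin is written in PHP), and the pattern, the example URLs, and the name `is_scrapable` are all assumptions made up for this example.

```python
import re

# Hypothetical whitelist pattern; the real value is whatever you enter
# in the plugin's configuration.
ALLOWED_URL_PATTERN = r"^https?://(www\.)?example\.com/"


def is_scrapable(url: str, pattern: str = ALLOWED_URL_PATTERN) -> bool:
    """Return True if the URL matches the whitelist regular expression."""
    return re.search(pattern, url) is not None


if __name__ == "__main__":
    print(is_scrapable("https://www.example.com/news"))
    print(is_scrapable("https://evil.example.org/"))
```

Anchoring the pattern with `^` and including the trailing `/` after the host name is a deliberate choice in this sketch: an unanchored pattern such as `example\.com` would also match URLs that merely contain that string somewhere in their path or query.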

Syntax/Usage

The general syntax is: {{scrape>url query|title}}.

url is the URL of the website you want to scrape. It must match the regular expression given in the config.

query is the jQuery-like query that selects a page element on the given website. See the phpQuery manual for the available selectors. When you end your query with a ~, the innerHTML of the match is used; otherwise the matched wrapping element itself is part of the output. When no query is given, body ~ is used.

title is only used when your query matches the URL of an image file. In that case the image is embedded and the given title is added. The title is optional.
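
To show how the three parts fit together, the following Python sketch splits a scrape call into url, query, and title according to the rules described above. It is only an illustration under stated assumptions, not the plugin's actual parser: the names `ScrapeCall` and `parse_scrape` are hypothetical, the example URLs are made up, and the sketch assumes that the url contains no whitespace and that the title starts at the first `|`.

```python
import re
from typing import NamedTuple, Optional


class ScrapeCall(NamedTuple):
    url: str
    query: str             # selector without the trailing "~"
    inner: bool            # True when the innerHTML of the match is requested
    title: Optional[str]   # None when no title was given


# Matches the whole {{scrape>...}} call and captures everything inside it.
SCRAPE_RE = re.compile(r"^\{\{scrape>(.*?)\}\}$", re.DOTALL)


def parse_scrape(markup: str) -> ScrapeCall:
    """Split a scrape call into its parts (hypothetical helper)."""
    m = SCRAPE_RE.match(markup.strip())
    if m is None:
        raise ValueError("not a scrape call")

    # Assumption: the title is everything after the first "|".
    body, sep, title = m.group(1).partition("|")

    # Assumption: the url ends at the first whitespace; the rest is the query.
    parts = body.strip().split(None, 1)
    if not parts:
        raise ValueError("missing url")
    url = parts[0]
    query = parts[1].strip() if len(parts) > 1 else "body ~"  # documented default

    # A trailing "~" asks for the innerHTML of the match.
    inner = query.endswith("~")
    if inner:
        query = query[:-1].strip()

    clean_title = (title.strip() or None) if sep else None
    return ScrapeCall(url, query, inner, clean_title)


if __name__ == "__main__":
    print(parse_scrape("{{scrape>http://example.com/page div#content ~|My title}}"))
    print(parse_scrape("{{scrape>http://example.com/page}}"))
```

In this sketch, a call without a query falls back to `body ~`, so the parsed result has the query `body` with the inner flag set, which mirrors the default described above.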

FIXME examples should be added

plugin/scrape.1404546289.txt.gz · Last modified: 2014-07-05 09:44 by andi

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International