Sourcerer Implementation Notes
From Geohashing
Revision as of 13:19, 9 February 2015
Plan of Campaign
My plan is to create some new analysis tools for this wiki. This page might help others to implement their tools.
I'm using a standard Windows 7 PHP installation and running the examples from the command line. Linux or other PHP installations should work the same way.
1. Get some page content ("Hello World!").
2. Write some code (perhaps PHP command line code) to do the same. Make sure this does not overload the wiki.
3. Download different kinds of page.
4. Do some simple statistics on the downloaded data.
5. Create reports in wiki markup.
6. If it works, upload them to the wiki.
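The report step of the plan can be sketched in a few lines. The helper below is my own illustration, not part of any existing tool: it turns a label-to-count array into MediaWiki table markup that could later be uploaded.

```php
<?php
// Hypothetical helper (my own sketch, not part of the wiki or its API):
// build MediaWiki table markup from a label => count array.
function buildWikiTable($rows) {
    $out = "{| class=\"wikitable\"\n! Category !! Pages\n";
    foreach ($rows as $label => $count) {
        $out .= "|-\n| $label || $count\n";
    }
    $out .= "|}";
    return $out;
}

echo buildWikiTable(array("Consecutive geohash achievement" => 20)) . "\n";
```

The resulting string can be pasted into any wiki page, or submitted later via the API's edit action.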
Various Downloads
- Help Page
- JSON Page Content Dump
- List first twenty of Consecutive geohash achievement
- List twenty more of Consecutive geohash achievement
PHP Code - List first twenty of Consecutive geohash achievement - Command line application
<?php
$html = file_get_contents('http://wiki.xkcd.com/wgh/api.php?action=query&list=categorymembers&cmtitle=Category:Consecutive_geohash_achievement&cmlimit=20');
echo $html;
?>
PHP Code - As above but use JSON
<?php
$json = file_get_contents('http://wiki.xkcd.com/wgh/api.php?action=query&format=json&list=categorymembers&cmtitle=Category:Consecutive_geohash_achievement&cmlimit=20');
$readable = json_decode($json, true);
print_r($readable);
?>
PHP Expedition as JSON
<?php
$json = file_get_contents('http://wiki.xkcd.com/wgh/api.php?format=json&action=query&titles=2015-02-07_52_1&prop=revisions&rvprop=content');
$readable = json_decode($json, true);
print_r($readable);
?>
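Once decoded, the wikitext itself sits a few levels down in the array. In this MediaWiki version's JSON output the content should be under query → pages → (page id) → revisions[0]['*']; the function and the sample array below are my own sketch, not a real API response.

```php
<?php
// Sketch: pull the raw wikitext out of a decoded query response.
// The nesting (query -> pages -> <pageid> -> revisions[0]['*']) is the
// shape this MediaWiki version returns; the sample array is hand-made.
function getPageWikitext($decoded) {
    $pages = $decoded['query']['pages'];
    $page = reset($pages);              // first (and only) page entry
    return $page['revisions'][0]['*'];
}

$sample = array('query' => array('pages' => array(
    '12345' => array(
        'title' => '2015-02-07 52 1',
        'revisions' => array(array('*' => "== Expedition ==\nWe went there."))
    )
)));
echo getPageWikitext($sample) . "\n";
```

The page id key varies per page, which is why `reset()` is used instead of a hard-coded index.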
PHP Get every page title for a specified category
This fetches 100 page titles at a time, with a 5 second pause between requests to avoid loading the wiki too much.
<?php
// =======================================================================
// === These constant values must be set before running the script =======
// =======================================================================
define("API_URL", 'http://wiki.xkcd.com/wgh/api.php'); // Search any wiki page source for "api.php" to find this path on other MediaWiki sites

// =======================================================================
// Return an array containing the cmcontinue value followed by page titles
// Index [0] contains cmcontinue needed to get the next page
// Index [1] up to [cmlimit] contains the page titles, could be zero items
// =======================================================================
function getTitlesInCategory($cmtitle, $cmlimit = 10, $cmcontinue = "") {
    if ($cmcontinue == "") {
        $continue = "";
    } else {
        $continue = "&cmcontinue=$cmcontinue";
    }
    $url = API_URL . "?action=query&format=json&list=categorymembers&cmtitle=$cmtitle&cmlimit=$cmlimit$continue";
    $json = file_get_contents($url);
    $decodedjson = json_decode($json, true);
    echo $url . "\n";
    $titles = array();
    if (isset($decodedjson['query-continue'])) {
        $titles[] = $decodedjson['query-continue']['categorymembers']['cmcontinue']; // Next page if there is one
    } else {
        $titles[] = "";
    }
    foreach ($decodedjson['query']['categorymembers'] as $value) {
        $titles[] = $value['title'];
    }
    return $titles;
}

// =======================================================================
// Return an array containing ALL page titles for the category
// =======================================================================
function getAllTitlesInCategory($cmtitle) {
    $allTitles = array();
    $titles = getTitlesInCategory($cmtitle, 100);
    // print_r($titles);
    $cmcontinue = $titles[0];
    unset($titles[0]);
    $allTitles = array_merge($allTitles, $titles);
    while ($cmcontinue != "") {
        sleep(5);
        $titles = getTitlesInCategory($cmtitle, 100, $cmcontinue);
        // print_r($titles);
        $cmcontinue = $titles[0];
        unset($titles[0]);
        $allTitles = array_merge($allTitles, $titles);
    }
    return $allTitles;
}

// =======================================================================
// main program
// =======================================================================
print_r(getAllTitlesInCategory("Category:Consecutive_geohash_achievement"));
?>
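The "simple statistics" step of the plan could then run over the title list returned above. The helper below is my own sketch: expedition titles start with an ISO date (e.g. "2015-02-07 52 1"), so counting expeditions per year is just a matter of matching the leading date and tallying; titles that don't start with a date are ignored.

```php
<?php
// Sketch of the statistics step (my own helper, not part of the script
// above): count titles per year by matching the leading ISO date.
function countByYear($titles) {
    $counts = array();
    foreach ($titles as $title) {
        if (preg_match('/^(\d{4})-\d{2}-\d{2}/', $title, $m)) {
            $year = $m[1];
            if (!isset($counts[$year])) {
                $counts[$year] = 0;
            }
            $counts[$year]++;
        }
    }
    ksort($counts);                     // oldest year first
    return $counts;
}

print_r(countByYear(array('2015-02-07 52 1', '2015-02-08 52 1', '2014-12-31 50 8', 'Template:Foo')));
```

A per-graticule count would work the same way, keyed on the coordinates after the date instead of the year.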