Sourcerer Implementation Notes
From Geohashing
Revision as of 15:10, 9 February 2015 by imported>Sourcerer (→Added Category)
Plan of Campaign
My plan is to create some new analysis tools for this wiki. This page might help others to implement their tools.
I'm using a standard PHP 5.3.5 installation on Windows 7 and running the examples from the command line. Linux or other PHP versions should work the same way.
- Get some page content ("Hello World!").
- Write some code (perhaps PHP command line code) to do the same. Make sure this does not overload the wiki.
- Download different kinds of page.
- Do some simple statistics on the downloaded data.
- Create reports in wiki markup.
- If it works, upload them to the wiki.
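The statistics and report steps in the plan could be sketched roughly as follows. This is only a minimal sketch, not the finished tools: the function names `countTitlesByYear` and `buildWikiTable` are hypothetical, the sample titles are illustrative rather than downloaded data, and it assumes that the titles of interest start with a date in `YYYY-MM-DD` form, as in `2015-02-07_52_1`.

```php
<?php
// Hypothetical helper: count titles per year, assuming each title of interest
// starts with a YYYY-MM-DD date. Titles that do not match are skipped.
function countTitlesByYear(array $titles)
{
    $counts = array();
    foreach ($titles as $title) {
        if (preg_match('/^(\d{4})-\d{2}-\d{2}/', $title, $m)) {
            $year = $m[1];
            $counts[$year] = isset($counts[$year]) ? $counts[$year] + 1 : 1;
        }
    }
    ksort($counts); // Oldest year first
    return $counts;
}

// Hypothetical helper: render the counts as a MediaWiki table.
function buildWikiTable(array $counts)
{
    $lines = array('{| class="wikitable"', '! Year !! Pages');
    foreach ($counts as $year => $count) {
        $lines[] = '|-';
        $lines[] = "| $year || $count";
    }
    $lines[] = '|}';
    return implode("\n", $lines);
}

// Illustrative titles only, not real downloaded data.
$sampleTitles = array('2015-02-07_52_1', '2015-02-08 52 1', '2014-12-31 52 1', 'User:Example');
echo buildWikiTable(countTitlesByYear($sampleTitles)) . "\n";
```

The output is plain wiki markup, so it can be checked on the command line before anything is uploaded.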
Various Downloads
Help Page
JSON Page Content Dump
List first twenty of Consecutive geohash achievement
List twenty more of Consecutive geohash achievement
PHP Code - List first twenty of Consecutive geohash achievement - Command line application
<?php
$html = file_get_contents('http://wiki.xkcd.com/wgh/api.php?action=query&list=categorymembers&cmtitle=Category:Consecutive_geohash_achievement&cmlimit=20');
echo $html;
?>
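Long query strings like the one above are easy to mistype. One option, sketched here, is to build the URL from an array with PHP's `http_build_query`, which also URL-encodes the values. The helper name `buildApiUrl` is an assumption made for illustration and is not part of the original script.

```php
<?php
// Hypothetical helper: build an api.php URL from an array of parameters.
function buildApiUrl($apiUrl, array $params)
{
    // Pass '&' explicitly so the result does not depend on php.ini settings.
    return $apiUrl . '?' . http_build_query($params, '', '&');
}

$url = buildApiUrl('http://wiki.xkcd.com/wgh/api.php', array(
    'action'  => 'query',
    'list'    => 'categorymembers',
    'cmtitle' => 'Category:Consecutive_geohash_achievement',
    'cmlimit' => 20,
));
echo $url . "\n";
```

The colon in the category title comes out as `%3A`; the resulting string can be passed to `file_get_contents` as before.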
PHP Code - As above but use JSON
<?php
$json = file_get_contents('http://wiki.xkcd.com/wgh/api.php?action=query&format=json&list=categorymembers&cmtitle=Category:Consecutive_geohash_achievement&cmlimit=20');
$readable = json_decode($json, true);
print_r($readable);
?>
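`print_r` dumps the whole decoded structure. To pull out just the page titles, something like the following sketch could be used. The function name `extractCategoryTitles` is hypothetical, and the sample response is a hand-written stand-in (with invented page ids) in the shape the API returns, so no request is sent.

```php
<?php
// Hypothetical helper: collect the 'title' of every entry under
// query -> categorymembers in a decoded API response.
function extractCategoryTitles($decoded)
{
    $titles = array();
    if (!is_array($decoded) || !isset($decoded['query']['categorymembers'])) {
        return $titles; // Failed download, bad JSON, or an API error
    }
    foreach ($decoded['query']['categorymembers'] as $member) {
        $titles[] = $member['title'];
    }
    return $titles;
}

// Hand-written sample in the shape of a categorymembers response (ids are invented).
$sampleJson = '{"query":{"categorymembers":['
    . '{"pageid":1,"ns":0,"title":"2015-02-07 52 1"},'
    . '{"pageid":2,"ns":2,"title":"User:Example"}]}}';
print_r(extractCategoryTitles(json_decode($sampleJson, true)));
```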
PHP Expedition as JSON
<?php
$json = file_get_contents('http://wiki.xkcd.com/wgh/api.php?format=json&action=query&titles=2015-02-07_52_1&prop=revisions&rvprop=content');
$readable = json_decode($json, true);
print_r($readable);
?>
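In the decoded response the wikitext sits several levels down. The sketch below assumes the older response layout, in which pages are keyed by page id under `query` and `pages`, and the content of each revision is stored under the key `*`; other MediaWiki versions or settings may nest it differently. The function name `extractWikitext` and the sample data are illustrative only.

```php
<?php
// Hypothetical helper: return the wikitext of the first page in a decoded
// prop=revisions response, or null if it is not there.
// Assumes the legacy layout: query -> pages -> <pageid> -> revisions[0]['*'].
function extractWikitext($decoded)
{
    if (!is_array($decoded) || !isset($decoded['query']['pages'])) {
        return null;
    }
    foreach ($decoded['query']['pages'] as $page) {
        if (isset($page['revisions'][0]['*'])) {
            return $page['revisions'][0]['*'];
        }
    }
    return null; // Page missing or no content returned
}

// Hand-written sample in the assumed layout (page id and text are invented).
$sampleJson = '{"query":{"pages":{"123":{"pageid":123,"title":"2015-02-07 52 1",'
    . '"revisions":[{"*":"Example wikitext"}]}}}}';
echo extractWikitext(json_decode($sampleJson, true)) . "\n";
```

Returning `null` for a missing page lets the caller skip it instead of treating an empty result as page content.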
PHP Get every page title for a specified category
This fetches 100 page titles at a time, pausing 5 seconds between fetches to avoid overloading the wiki.
<?php
// =======================================================================
// === These constant values must be set before running the script =======
// =======================================================================
define("API_URL", 'http://wiki.xkcd.com/wgh/api.php'); // Search any wiki page source for "api.php" to find this path on other MediaWiki sites

// =======================================================================
// Return an array containing the cmcontinue value followed by page titles
// Index [0] contains cmcontinue needed to get the next page
// Index [1] up to [cmlimit] contains the page titles, could be zero items
// =======================================================================
function getTitlesInCategory($cmtitle, $cmlimit = 10, $cmcontinue = "") {
    if ($cmcontinue == "") {
        $continue = "";
    } else {
        $continue = "&cmcontinue=" . urlencode($cmcontinue); // The value can contain characters such as "|"
    }
    $url = API_URL . "?action=query&format=json&list=categorymembers&cmtitle=" . urlencode($cmtitle) . "&cmlimit=$cmlimit$continue";
    echo $url . "\n";

    $titles = array("");            // Index [0]: empty string means there is no next page
    $json = file_get_contents($url);
    if ($json === false) {
        return $titles;             // Download failed, return no titles
    }
    $decodedjson = json_decode($json, true);

    if (isset($decodedjson['query-continue']['categorymembers']['cmcontinue'])) {
        $titles[0] = $decodedjson['query-continue']['categorymembers']['cmcontinue']; // Next page if there is one
    } elseif (isset($decodedjson['continue']['cmcontinue'])) {
        $titles[0] = $decodedjson['continue']['cmcontinue']; // Newer MediaWiki versions use this key instead
    }
    if (isset($decodedjson['query']['categorymembers'])) {
        foreach ($decodedjson['query']['categorymembers'] as $value) {
            $titles[] = $value['title'];
        }
    }
    return $titles;
}
// =======================================================================

// =======================================================================
// Return an array containing ALL page titles for the category
// =======================================================================
function getAllTitlesInCategory($cmtitle) {
    $allTitles = array();
    $titles = getTitlesInCategory($cmtitle, 100);
    // print_r($titles);
    $cmcontinue = $titles[0];
    unset($titles[0]);
    $allTitles = array_merge($allTitles, $titles);
    while ($cmcontinue != "") {
        sleep(5); // Pause between fetches to avoid overloading the wiki
        $titles = getTitlesInCategory($cmtitle, 100, $cmcontinue);
        // print_r($titles);
        $cmcontinue = $titles[0];
        unset($titles[0]);
        $allTitles = array_merge($allTitles, $titles);
    }
    return $allTitles;
}
// =======================================================================

// =======================================================================
// main program
// =======================================================================
print_r(getAllTitlesInCategory("Category:Consecutive_geohash_achievement"));
?>