Difference between revisions of "User:AperfectBot"

From Geohashing
imported>Aperfectring
(2009-06-20)
(under new management)
 
(78 intermediate revisions by 7 users not shown)
Line 1: Line 1:
This bot is owned by [[User:Aperfectring|aperfectring]].  It is an implementation of pywikipediabot, and uses some code <s>stolen from</s> graciously donated by [[User:ReletBot|relet]]. Its job is to maintain the future and recent past [[Geo Hashing:Current events|planning pages lists]], and to create new planning pages upon request.
+
{{quote||The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.|Eliezer Yudkowsky}}
  
==What should the Bot be named?==
+
This bot was created by [[User:Aperfectring|aperfectring]], who also maintained and hosted it until 2023. Since then, it is hosted by [[User:Fippe|Fippe]]. It is implemented on top of pywikipediabot, and its job is to maintain the daily expeditions lists, as well as the expedition lists on [[Geohashing:Current events]]. It also is able to create both per-user and per-graticule expedition lists.  See [[Help:AperfectBot]] for more information.
The most pressing issue right now is what to call my bot.  Voice your opinion or add new name suggestions below! --[[User:Aperfectring|aperfectring]] 18:55, 12 June 2009 (UTC)
 
*ApeRobot - Vaguely similar to my nick, and my favorite option.
 
*APRBot - Much more representative of who owns it.
 
*RingBot
 
*[[User:AperfectBot|AperfectBot]] - '''Robyn's''' favorite. relet's too. <---Winner
 
  
:Considering that I didn't realize who "APR" was in the chatroom for quite a while, and that the first line on the bot's page will be something like "This is a bot owned by Aperfectring," you might as well go with your first choice. Also: Ringbot, Aperfectbot. (The last is my favourite). -[[User:Robyn|Robyn]] 19:01, 12 June 2009 (UTC)
+
== Much Thanks ==
 
+
* The bot uses some code <s>stolen from</s> graciously donated by [[User:ReletBot|relet]], and he also provided source control before the move to github.  We have started to merge some code which we were sharing, to create a common library of useful geohashing wiki functions. We hope these functions will help facilitate easier development of new bots, and also improve the overall quality of all bots which use it.
*BotheRing
+
* [[User:Robyn|Robyn]] came up with the original idea, and provided some great input early in its development.
*SpideRing - '''Xore''''s vote. In my opinion, the name has a certain ring to it that I like. --[[User:Xore|Xore]] 21:29, 12 June 2009 (UTC)
+
* [[User:Jiml|Jiml]] hosted and ran the bot while I moved across the country and scrambled for internet access.
 
+
* All others who helped with the planning and suggestions for improvement.
: My contributions --[[User:Xore|Xore]] 20:40, 12 June 2009 (UTC)
+
The bot wouldn't have been completed nearly so quickly, or gotten to be the dumb ape it is without all of you. Thanks.
::That's the other thing I like about wiki. Everyone pitches in to solve important issues. -[[User:Robyn|Robyn]] 22:56, 12 June 2009 (UTC)
 
*johnny
 
  
 
== How it works at the moment I edited this ==
 
== How it works at the moment I edited this ==
It looks at Category:Expedition_planning, and finds all pages in it which have a title that matches: YYYY-MM-DD lat lon
+
It looks at Category:Meetup on YYYY-MM-DD for the most recent days, and all days where coords are available, and finds all pages in it which have a title that matches: YYYY-MM-DD lat lon
 
 
It looks at each of those pages for users, and a location, it also looks up the graticule name from the All_Graticules page.
 
===Users===
 
It looks for a "people" or "participants" header
 
 
 
If found, it assumes one user per line, and lists the users as one of the two things:
 
 
 
*The User:* tag found at the beginning of the line
 
 
 
*The first word of the line
 
 
 
If no header is found, it looks for all User:* tags, and lists all unique occurrences
 
 
 
If at this point, still no user is found, it assumes there is none, and uses the following text: "Unknown, maybe you?"
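
The user lookup described above can be sketched in Python roughly as follows. This is a minimal illustration, not the bot's actual code: the function name <code>find_users</code>, the regular expressions, and the stripping of list markers are all assumptions made for the example.

```python
import re

# Hypothetical patterns; the real bot may match links and headers differently.
USER_LINK = re.compile(r"\[\[User:([^\]|#]+)")
HEADER = re.compile(r"^=+\s*(.*?)\s*=+\s*$")
FALLBACK = "Unknown, maybe you?"


def find_users(wikitext):
    """Return the list of users for one planning page."""
    lines = wikitext.splitlines()
    section = None
    for idx, line in enumerate(lines):
        m = HEADER.match(line)
        if m and m.group(1).lower() in ("people", "participants"):
            # Collect the section body up to the next header.
            section = []
            for body in lines[idx + 1:]:
                if HEADER.match(body):
                    break
                section.append(body)
            break

    users = []
    if section is not None:
        # One user per line: a leading User: link, else the first word.
        for line in section:
            text = line.lstrip("*#: ").strip()
            if not text:
                continue
            link = USER_LINK.match(text)
            users.append(link.group(1).strip() if link else text.split()[0])
    else:
        # No header: collect every unique User: link on the page.
        for name in USER_LINK.findall(wikitext):
            name = name.strip()
            if name not in users:
                users.append(name)
    return users or [FALLBACK]
```

Keeping the fallback text in one constant makes it easy to change the wording without touching the parsing.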
 
 
 
===Location===
 
It looks for a "location" or "where" header
 
 
 
If found, it takes up to the first 50 characters of the section, and appends ... to the result if the string is more than 50 characters long.
 
  
If not found, it starts at the beginning of the page, and tries that same 50 char thing.
+
It also looks at Category:Expedition_planning for all pages matching YYYY-MM-DD lat lon where the date is further in the future than the latest available coordinates.
  
If still not found, it jumps into the first section and tries again.
+
[[User:AperfectBot/Changing_the_output|Go here]] for information about how it constructs the summary.
  
Finally, if there is still no text, it will use this: "Unknown, why not have a spontaneous adventure?"
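
The truncation and fallback chain for the location text could look something like the sketch below. The names <code>summarize</code> and <code>location_summary</code> are made up for this example, and collapsing whitespace before truncating is an assumption, not something the description above states.

```python
FALLBACK_LOCATION = "Unknown, why not have a spontaneous adventure?"


def summarize(text, limit=50):
    """Keep at most `limit` characters, appending ... when text was cut."""
    text = " ".join(text.split())  # assumption: collapse runs of whitespace
    if len(text) > limit:
        return text[:limit] + "..."
    return text


def location_summary(candidates, limit=50):
    """Try each candidate text in order and use the first non-empty one.

    `candidates` would be, in order: the "location"/"where" section,
    the text at the beginning of the page, and the first section.
    """
    for text in candidates:
        summary = summarize(text, limit)
        if summary:
            return summary
    return FALLBACK_LOCATION
```

Passing the limit as a parameter means raising it from 50 to 75 or 100 is a one-argument change.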
+
The bot takes somewhere around 10-15 minutes to complete one iteration.  Most of this time is spent fetching and writing the pages; the actual processing time is rather minimal.
  
*It looks to me like 50 characters isn't quite enough for a lot of the location descriptions, so I will up it to 75, or maybe 100 today.  Also, I will explicitly strip out any section headers which mysteriously weren't trimmed out before. --[[User:Aperfectring|aperfectring]] 11:51, 17 June 2009 (UTC)
+
== Updates to functionality ==
 +
I'll try to keep this up to date with what changes I make on the Ape, but the best place to look is the github repo, as that will be maintained much more often.
  
===Name===
+
=== 2010-04-28 ===
If the name isn't found in All_Graticules, it calls the graticule "Unknown (lat, lon)"
+
I added dynamic support for holidays, instead of the previous static one, which I never added holidays to.  Holidays included are:
 +
* Geohashing Day
 +
* Mouseover Day
 +
* DJIA holidays (for the day they actually fall on, not when DJIA observes them)
 +
* Easter Sunday (Came for free with DJIA Good Friday holiday)
 +
* Pi Day
 +
* Talk Like A Pirate Day
  
===Summary===
+
The ape should be able to identify these holidays on its own forever!
This seems to be able to produce something meaningful for just about every old planning page where something meaningful can be made. --[[User:Aperfectring|aperfectring]] 02:27, 16 June 2009 (UTC)
 
  
 
==Tasks Remaining==
 
==Tasks Remaining==
 
These are in rough order of importance
 
These are in rough order of importance
* Put the code into source control somewhere
+
* Review and possibly improve the fuzzy logic to the library user list function. (ongoing, for continuous improvement)
* Check for '''Template:Maintained''', and don't write to pages which have it.
+
* Improve transport detection.
* Add the ability to have manual updates to the date sections.
+
* Look into either signing up for access to a server, or convincing one of our friendly neighborhood geohashers with a server to let me have access to update and run this, as well as the Notification program, from it.
* Parse Meetup on *DATE* pages to look for uncategorized expeditions, and categorize them as Expedition planning.
 
** I will include a comment that this category was added by a bot, and if it does not apply, to add at least one of any other appropriate categories for an expedition page
 
** In the bot, this should be done before parsing the Expedition planning page, so that any new expeditions it finds will be added to the list ASAP.
 
* Create a list of graticule and graticule talk pages on which planning occurs
 
** Create a parsing engine for these pages, to be able to include their plans in the list
 
 
* Sort the results for each day using an undetermined key to sort on
 
* Sort the results for each day using an undetermined key to sort on
 
* Let AperfectBot eat bananas
 
* Let AperfectBot eat bananas
  
 
=== Task scheduling ===
 
=== Task scheduling ===
I will use this section to plan out my time in the evening on tasks.  I will probably put in an hour or two of work on most weekdays.  Anything from before 2009-06-17 is included for historical purposes.
+
I will use this section to plan out my time in the evening on tasks.  I will probably put in an hour or two of work on most weekdays.  Anything from before 2009-06-17 is included for historical purposes.  The bot is live!  Anything I will be doing from now on is new features, bug fixes, or improvements to output.
  
==== 2009-06-20 ====
+
==== 2010-01-27 ====
 +
* The window in which user text can be lost is now about 10 seconds per run; each run currently lasts about 6 minutes.
  
* Switching to a new set of categories as follows:
+
==== 2010-01-26 ====
** '''Category:Meetup on YYYY-MM-DD''' for anything from the latest available back to the first in the list
+
* Fixed user texts so that there is a much better chance they will be preserved. The window is still measurable, but much smaller than it used to be.
** '''Category:Expedition planning''' for anything further in the future than the latest available
 
* Still looking for the best way to figure out the last the coords are available for.
 
** My current thought is to use the python implementation posted here.
 
* Status: The above is complete.  The bot also will create empty date stubs now.  I am now looking for input on my update below.
 
  
==== 2009-06-19 ====
+
==== 2010-01-14 ====
* More planning on picking the dates to report.
+
[[User:Aperfectring/Expeditions]] is updating with a new format: <pre>
:* My current thought is to report everything from Expedition planning from three weekdays ago, until the latest available coordinates.  This gives people a bit more time to report on a potentially geohash-busy weekend, but means that the number of days in the recent past list is not constant.  This table assumes no DOW holidays.
 
::{| border="1" cellpadding="5" cellspacing="0"
 
|- bgcolor="lightgrey"
 
!Today (US Eastern Time) !! First day reported !! Last day reported
 
|-
 
|Sunday    || Wednesday || Monday
 
|-
 
|Monday    || Wednesday || Tuesday
 
|-
 
|Tuesday  || Thursday  || Wednesday
 
|-
 
|Wednesday || Friday    || Thursday
 
|-
 
|Thursday  || Monday    || Friday
 
 
|-
 
|-
|Friday    || Tuesday  || Monday
+
|DATE||GRATADD||GRATNAME||PEOPLE||REACHED:[[EXPED|Succeeded]]:[[EXPED|Failed]]:REACHED||LOCATION
|-
+
</pre>
|Saturday  || Wednesday || Monday
+
With the text before and after the update area, this results in a sortable table of expeditions!
|}
 
:* Another option is to have a fixed number of past days in the list (let's say 3), and all days where coordinates are available.  This keeps the recent past list a constant size, but if people are busy geohashing on weekends, their expedition planning could drop off the page before it is reported on.  This table assumes no DOW holidays.
 
::{| border="1" cellpadding="5" cellspacing="0"
 
|- bgcolor="lightgrey"
 
!Today (US Eastern Time) !! First day reported !! Last day reported
 
|-
 
|Sunday    || Thursday  || Monday
 
|-
 
|Monday    || Friday    || Tuesday
 
|-
 
|Tuesday  || Saturday  || Wednesday
 
|-
 
|Wednesday || Sunday    || Thursday
 
|-
 
|Thursday  || Monday    || Friday
 
|-
 
|Friday    || Tuesday  || Monday
 
|-
 
|Saturday  || Wednesday || Monday
 
|}
 
* If I get some decent feedback on which of the above is best, I will begin coding on it.
 
* Figure out how to determine what days there are coordinates available for.
 
* Status: I am now using the first option, and parsing both '''Category:Expedition planning''' and '''Category:Expeditions'''.  It now updates about every 7 minutes with the truncated date list.
 
 
 
==== 2009-06-18 ====
 
* Reverse the sort of the dates
 
* Plan out how to pick the dates to report
 
* Status: first point done, second still in progress.
 
 
 
==== 2009-06-17 ====
 
* Tweak the length of location descriptions
 
* Trim out the extra instances of header boundaries in the location descriptions
 
* Begin work on sectionalizing the results by date
 
* Possibly start the bot in a continuous loop, which means that it will provide updates about every 30 minutes, if needed.  I will leave this going overnight and while I am at work the next day, if I do it.
 
* Status: All of the above complete.  Let me know if the bot misbehaves.  If it starts misbehaving really badly, use the [[User:AperfectBot#Distraction Banana|Distraction Banana]] section below.
 
 
 
==== 2009-06-16 ====
 
* Fix up some location parsing.
 
* Status: Did some work on it, but not a whole lot
 
 
 
==== 2009-06-15 ====
 
* Look for more options as far as people going
 
* Look for more options as far as the location the hashpoint is in
 
* Status: The user list may get a little better with time, but it's quite close at this point.  There is still work to be done on the location.
 
 
 
==== 2009-06-14 ====
 
* Status: 100% less shouting on the page
 
 
 
==== 2009-06-13 ====
 
* More thorough planning
 
* Begin coding in earnest
 
* Status: By the end of the day, I had a very basic parser, which wrote the full contents of '''Category:Expedition planning''' to [[User:AperfectBot/Test_Page| a page on the wiki]].
 
 
 
==== 2009-06-12 ====
 
* Begin preliminary planning
 
 
 
--[[User:Aperfectring|aperfectring]] 12:05, 17 June 2009 (UTC)
 
  
===Other people's thoughts===
+
Added some new options:
If anyone else has ideas for things to be included in the bot, please put them here. Thanks. --[[User:Aperfectring|aperfectring]] 12:05, 17 June 2009 (UTC)
+
PEOPLE:x - where x is any number - Will display at most x people in the list. Otherwise operates exactly like PEOPLE
 +
TRANSICON - Will display icons for all transport options detected.
 +
REACHICON - Will display either a green (SUCCESS) or red (FAILURE) arrow icon
  
:My opinion: move Future to Recent at midnight Hawaiiish time (yeah, I just said that so I could type three Is in a row), include any future, no matter how far, and five days of past. We haven't yet addressed the issue of archiving versus discarding past pasts. -[[User:Robyn|Robyn]] 18:10, 18 June 2009 (UTC)
+
USERTEXT is now preserved across runs.
  
Extra over the weekend is better, but not sufficiently needed that it shouldn't be abandoned if it turns out to be harder to do than you thought. Obviously you need extra future PLANNING pages over the weekend. -[[User:Robyn|Robyn]] 18:08, 19 June 2009 (UTC)
+
Added ability for multiple users/formats to be specified.  See [[User:Relet/Expeditions]] for proof. Users, expedition list pages, and formats need to be specified here: [[User:AperfectBot/User_expedition_lists]]
  
:The extra stuff in the future over a weekend is a given.  I want to figure out a way to do it so that it obeys DOW holidays, so that part may be challenging.  I don't think the variable number of days in the past should be too bad.  --[[User:Aperfectring|aperfectring]] 18:57, 19 June 2009 (UTC)
+
==== 2009-12-02 ====
::Today's output, BTW: really good. Compare with my hand-done Current events. -[[User:Robyn|Robyn]] 19:12, 19 June 2009 (UTC)
+
* I have started work on creating per-user expedition lists.  Currently it is making a list for [[User:Aperfectring|Aperfectring]] [[User:Aperfectring/Expeditions|here]].  The entries of this list are currently generated using the following format, but it is likely to change in the future:
::Mind you, I took five minutes to do it, and that included moving the section for the 18th to the past. -[[User:Robyn|Robyn]] 19:15, 19 June 2009 (UTC)
+
date DATE - gratadd GRATADD - gratname GRATNAME - people PEOPLE - location LOCATION - transport TRANSPORT - reached REACHED:Succeeded:Failed:REACHED - reason REASON - link LINK - exped EXPED - usertext USERTEXT
:::P.S. Do you have a decision/opinion about what to do with the old list? Archive on YYYY-MM-DD pages? (Pro:Already exist, colocates with photos. Con: Some people may think they are messy with the photos) Archive somewhere else? (Pro: no one can complain about you messing up their page. Con: ANOTHER set of pages, not with photos) Delete? (Pro: don't really need them, tidy, can be recreated easily Con: have to be recreated if you want to see them) -[[User:Robyn|Robyn]] 20:14, 19 June 2009 (UTC)
+
* Features:
::::Are you referring to the part of the list which would be removed from [[Geo Hashing:Current events]] when updates occur?
+
** The reached section is replaced first, so it is able to contain other substitutions seamlessly. It is used as follows: REACHED:1:2:REACHED  The text in '1' is used on reached coordinates, '2' otherwise.
:::::'''Yes'''.
+
** USERTEXT will be preserved across updates of the bot, allowing people to make their own comments about their expeditions.
::::If so, while it would be nice to keep it for a brief summary, I don't know how much value it adds.  If we keep it somewhere, what do we do with it? Do we only update it rarely, meaning there could be stale information there? Do we never update it, meaning that red links could start showing up if a page is deleted?
+
** PEOPLE will contain all of the rest of the people who attended the expedition.
:::::I was thinking update it just before archiving, then '''no further updates'''.
+
** The use of a format string allows for great flexibility in how the final output is displayed.
::::Keeping all of the archives pages up to date with a summary list seems like it would be a bit intensive to do. I know it isn't the most user-friendly solution, but just discarding the list might be the best to do, but I am even on the fence about that. --[[User:Aperfectring|aperfectring]] 20:30, 19 June 2009 (UTC)
+
** Expeditions are sorted lexicographically by the expedition page name (YYYY-MM-DD LAT LON).  This has the effect of being chronological by date, then pseudo geographical in the following order: Southern hemisphere first, Northern second. Then roughly equator to poles, though there will be some mixing. After that, it will do Western hemisphere first, Eastern second.  Then roughly prime meridian to "date line".
:::::I was on the fence, but your point about redlinks has pushed me towards the '''"discard"''' side. It's easily re-creatable, by any user, just by going to the Category:Expedition on (date) page. -[[User:Robyn|Robyn]] 22:06, 19 June 2009 (UTC)
+
* Caveats and current implementation holes:
 +
** Hardcoded to only update for Aperfectring.
 +
** Hardcoded to one specific page to update.
 +
** Hardcoded to the format noted above.
 +
** USERTEXT is currently not preserved across updates.
 +
** The page is completely rewritten each time, which prevents people from having non user list text before or after the user list.
  
Some new thoughts based on IRC conversations last night, mostly with Robyn.  All planning pages further in advance than when coordinates are available will be included in the list.  It will probably be easiest to transclude daily auto-generated list pages on a "current hashes" page, which leaves space for manual additions by users.  This "current hashes" page will then itself be transcluded on '''Geo Hashing:Current events'''.  I also think that if this is the method we would take, we could transclude the daily hashes page on the YYYY-MM-DD page within noinclude tags, so the lists don't show up on the monthly page.  While redlinks may show up in this list, I don't think it will happen too often, and after the page disappears from the "current hashes" list, it will no longer be bot-updated, so they could be fixed manually.  I would like people's opinions on this before beginning to implement it, because it will start creating more pages, and I don't want to annoy joannac.  --[[User:Aperfectring|aperfectring]] 14:11, 20 June 2009 (UTC)
+
==== 2009-11-03 ====
 +
* Put the bot into source control
 +
* Added a getSectionRegex function
 +
* Updated the getSection* functions to be able to operate on just the top level sections or all subsections with an option
 +
* Updated the ape to use the common GraticuleDatabase library
 +
* Updated the ape to use the getSection* functions from the library
  
==Positive feedback==
+
==== Older stuff ====
A comment from [[User:Norsemark|Norsemark]] on the Current events page to show you that your hard work is appreciated: "It's a great idea, it's motivating to see that others are planning and more likely to encourage others to submit theirs."
+
Go [[User:AperfectBot/Change_history|here]] to see older change history.
  
 
= EMERGENCY STOP SECTION =  
 
= EMERGENCY STOP SECTION =  

Latest revision as of 19:03, 19 March 2023

The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.
Eliezer Yudkowsky

This bot was created by aperfectring, who also maintained and hosted it until 2023. Since then, it is hosted by Fippe. It is implemented on top of pywikipediabot, and its job is to maintain the daily expeditions lists, as well as the expedition lists on Geohashing:Current events. It also is able to create both per-user and per-graticule expedition lists. See Help:AperfectBot for more information.

Much Thanks

  • The bot uses some code graciously donated by relet, and he also provided source control before the move to github. We have started to merge some code which we were sharing, to create a common library of useful geohashing wiki functions. We hope these functions will help facilitate easier development of new bots, and also improve the overall quality of all bots which use it.
  • Robyn came up with the original idea, and provided some great input early in its development.
  • Jiml hosted and ran the bot while I moved across the country and scrambled for internet access.
  • All others who helped with the planning and suggestions for improvement.

The bot wouldn't have been completed nearly so quickly, or gotten to be the dumb ape it is without all of you. Thanks.

How it works at the moment I edited this

It looks at Category:Meetup on YYYY-MM-DD for the most recent days, and all days where coords are available, and finds all pages in it which have a title that matches: YYYY-MM-DD lat lon

It also looks at Category:Expedition_planning for all pages matching YYYY-MM-DD lat lon where the date is further in the future than the latest available coordinates.
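
As a rough illustration of the title matching described above, the following Python sketch recognizes page titles of the form YYYY-MM-DD lat lon. The pattern and the function name parse_title are assumptions for this example and may not match what the bot really uses. Latitude and longitude are kept as strings so that the graticules "-0" and "0" stay distinct.

```python
import re
from datetime import date

# Hypothetical pattern for "YYYY-MM-DD lat lon" expedition page titles.
TITLE_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2}) (-?\d{1,2}) (-?\d{1,3})$")


def parse_title(title):
    """Return (date, lat, lon) for an expedition title, or None."""
    m = TITLE_RE.match(title)
    if not m:
        return None
    year, month, day = (int(m.group(i)) for i in (1, 2, 3))
    try:
        when = date(year, month, day)
    except ValueError:
        # Looks like a date but is not a real one, e.g. month 13.
        return None
    # Keep lat/lon as text: "-0" and "0" are different graticules.
    return when, m.group(4), m.group(5)
```

Pages in the category whose titles return None here would simply be skipped.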

Go here for information about how it constructs the summary.

The bot takes somewhere around 10-15 minutes to complete one iteration. Most of this time is spent fetching and writing the pages; the actual processing time is rather minimal.

Updates to functionality

I'll try to keep this up to date with what changes I make on the Ape, but the best place to look is the github repo, as that will be maintained much more often.

2010-04-28

I added dynamic support for holidays, instead of the previous static one, which I never added holidays to. Holidays included are:

  • Geohashing Day
  • Mouseover Day
  • DJIA holidays (for the day they actually fall on, not when DJIA observes them)
  • Easter Sunday (Came for free with DJIA Good Friday holiday)
  • Pi Day
  • Talk Like A Pirate Day

The ape should be able to identify these holidays on its own forever!
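
To show how holidays can be computed for any year instead of being listed by hand, here is a small Python sketch. It is not taken from the bot: the function names are invented, Easter is computed with the well-known anonymous Gregorian (Meeus/Jones/Butcher) algorithm, and only holidays whose dates are easy to state (Pi Day, Talk Like A Pirate Day, Good Friday, Easter Sunday) are included.

```python
from datetime import date, timedelta


def easter_sunday(year):
    """Gregorian Easter Sunday via the anonymous (Meeus/Jones/Butcher) algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def holidays(year):
    """Return a name -> date mapping for a few holidays in `year`."""
    easter = easter_sunday(year)
    return {
        "Pi Day": date(year, 3, 14),
        "Talk Like A Pirate Day": date(year, 9, 19),
        # Good Friday is two days before Easter, so Easter comes for free.
        "Good Friday": easter - timedelta(days=2),
        "Easter Sunday": easter,
    }
```

Because everything is derived from the year, nothing needs to be added by hand when a new year starts.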

Tasks Remaining

These are in rough order of importance

  • Review and possibly improve the fuzzy logic to the library user list function. (ongoing, for continuous improvement)
  • Improve transport detection.
  • Look into either signing up for access to a server, or convincing one of our friendly neighborhood geohashers with a server to let me have access to update and run this, as well as the Notification program, from it.
  • Sort the results for each day using an undetermined key to sort on
  • Let AperfectBot eat bananas

Task scheduling

I will use this section to plan out my time in the evening on tasks. I will probably put in an hour or two of work on most weekdays. Anything from before 2009-06-17 is included for historical purposes. The bot is live! Anything I will be doing from now on is new features, bug fixes, or improvements to output.

2010-01-27

  • The window in which user text can be lost is now about 10 seconds per run; each run currently lasts about 6 minutes.

2010-01-26

  • Fixed user texts so that there is a much better chance they will be preserved. The window is still measurable, but much smaller than it used to be.

2010-01-14

User:Aperfectring/Expeditions is updating with a new format:

|-
|DATE||GRATADD||GRATNAME||PEOPLE||REACHED:[[EXPED|Succeeded]]:[[EXPED|Failed]]:REACHED||LOCATION

With the text before and after the update area, this results in a sortable table of expeditions!

Added some new options:

  • PEOPLE:x - where x is any number - Will display at most x people in the list. Otherwise operates exactly like PEOPLE.
  • TRANSICON - Will display icons for all transport options detected.
  • REACHICON - Will display either a green (SUCCESS) or red (FAILURE) arrow icon.
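
A possible way to handle both PEOPLE and PEOPLE:x with one pattern is sketched below in Python. The function name expand_people and the ", " separator are assumptions for the example; the bot's real output format may differ.

```python
import re

# Matches PEOPLE, optionally followed by :x where x is a number.
PEOPLE_RE = re.compile(r"PEOPLE(?::(\d+))?")


def expand_people(fmt, people):
    """Replace PEOPLE / PEOPLE:x in `fmt` with a comma-separated list."""
    def repl(match):
        # No number given means no limit; slicing with None keeps everyone.
        limit = int(match.group(1)) if match.group(1) else None
        return ", ".join(people[:limit])
    return PEOPLE_RE.sub(repl, fmt)
```

For example, with the people Robyn, relet and Xore, the format "PEOPLE:2" would give "Robyn, relet", while plain "PEOPLE" lists all three.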

USERTEXT is now preserved across runs.

Added ability for multiple users/formats to be specified. See User:Relet/Expeditions for proof. Users, expedition list pages, and formats need to be specified here: User:AperfectBot/User_expedition_lists

2009-12-02

  • I have started work on creating per-user expedition lists. Currently it is making a list for Aperfectring here. The entries of this list are currently generated using the following format, but it is likely to change in the future:
date DATE - gratadd GRATADD - gratname GRATNAME - people PEOPLE - location LOCATION - transport TRANSPORT - reached REACHED:Succeeded:Failed:REACHED - reason REASON - link LINK - exped EXPED - usertext USERTEXT
  • Features:
    • The reached section is replaced first, so it is able to contain other substitutions seamlessly. It is used as follows: REACHED:1:2:REACHED The text in '1' is used on reached coordinates, '2' otherwise.
    • USERTEXT will be preserved across updates of the bot, allowing people to make their own comments about their expeditions.
    • PEOPLE will contain all of the rest of the people who attended the expedition.
    • The use of a format string allows for great flexibility in how the final output is displayed.
    • Expeditions are sorted lexicographically by the expedition page name (YYYY-MM-DD LAT LON). This has the effect of being chronological by date, then pseudo geographical in the following order: Southern hemisphere first, Northern second. Then roughly equator to poles, though there will be some mixing. After that, it will do Western hemisphere first, Eastern second. Then roughly prime meridian to "date line".
  • Caveats and current implementation holes:
    • Hardcoded to only update for Aperfectring.
    • Hardcoded to one specific page to update.
    • Hardcoded to the format noted above.
    • USERTEXT is currently not preserved across updates.
    • The page is completely rewritten each time, which prevents people from having non user list text before or after the user list.
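
The format-string handling and the sort order described in this entry can be illustrated with the Python sketch below. It is only a guess at the approach: the function name render_entry and the sample expeditions are made up, and replacing all remaining tokens in a single regex pass is a design choice of this example so that substituted values are never rescanned for tokens.

```python
import re

# REACHED:1:2:REACHED is resolved first, so '1' and '2' may contain tokens.
REACHED_RE = re.compile(r"REACHED:(.*?):(.*?):REACHED")


def render_entry(fmt, fields, reached):
    """Fill one format string for one expedition."""
    chosen = REACHED_RE.sub(
        lambda m: m.group(1) if reached else m.group(2), fmt)
    if not fields:
        return chosen
    # Longest names first, so a longer token is never cut short by a shorter one.
    names = sorted(fields, key=len, reverse=True)
    token_re = re.compile("|".join(re.escape(n) for n in names))
    return token_re.sub(lambda m: fields[m.group(0)], chosen)


FORMAT = ("|DATE||GRATNAME||"
          "REACHED:[[EXPED|Succeeded]]:[[EXPED|Failed]]:REACHED||LOCATION")

# Made-up sample data: (fields, reached) pairs.
expeditions = [
    ({"DATE": "2009-12-02", "GRATNAME": "Portland",
      "EXPED": "2009-12-02 45 -122", "LOCATION": "A field"}, True),
    ({"DATE": "2009-12-02", "GRATNAME": "Sydney",
      "EXPED": "2009-12-02 -33 151", "LOCATION": "A beach"}, False),
]

# Plain string sort on the page name: "-" sorts before digits,
# so the southern hemisphere entry comes first.
rows = [render_entry(FORMAT, fields, reached)
        for fields, reached in sorted(expeditions,
                                      key=lambda item: item[0]["EXPED"])]
```

Here rows holds the Sydney line first, with its link text reading Failed, followed by the Portland line reading Succeeded.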

2009-11-03

  • Put the bot into source control
  • Added a getSectionRegex function
  • Updated the getSection* functions to be able to operate on just the top level sections or all subsections with an option
  • Updated the ape to use the common GraticuleDatabase library
  • Updated the ape to use the getSection* functions from the library

Older stuff

Go here to see older change history.

EMERGENCY STOP SECTION

Putting any text beneath the following header will cause the bot to stop running. Please only do so if the bot is REALLY misbehaving.

Distraction Banana