Talk:Main Page/Archive 1

Revision as of 20:57, 24 May 2008 by imported>Tjtrumpet2323

This page serves as an archive for clearly handled topics, as Talk:Main Page is getting a little cluttered.

md5 Collisions?

What about collisions in the md5 algorithm?

What about them? In theory, two dates/stock market prices might end up with the same meetup point. Unlikely, and probably not a terribly big deal? Zigdon 07:39, 21 May 2008 (UTC)
There are situations where cryptographic collisions are a problem, but this isn't one of them. md5 is just being used as a pseudorandom number generator here, and it has a very small-entropy seed (someone call Debian!). So its cryptographic strength isn't particularly relevant. --Xkcd 07:43, 21 May 2008 (UTC)xkcd

OMG geohashing predicts future dow prices!

http://irc.peeron.com/xkcd/map/data/2009/05/27 --Ryan the leach 12:33, 21 May 2008 (UTC)

heh yeh i saw that one too... this is deep magic SinJax
Yeah, I totally wasn't testing stuff, <_< Zigdon 14:54, 21 May 2008 (UTC)

Wow, now you only need to break MD5 to get rich! 81.167.17.12 12:40, 21 May 2008 (UTC)

could be future meetup? --Ryan the leach 12:33, 21 May 2008 (UTC)

Fairly inconsistent if so, what if the dow is different. WHAT THEN?!?
It won't ;) --DarkRat
well the tool won't say it's different :P --Ryan the leach 12:33, 21 May 2008 (UTC)

geohash.org

There's already a thing called "geohash" which is a way to represent coordinates with arbitrary precision in a computer and query-friendly format. It's at http://geohash.org/ and it was designed and implemented by Gustavo Niemeyer.

I don't see how that's relevant. We are talking about geohashing. It's a different word.

To decimal

...now, I'm likely just too thick to get it, but how does one convert back from the half md5 hash to decimals?

It goes like this... each digit sits after the point (so the value is below one), and the place values are not 10^-n but 16^-n. So the first hex digit is worth 16^-1, the next 16^-2, and so on, reading from left to right. For the hex number "8d" you get:

f = (8 * 16^-1) + (13 * 16^-2) = 0.55078125, or about 0.5508

i think anyway :) - SinJax


Or, interpret the bytes from md5 as two big-endian 64-bit integers, and divide both by 2^64.
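
Both readings above give the same number, and a short Python sketch may make that easier to see. The function names here (hex_fraction, digit_by_digit, split_offsets) are made up for illustration and are not taken from the map tool; the 32-digit digest in the last line is the one quoted in the debugging output further down this page.

```python
def hex_fraction(hex_digits):
    """Read a string of hex digits as the fractional part 0.<digits> in base 16."""
    return int(hex_digits, 16) / 16 ** len(hex_digits)


def digit_by_digit(hex_digits):
    """Same value, summed one digit at a time with weights 16^-1, 16^-2, ..."""
    return sum(int(d, 16) * 16.0 ** -n for n, d in enumerate(hex_digits, start=1))


def split_offsets(md5_hex):
    """Split a 32-digit MD5 hex digest into two 64-bit halves, each scaled into [0, 1)."""
    return hex_fraction(md5_hex[:16]), hex_fraction(md5_hex[16:])


print(hex_fraction("8d"))    # 0.55078125
print(digit_by_digit("8d"))  # 0.55078125
print(split_offsets("357e5cac88968162" "8fdd754c1a235919"))
```

Dividing a 16-digit half by 16^16 is the same as dividing a 64-bit integer by 2^64, which is why the two descriptions agree.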

Chicago area

In the northern Chicago suburbs, the vast majority of nearby GPS areas according to the implementation are dangerous pools of dihydrogen monoxide. Is there some workaround?

Chicago area residents could just use whichever of the four locations makes the most sense. (See the Chicago page.)
I believe Adelaide in South Australia has a similar problem, but nowhere near as severe.
Dihydrogen monoxide should be avoided at all costs. If you're careful, you could go to the nearest point on the shore of the dihydrogen monoxide pool, but be sure not to ingest any.
While I'm probably the only Milwaukeean who is going to be running around and playing, my region is about 80% Lake Michigan. No good.
My region is a similar amount English Channel and Solent. I guess this just makes the occasional on-land meetup more special - perversely, I think it might actually *keep* people's attention longer. 81.187.153.189 18:45, 21 May 2008 (UTC)

xkcd.com/geohashing Problems?

Hey - I can't get to [1], and none of the graticules' maps are working - is this just me, or the rest of the world, too?

Yes, irc.peeron.com is currently down (as of 20080523 11pm PDT), I know, I'm trying to get hold of someone who can do something about it. Zigdon 07:29, 23 May 2008 (UTC)
Why is xkcd.com/geohashing redirecting to irc.peeron.com anyway? Shouldn't it be moved to xkcd? I think too many active graticulees are melting the server. --Opspin 08:07, 23 May 2008 (UTC)
it's working again!

Difference between the comic and the generator

Zigdon, xkcd:

The hash in the comic for 2005-05-26-10458.68 starts with db9318........

Here's debugging info for the map page's calculation of http://irc.peeron.com/xkcd/map/:

Graticule: (37, -123) - (38, -122)
Market open on 2005-05-26 = 10458.68
MD5(2005-05-26-10458.68 ): 357e5cac889681628fdd754c1a235919
Split: 357e5cac88968162, 8fdd754c1a235919
offset = 0.20895938122029104, 0.5619729338451526
37.20895938122029 -122.56197293384515

On my machine: $ md5 -s "2005-05-26-10458.68"
MD5 ("2005-05-26-10458.68") = db9318c2259923d08b672cb305440f97

Any idea what's going on here? Is that an extra space on the end there, Zig?
--FunkyTuba

   I've found a bunch of opening values at http://irc.peeron.com/xkcd/map/data/
   with carriage returns on the end, which I don't think were being stripped out before
   hashing. The website went down before I could fully confirm, though.
   --ZorMonkey 11:49, 23 May 2008 (UTC)
That was indeed the case, fixed now. Zigdon 14:07, 23 May 2008 (UTC)
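
For anyone hitting the same mismatch later, here is a minimal Python sketch of the effect, assuming the input is simply the date, a hyphen and the opening price. The helper name geohash_md5 is made up for illustration; it is not the code the map page runs.

```python
import hashlib


def geohash_md5(date_text, dow_open_text):
    """Hash "<date>-<opening>" after stripping stray spaces, CRs and newlines."""
    text = f"{date_text.strip()}-{dow_open_text.strip()}"
    return hashlib.md5(text.encode("ascii")).hexdigest()


clean = geohash_md5("2005-05-26", "10458.68")
# Same input, but with a carriage return left on the end of the price.
dirty = hashlib.md5(b"2005-05-26-10458.68\r").hexdigest()

print(clean)           # db9318c2259923d08b672cb305440f97
print(clean == dirty)  # False
```

A single trailing character is enough to change every digit of the hash, so stripping the price before hashing is the safer habit.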

DOW source

What is our source for the DJIA figure? I only ask because http://irc.peeron.com/xkcd/map/dow.js has "data['2008-05-20']=13026.04", and http://irc.peeron.com//xkcd/map/data/2008/05/20 also says, "13026.04"

However, WSJ (http://online.wsj.com/mdc/public/npage/2_3051.html?symbol=DJIA), usually reliable for this kind of thing, has an opening of "12,958.06" for 5/20/2008.

I could be misinterpreting something. Mattflaschen 13:27, 21 May 2008 (UTC)

And http://finance.yahoo.com/q?d=t&s=%5EDJI does have, "13,026.04". Mattflaschen 13:28, 21 May 2008 (UTC)
We get the data from finance.google.com Zigdon 14:55, 21 May 2008 (UTC)

I think this may help explain the problem with opening values. Perhaps the closing value should just be used as that seems a bit more definitive -- someone may want to check the sources named in this section to see if they agree on closing value. If the closing values of all three of Dow, Nikkei and Euronext were summed, this would still allow for surprise most mornings of the week worldwide -- your surprise would just be based on the closing of a market partway around the world. See #Time_Zone_Discussion for discussion of combining multiple indices. —Christian Campbell 19:32, 21 May 2008 (UTC)

Well, I don't know. The article says, "the posted opening price on the Dow will be close to the previous day's closing price (which can be observed by looking at Dow price history) and will not accurately reflect the true opening prices of all its components." But that's a criticism that the posted opening is meaningless (from an economic point of view). It doesn't necessarily imply that different services post different openings.
The three (Google Finance, Yahoo Finance, and WSJ) do agree on today's closing price. However, I recommend we stick with opening and just state a definitive source. Mattflaschen 20:57, 21 May 2008 (UTC)
Okay. The statements above seemed to indicate that different sources post different openings, and I thought the article might explain that.
Regardless of opening vs closing, the definitive source solution seems helpful. —Christian Campbell 21:42, 21 May 2008 (UTC)

I realize that this is just for spontaneity, and that's great, but lest someone decide to use this for something serious, it ought to be noted that the entropy of the DOW isn't very high: the biggest change ever over the course of a single day is -684.81 (http://www.djindexes.com/mdsidx/index.cfm?event=showavgstats#no4), which is about 16 bits. Most daily differences are much smaller, around half that many bits. It's very amenable to a dictionary attack. --Mike Stay

True... adversaries could prepare to target the 32768 likeliest meetup sites tomorrow. I'll risk it.  ;-) —Christian Campbell 19:21, 22 May 2008 (UTC)
The entropy of the DJIA doesn't actually matter. A single digit change is enough to toss you completely to the other end of the graticule, since the MD5 would be *completely different*. Zigdon 23:49, 22 May 2008 (UTC)
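
Both points can hold at once, and a rough Python sketch may help show why: a one-cent change moves the point completely, yet the full list of plausible openings is small enough to enumerate. The names offsets and candidate_points are made up for illustration, and the centre and spread used in the example call are invented values, not market data.

```python
import hashlib


def offsets(date_text, dow_open_text):
    """Fractional (lat, lon) offsets from the MD5 of "<date>-<opening>"."""
    digest = hashlib.md5(f"{date_text}-{dow_open_text}".encode("ascii")).hexdigest()
    return int(digest[:16], 16) / 2**64, int(digest[16:], 16) / 2**64


def candidate_points(date_text, centre_cents, spread_cents):
    """Offsets for every opening within +/- spread_cents of centre_cents."""
    points = {}
    for cents in range(centre_cents - spread_cents, centre_cents + spread_cents + 1):
        dow_open = f"{cents // 100}.{cents % 100:02d}"
        points[dow_open] = offsets(date_text, dow_open)
    return points


# Neighbouring prices land in unrelated places...
print(offsets("2005-05-26", "10458.68"))
print(offsets("2005-05-26", "10458.69"))

# ...but a hypothetical +/- 1.00 window is only 201 candidates to precompute.
print(len(candidate_points("2008-05-22", 1282500, 100)))  # 201
```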

Not just for games

This idea interests me but not necessarily just for games. My only problem is the use of the DOW. Say you maintained a group of people who you want to secretly convene at certain times and locations. The times are all set in advance. You don't want to transmit the location itself since it could be intercepted. Replace MD5 with your own convoluted scheme. Transmit the keys to that algorithm online, through numbers stations, whatever. The point is that you can select their location in advance, compute the key, transmit the key, and nobody would be able to figure out what it meant. If they did know it associated to a geolocation, then they would have to crack your algorithm.

You don't need to replace MD5. Just arrange on a password, and concatenate it with the date and number before doing the MD5. Then you can publish the number and even the method, and your location won't be determinable. -- lilac
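
A minimal sketch of lilac's suggestion in Python, assuming the password is simply prepended with a hyphen. The order, the separator and the name secret_offsets are all assumptions for illustration; any fixed arrangement the group agrees on would do.

```python
import hashlib


def secret_offsets(password, date_text, dow_open_text):
    """Fractional (lat, lon) offsets from MD5("<password>-<date>-<opening>")."""
    text = f"{password}-{date_text}-{dow_open_text}"
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return int(digest[:16], 16) / 2**64, int(digest[16:], 16) / 2**64


# Same public date and number, different passwords, unrelated locations.
print(secret_offsets("correct horse", "2008-05-21", "13026.04"))
print(secret_offsets("battery staple", "2008-05-21", "13026.04"))
```

Note that MD5 with a guessable password is not strong protection against a determined attacker; a purpose-built keyed construction such as HMAC would be the more careful choice.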

Planning Ahead

I tried creating a geohash for June 26th, but it seems you are going off of market data, because it says "no market data for 6/26".

Um, yeah, that's rather the point. Did you actually read the cartoon? The reason for using something unknowable in advance is so that one does not know in advance (except at weekends) where each day's location will be.
It's about spontaneous fun adventures, my deary! Just let go, find out where things are on the 25th and make it happen. See you there! SinJax

Map datum?

One thing left unspecified so far is "What map datum should the coordinates be interpreted in?" I suppose that everyone is using the same thing Google Maps uses, by default. Having searched for what it is, I see a lot of people speculating that it is WGS84, though the Google Maps API doesn't say so. --Recursive 18:31, 21 May 2008 (UTC)

It is. (as defined by the recent EPSG standard (that's actually based on MS VE, but Google use the same thing)) --Edgemaster 17:28, 22 May 2008 (UTC)

Europe Time Zones problem

Official Solution

The official solution is the 30W Time Zone Rule, which has been implemented into the map program and will begin affecting coordinates east of -30° longitude from Tuesday, May 27, 2008. The discussion is shown below for archival purposes.

Proposed Time Zone Solution

It seems there are two main proposed solutions. One is to use yesterday's stock openings for regions where the NYSE open is too late in the day, and the other is to use the Nikkei/London for east Asia and Europe respectively. We'll make a decision on that later tonight after reading over all the discussion. If you have an opinion, voice it here. --Xkcd 09:50, 21 May 2008 (UTC)xkcd

Update: Okay, here's my proposed standard: It seems like the simplest, least break-y solution is to simply consider any Dow opening published after noon local time to have happened on the next day. That is, whatever location you calculate at noon using the most recent DOW opening and the current date is the final location for that day.

This is a reasonable interpretation of the original comic, and will only require a small change to the online tool. It means that for everyone in the US, there's no change. Everyone in Europe learns the next day's coordinate the afternoon/evening before. And everyone in east Asia learns it sometime overnight. It also means that for people in Europe, Friday and Saturday will use different openings for their hashes, but Saturday will be known by Friday afternoon. It also, as far as I can tell, means the current calculated location for Saturday doesn't change anywhere in the world.

This would be a problem for places where the local time on one side of the graticule is before noon at the opening and local time on the other side is after noon, but as far as I can tell, all these locations are in the Atlantic Ocean and relatively uninhabited.

Any thoughts? If there are no objections by this evening (EDT) I say we make this the official standard and write it into the reference implementation.--Xkcd 16:16, 23 May 2008 (UTC)Xkcd

Rather than dealing with messy definitions of "local time" (which are a programmer's nightmare), couldn't we specify a longitude, say -30, east of which the previous day's data are used (with apologies to Greenland, of course)? Tim P 16:33, 23 May 2008 (UTC)
That also sounds reasonable. Votes? --Xkcd 16:44, 23 May 2008 (UTC)Xkcd
Yeah, I'm ok with the -30 line. I'll just make sure the tool makes it clear that this rule is applying, and allowing it to be overridden. Zigdon 17:04, 23 May 2008 (UTC)
Pro. This makes the last problem go away, and it doesn't really matter for Greenland anyway, one part of Greenland just knows its location sooner than another part. --FrederikVds 17:19, 23 May 2008 (UTC)
I suppose technically you'd also need to specify an eastern bound. Longitude ±180 would be fine, except then you have to apologize to people in extreme Eastern Russia for not thinking of the International Date Line. I guess those people east of ±180 would just have to use the previous day's coordinates altogether, since they're technically at a longitude where it should be the previous "day". --Tim P 19:29, 23 May 2008 (UTC)
I should add that when the NYSE opens at 09:30 EDT, the mean solar time is 11:30 at that longitude, and it's 12:30 when the NYSE opens at 09:30 EST (November-March), both of which are very close to the "noon" solution of which you speak and much more consistent. Tim P 16:36, 23 May 2008 (UTC)
I don't understand what you're getting at. We don't *want* it to be close to noon anywhere inhabited when the Dow opens, since there will be ambiguity there. --Xkcd 16:44, 23 May 2008 (UTC)Xkcd
I was more so specifying that my -30 longitude solution is roughly equivalent to your "noon" solution, and is actually less ambiguous since it doesn't rely on a shaky definition of "local time." I guess the words just came out wrong. --Tim P 19:01, 23 May 2008 (UTC)
Besides, who's to say what "local time" even is in international waters? --Tim P 19:02, 23 May 2008 (UTC)
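
The arithmetic behind the 11:30 and 12:30 figures can be checked with a few lines of Python, using the usual approximation of one hour of mean solar time per 15 degrees of longitude. The function name mean_solar_time is made up for illustration.

```python
from datetime import datetime, timedelta


def mean_solar_time(utc, longitude):
    """Approximate mean solar time: UTC shifted by one hour per 15 degrees east."""
    return utc + timedelta(hours=longitude / 15)


# 09:30 EDT is 13:30 UTC; 09:30 EST is 14:30 UTC.
summer_open_utc = datetime(2008, 5, 23, 13, 30)
winter_open_utc = datetime(2008, 12, 5, 14, 30)

print(mean_solar_time(summer_open_utc, -30).strftime("%H:%M"))  # 11:30
print(mean_solar_time(winter_open_utc, -30).strftime("%H:%M"))  # 12:30
```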
Good solution. I can see only one problem: the Monday location will be known on Friday, which is a bit early. But it's probably the best solution. --FrederikVds 17:19, 23 May 2008 (UTC)
But say, in Australia, it won't be known until very late on Friday (sometimes after 00:00 Saturday depending on who's in DST). Effectively, such a solution "redefines" the weekend for those east of -30 as Saturday to Monday, which in many regards is just as reasonable as the Friday to Sunday definition west of that longitude. Besides, there are more Monday bank holidays anyway. --Tim P 19:38, 23 May 2008 (UTC)
Wouldn't it be better to have everyone using yesterday's stock exchange instead, so that everyone has the same co-ordinates on the same date? This way we're going to have two sets everywhere. Nicktaylor 20:09, 23 May 2008 (UTC)
But then we wouldn't be backwards-compatible.  :) That isn't much of a sticking point, but still, it's important to some. --Tim P 20:16, 23 May 2008 (UTC)
Surely half the world is going to be non-backwards-compatible anyway? Nicktaylor 20:20, 23 May 2008 (UTC)
I mean non-backwards-compatible with many of the original excursions and the comic (all of which, calculated before this site went live, are west of -30 lon). Besides, the -30 lon rule would effectively just "reassign" our version of the International Date Line (for DJIA purposes) to -30 lon. --Tim P 20:38, 23 May 2008 (UTC)
Personally, flagging a dozen or so early excursions as "old algorithm" seems a small price to avoid having two sets of coordinates for every day. In the long term it will be far less messy.
If we have to have two sets, surely it'd be better to use a different market index for the East, Nikkei is probably best, in order to maintain minimal predictability. Using yesterday's Dow Jones I'll be able to see coordinates at 2.30pm the day before. Nicktaylor 20:47, 23 May 2008 (UTC)
We will not have two sets of coordinates for every day. We will have one coordinate per day per graticule, same as before. Defining openings which are too late in the day to count toward the next day, by effectively moving the international date line, seems like the best solution. It preserves the fact that North Americans learn the times the same day they happen (for fair races, should that become a thing), it preserves the example in the comic and every North American implementation, and it keeps the Saturday meetups the same as before (which is a feature of both proposed systems).--Xkcd 01:46, 24 May 2008 (UTC)
I vote that if we go that route, someone create an "old algorithm" template. That seems like a fairly reasonable solution. I'm still in favor of the -30 lon solution, because our version of the "date line" has to occur somewhere. --Tim P 21:00, 23 May 2008 (UTC)
That sounds like the fairest, neatest and most international solution to me. Using yesterday's data for everyone also means early-rising American geohashers will get the same location as an afternoon geohasher. Nicktaylor 21:04, 23 May 2008 (UTC)
I'm not sure how the -30lon solution (coining a term, I'm sure) gives different results in America depending on when one checks it. Since all of the US (ignoring the Aleutian Islands) is west of -30lon, it would use that day's opening price, which is released at 09:30 Eastern, 06:30 Pacific, etc. So, the East Coasters will have to wait until later in the morning than the rest of the country. So what? That's how the algorithm was originally defined, and spontaneity is part of what this project is about. --Tim P 21:33, 23 May 2008 (UTC)
  • To summarize my endless blather here: The proposed solution which Randall seems most in favor of (and I am not a Randall) is what I think should be called the -30lon solution. That is, all graticules between -30° and +180° longitude will use the previous day's Dow opening with the current date in their hash. Any users who are anomalously "between" 180° longitude and the International Date Line (such as Russia's Kamchatka Peninsula or Kiritimati Island) are cordially asked to use the date that corresponds with their longitude, not their time zone. And of course, apologies to Greenland for splitting them up. --Tim P 02:09, 24 May 2008 (UTC)
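
The rule summarized above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the map program's code: the names dow_date_for and most_recent_opening are made up, a longitude of exactly -30 is treated as west of the line, the start date of the rule and the special case beyond 180° are not modelled, and the only opening price used is the 2008-05-20 figure quoted in the DOW source section.

```python
from datetime import date, timedelta

# Assumption: anything strictly east of this longitude counts as "east of -30".
CUTOFF_LONGITUDE = -30.0


def dow_date_for(meetup, longitude):
    """Date whose (or most recent) Dow opening feeds the hash for this meetup."""
    if longitude > CUTOFF_LONGITUDE:
        return meetup - timedelta(days=1)  # east of the line: previous day's opening
    return meetup                          # west of the line: that day's opening


def most_recent_opening(openings, wanted, max_lookback=10):
    """Walk back from `wanted` to the latest date with a published opening."""
    day = wanted
    for _ in range(max_lookback):
        if day in openings:
            return openings[day]
        day -= timedelta(days=1)
    raise LookupError(f"no opening within {max_lookback} days of {wanted}")


openings = {date(2008, 5, 20): "13026.04"}

# A meetup dated 2008-05-21 near Amsterdam (about 4.9 E) hashes "2008-05-21-13026.04".
amsterdam = most_recent_opening(openings, dow_date_for(date(2008, 5, 21), 4.9))
# A meetup dated 2008-05-20 near San Francisco (about 122.4 W) hashes "2008-05-20-13026.04".
san_francisco = most_recent_opening(openings, dow_date_for(date(2008, 5, 20), -122.4))
print(amsterdam, san_francisco)
```

In both cases the date in the hash is the meetup date itself; only the choice of opening price changes with longitude.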

Time Zone Discussion

Even if different regions were to use different values, users in countries and individual graticules/supergraticules will still define meeting times according to local sensibilities, and sometimes will necessarily use the penultimate value when they need more time. Multiple values doesn't really solve this, and makes definition of local convention more complex (deciding whether to use last closing value is tied in with which index to use, dealing with nationalism/regionalism...). Summing a few market closings (see #DOW_source for discussion of opening vs closing values) leaves local convention to a simple definition of how recent a value to use (most recent, second most recent...) and what time/timezone to meetup. Plus, using a common value is just cooler. —Christian Campbell 19:49, 21 May 2008 (UTC)

9:30 AM EST is 04:30 PM CET. So for us Europeans it's half an hour or even 1½h after meetup times... :(

So what's a sensible Europe policy? If we can come up with something reasonable we can certainly adjust the tool. (The time zone stuff always makes my head spin a little).
Perhaps we should have anyone east of the Atlantic, up to a point, use the previous day's stock opening. Anyone in Europe have any thoughts?
I'm sorry for missing the point, but 9:30 AM EST on Friday is, by my calculation, 13:30 UTC, so with daylight saving taken into account: 15:30 (3:30PM) in Central Europe, on Friday. That would give us more than enough time to organise meetups on Saturday, right? The question is: do we do meetups at 3 PM local time or EST? - DWizzy
I vote for using the Nikkei/FTSE

Indeed, how inconsiderate. We could, of course, just say we use the previous day's coordinates. Another option would be to use the numbers of a local stock market, but that could be a problem here in the Netherlands, where I live. The country is quite small, and the chances of the meeting point being in one of the neighboring countries (especially Germany) are significant. However, if the Germans use their own stock market, they would have a different meeting point, possibly even in the Netherlands. Since it doesn't really matter which stock market we use, nationalistic feeling aside, we could just pick any one of them, or use the sum or some other calculation of various stock markets. -Sparky


Well, we want to pick something as simple and universal as possible. Some kind of EU market sounds reasonable; I hadn't really thought about that. It'd be nice if there were an elegant global solution that just involved a simple change ... if people here can get a consensus, we'll change the tool for handling Europe better. --Xkcd 07:36, 21 May 2008 (UTC)xkcd
Ideally, the global solution would be to use a market as far east as possible, so that everyone has the co-ordinates in time. But this happens at roughly the same time that the Dow closes the previous day, so that might be easier. It would give people in the western hemisphere a longer time to plan their trip than those in the east, but even New Zealand would get the number in time. -- Zephyr
That could, of course, solve the problem, but it's basically the same thing as using the previous day's location if you're east of the Atlantic. - Sparky
Of course; but the pedant in me dislikes the idea of having a different algorithm depending on whether one is in the Americas or not. Actually, I suppose it isn't, if only the javascript interpreted "most recent" correctly :) -- Zephyr
Maybe just enter full time with local timezone as a reference time. I mean in the webapp you enter your local, timezoned time. The algorithm converts it to the stock market timezone and counts the latest opening before that. Geohashers around you are more than likely to use the same timezone, as you, so they will get the same location. No problem. Just enter Saturday, 4PM CEST (or whatever TZ you live in now) and you get *some* *predictable* stock market opening. - Tadeusz
Unfortunately, there's no way to have a real "global" solution when using a local stock market. Some random number source from space would work, but would be boring, and too hard to discover. I like the stock market idea. How about this:
  • Hawaii to Newfoundland (UTC -10 to UTC -3.5) uses the Dow Jones
  • Greenland to India (UTC -3 to UTC +5.5) uses Euronext 100
  • Bangladesh to Micronesia (UTC +6 to UTC -11) uses the Nikkei 225
Calculating the local timezone is an exercise left to the reader... -Lucky
I agree about the no-real-global-solution point and in liking the stock market idea, but I don't see why random numbers from space would be boring. The md5 hash deprives the stock market index of meaning -- the only value is the fact that the number doesn't come into our knowledge until a predictable time in the morning. Maybe we can think of something intrinsically local per location? Like the opening value of your country's currency? —Christian Campbell 09:42, 21 May 2008 (UTC)
Like I pointed out before, that would cause new problems near country boundaries, but it shouldn't be a problem in Europe, since most of it uses the Euro anyway. It might be better than choosing any particular stock exchange in Europe, since it's more neutral. - Sparky
I picked Euronext because it's France, Netherlands, Belgium, Portugal, and the UK (but not LSE). But how about this:
  • The most recent opening of the Dow Jones, Nikkei and Euronext multiplied together. The "when" still needs to be taken into account, so say 10am New York for the Americas, 10am London for Europe/ME/Africa and 10am Tokyo for Asia. -Lucky
Great idea, Lucky; I was actually thinking of something like this as I drifted to sleep last night. This neatly keeps the benefit of the coordinate not coming into being until a certain time, but allows three such times per day; is a global solution; and still leaves weekends with a long period of stability (assuming Nikkei and Euronext are closed on weekends -- I'm not familiar with them). Local meet-up times will still be a matter of local convention, for various definitions of local.
One caveat: I'd say they should be added rather than multiplied. Adding would make it insensitive to different ways of multiplying.
See #DOW_source for discussion of opening vs closing values. —Christian Campbell 19:20, 21 May 2008 (UTC)
Lucky, perhaps I misunderstand your "when" point. I think you need to rely on a time when the value stops fluctuating and can be agreed on (such as closing). So I don't suppose you meant to look at a market during a time while it is open. If you meant to hold off performing the calculation until 10 o'clock with a value from an earlier time, I don't see the point, and anyway people are going to perform the calculation as soon as the inputs are available. Besides, that's half the cool factor: jumping on it. =-) —Christian Campbell 20:08, 21 May 2008 (UTC)
There are currently, I believe, 6 'areas' covering The Netherlands, but we could always decide to rotate fairly between those coordinates. As a bonus, we'd have a backup location when one is unreachable. I'd say, leave that to the locals. -DWizzy

I am in favor of the "use the DOW of the previous day" solution. This makes the tool simpler, and the point distribution around the globe uniform. - Sec

I'm not quite sure what to make of it. For Central Europe (I'm Dutch myself) and everywhere else I can't see why we wouldn't just pick a stock market twelve hours away from the Dow. As for Dutch meetups, I'm all for cycling through the different areas so as to get more people at one meet. Nazgjunk 09:24, 21 May 2008 (UTC)
No, honestly. You want to meet at Saturday, 4PM UTC (or whatever European time zone)? Get the latest Dow opening before Saturday, 4PM your-local-time. Just works.
Problem is that the sample calculator uses this page to grab the data of the Dow Jones. And that one will only show the opening price after the Dow has closed already. At least, according to my clock the Dow Jones is already open, and it still only shows the data of May 20th. - MadJo
Using the value available at the time of the calculation looks like the way to go.

Why not use the Dow value from (meetup_date - 1 year)? That lets you generate a meetup spot up to 364 days in the future (leap year pedants, keep it to yourselves), and time zones become irrelevant.

Because not knowing the meetup time until shortly before it happens is a very important feature, not a bug. Otherwise why use the Dow in the first place? --Xkcd 01:47, 24 May 2008 (UTC)

Is the DOW/timezone thing a problem for Saturday meetups in Europe? Saturday's coordinates are generated from Friday's DOW opening value, which is available at 0930 EDT (1430 BST in the UK) on the Friday. So there's over 24 hours notice for the meetup coordinates in Europe. - Ytaya
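
The clock arithmetic above can be checked with Python's standard library. This sketch uses fixed UTC offsets for the summer of 2008 (EDT = UTC-4, BST = UTC+1, CEST = UTC+2) rather than a time zone database, so it is only valid while all three regions are on daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed summer offsets; these are assumptions that ignore DST changeover dates.
EDT = timezone(timedelta(hours=-4), "EDT")
BST = timezone(timedelta(hours=1), "BST")
CEST = timezone(timedelta(hours=2), "CEST")

# Friday 23 May 2008, 09:30 in New York: the Dow opening.
dow_open = datetime(2008, 5, 23, 9, 30, tzinfo=EDT)

print(dow_open.astimezone(timezone.utc).strftime("%a %H:%M"))  # Fri 13:30
print(dow_open.astimezone(BST).strftime("%a %H:%M"))           # Fri 14:30
print(dow_open.astimezone(CEST).strftime("%a %H:%M"))          # Fri 15:30
```

So the Friday opening is known in the UK and Central Europe on Friday afternoon, more than 24 hours before a Saturday 4 PM meetup.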

Ehm, you have a point here. We seem to have collectively forgotten that the Saturday calculation uses Friday's numbers. - Sparky.
Yeah, that works for Saturdays but what about other days? Maybe the thing with the DOW from the previous day works, most news stations report about that anyway. - Kiwi
I'd suggest only doing the meeting on Saturday, to increase the chances of meeting, possibly even with a smaller grid to reduce travel distances. Most people have to either work or go to school at 3PM on weekdays anyway. - Sparky
If we're only likely to meet people from our own graticule (and possibly neighbouring, watery graticules) then we only need a local convention. I agree with Ytaya, in the UK at least the Friday DOW opening for a Saturday meet 4pm local time or the previous day's opening for a week day seems to retain the elegance of the system. - Nick


I can't believe we are actually arguing about this. It says so in the comic itself: "That date's (or most recent) DOW opening". So, if the DOW hasn't opened yet, you use yesterday's opening. Simple as that. Why complicate it further? Arj 13:33, 21 May 2008 (UTC)

The problem is the specifics around your phrase "hasn't opened yet". Given a region where the figures become available half an hour before the meetup time, then either people take the previous day's value and meet up in a location which is technically incorrect by the time of the meetup, or you rule out anyone who cannot check the value, plan a journey and travel to the location within half an hour. To me, the obvious solution is that the value used has to be defined as the value that was available at a fixed offset from the meetup time - for example six hours before. But that complicates the JavaScript that pulls the figures, as mentioned above... --cfm

That would still leave some ambiguity for the area around the timezones that are 6 and 7 hours before EST, because the time of the meeting depends on which timezone the meeting takes place in. Again, there could be two meetings in different places, 1 hour apart within the same region, or no meeting at all. This could happen on every single day except Saturday, and thus would happen quite frequently. - Sparky

I guess the reason to use stock exchanges is that those numbers may be available to everyone even without the internet. If we don't care about people who don't have internet, it seems most logical to just make the server grab a true random number from somewhere every day or interval and use that. Then we are in control of the situation and can set it up however we want. - forest

Why not use yesterday's DOW opening *everywhere*? This would have the pleasing result that you could have one destination for each midnight-midnight day and there would be no time when it wasn't defined. - Sarah


How about there's a fixed time at which the algorithm is run, for example 10am local time. That way it's standardised as to whether you need to take the day before's or today's (i.e. has it passed 9:30AM EST yet by 10AM in your local time). This also means everyone has at least 6 hours notice before meeting, so we don't get the issue like in the UK where it's available at 2:30pm and I'd struggle to make places in 90 mins given I go by train and bike everywhere. If we stray too much into different stock market figures or different timings for things, I'd fear we'd split the community over which version to go with and people go to different places on the same day, missing each other. I think a standard and an agreement is important above all. --AvengerPenguin 12:52, 22 May 2008 (UTC)

If the coordinates can be calculated further in advance (ideally a week, since most meetings will be weekly and most people wouldn't have time to attend more) people would be better able to schedule meetings. UK train tickets are cheaper when bought in advance. I also second the comment below mine. -- waq (forgot to sign the first time)

I notice this still hasn't been resolved, so I'll suggest a further simple solution. If everyone everywhere just replaces "today's Dow Jones opening" with "yesterday's Dow Jones closing" then everything works fine. 131.111.202.214 15:09, 22 May 2008 (UTC)

Algorithm shortcomings: are we missing the point?

A significant number of people are commenting on "shortcomings" in the algorithm, e.g. some cities are split over multiple graticules, meetup points fall over water, the Dow is not suitable because of time zones, etc.

While from a geek standpoint I admire the pursuit to "improve" the algorithm, I think it's missing the point. My interpretation of Geo Hashing is that it's about a chance to do something out of the ordinary, make our lives a bit weirder, and maybe even meet and communicate with your fellow human beings. The algorithm itself is just a starting point for these activities. Even the "perfect" Geo Hashing algorithm isn't going to make these things happen unless we step outside the front door.

--Lowman 00:00, 23 May 2008 (UTC)
Absolutely - and it's absolutely geekily cool that we're taking a numerical construction (latitude and longitude) and laying it over the physical and human geography, and saying - "no, sorry, we're sticking with the numbers..." Then, there's the fact that I'm doing something here in rural Norway that you're doing in - wherever - Chicago, Portland, London, whatever. So please, stop trying to break this thing. If you want to organize something centred on your city, or optimised for your life, join the Kiwanis. (Not to degrade the Kiwanis - they're cool, they're just not GeoHashing.) AshleyMorton 00:17, 23 May 2008 (UTC)
Lowman is spot on. I completely agree. It's supposed to be random. "Fixing" and "Improving" something like this just detracts from the core purpose: to go with the flow and have fun. Adding constraints just weighs everything down. That said, it's gotta be annoying for those residing on tiny islands. When the coastguard picks you up from an offshore reef on Saturday, tell your rescuers "the internet made me do it". - Somersault 03:24, 23 May 2008 (UTC)
I'll agree here. So what if half the graticule is over water or in another country? Won't that just make it more special when you do finally get to go to one? --Zorg 07:57, 23 May 2008 (UTC)
Another vote of agreement here. For the vast majority of people, if you really want to go to a location, you can travel to the next graticule over, which won't be more than a couple hours away. And even if you can't do that, one in every 20 or 25 times it will be in a decent location. If you want more than that, start geocaching or go on a site where humans have picked the locations. The fun of random is that it doesn't always work. Who wants to meet every week or go to a location every day anyway? It would get old way too fast if it was too easy to do. --Cahlroisse 08:05, 23 May 2008 (UTC)
Second, third, whatever. If it doesn't work - over water, private property - then don't go. No-one's forcing you to. Gormster 15:31, 23 May 2008 (UTC)

Saturday Times

Um, the blog post says the meetups are at 4 and the wiki says they are at 3. Is one time more official?

We were still going back and forth and the wiki hadn't been updated. For now, 4:00 -- closer to dinner :) Xkcd 06:55, 21 May 2008 (UTC)xkcd

What happens if the meeting point and I end up in different time zones? This is a particularly serious problem for those who live near the International Date Line, where it deviates from the 180° meridian. I guess the most natural way would be to define that the meeting time should be 4:00 according to the destination's local time, but as far as I can see, this is not defined anywhere. --62.20.90.195 10:54, 21 May 2008 (UTC)

Well, if you arrived according to your local time, people from either side of the International Date Line would arrive at different times. So it seems sensible for it to be the meeting time according to the destination's local time, as you said. -- 212.219.57.58 11:55, 21 May 2008 (UTC)
That still leaves a problem near the date line: there could be two consecutive Saturday meetings, one on either side of the date line, or there could be no Saturday meeting at all, because one day, the meeting point could be on the side where it is Friday, and the day after, on the other side, where it then is Sunday. The problem is that the date is not clearly defined in a region that is divided by the date line. I doubt it will be a problem in reality though, as the date line doesn't seem to run over land, so any region that it runs through will have one side that is completely in the ocean. If we'd just agree to use the local time at the meeting point as the time reference, the whole date line problem is solved, I think. - Sparky
Um, the IDL deliberately avoids crossing nations internally, and I suspect that there aren't that many corner cases where the differing time zone becomes a problem. 192.43.227.18 14:10, 21 May 2008 (UTC)

Perhaps it would be a good idea to have the meetings at lunchtime (say, 12:30 local time), so that people who have to work on weekdays (or Saturday, for that matter) can still go to meetings during their lunch break if it happens to be close enough to where they work. - Sparky

What about weekday meetup times? Should that also be 1600, since people who work won't have time to attend weekday meetings anyway? If I had to set a meetup time, I'd say 6:30pm, which gives enough time after work to get to the place and isn't too late. --waq

Nearby Points?

I forgot to mention that the algorithm doesn't necessarily find the point closest to your present location. Of course, the user can manually select the other regions around his location to find the closest meeting point, but it would be nice if the Google Earth app would show the 4 closest locations. That way, the chance of meeting different people is greater, because any user would be in 9 regions instead of just one. This could also be useful if the closest point is unreachable, where the user could choose to go to the next closest meeting point. - Sparky

Well, it depends ... generally you'll have one population center surrounded by fairly empty areas. I think we'll be able to better answer this question after a week's experience. --Xkcd 07:46, 21 May 2008 (UTC)xkcd
That, also, depends on where you live. The distance between two closest points horizontally or vertically is approximately 111km. The Netherlands is barely 200km wide. Especially in the western part of the country, which is the most densely populated, cities are rarely more than 15km apart. - Sparky
The thing with having one region is that you get to meet with the same people each time, which could be a good thing - since otherwise, you don't know who you're going to see again. (Though that could be seen as a plus, of course.) My region includes most of London, so in my case I'm probably going to get more people coming to the meet in my region than I would if I went to whichever point was closest on the particular day. - Kira. (74.52.15.98 13:16, 21 May 2008 (UTC))

This is most excellent, I know I'm participating! -Sir_Lewk

Would it be possible to include the route-planning feature in the map? For those without navigation systems, that would be very useful. - Sparky.

I'm from Australia, is there any chance of this being implemented in the Pacific Region as well? It sounds like a great idea. - Zorg

This does work for the Australian Region. I'm in Sydney, as of July I'm getting on board --Phraedus 10:20, 21 May 2008 (UTC)

Two thoughts for improvements:

  • Maybe use a smaller step size? If you do not own a car then the destination can be unreachable. Also in Europe (where I am) the population density is in general higher, so one grid cell might encompass two major cities.
We discussed this quite a bit -- currently, the average driving distance is something like 45-50 miles. But if you're trying to have an xkcd meetup for your city, it'd only compound the problem of cities being split up. Half your friends would go to one and half to another. It's already bad enough in DC and Chicago. --Xkcd 09:12, 21 May 2008 (UTC)xkcd
This, of course, strongly depends on the population density (or actually, the XKCD reader density). If you make the regions smaller, the chance of 2 or more people meeting there decreases. As people fail to meet, they stop trying, further reducing the chance of a successful meeting for those who try at a later date. - Sparky.
And L.A., Pittsburgh, Philadelphia, Denver... - Cahlroisse
  • 45-50 miles is too much for nonprofessional cyclists, are we encouraging petrol burning? Too bad for the ecosystem :P 80.36.82.38 10:01, 21 May 2008 (UTC)
I agree with your point about encouraging petrol burning, however, at some (maybe most) places, such a large grid size is needed to have any significantly non-zero chance of meeting somebody. The problem seems to be that there is no clear one-size-fits-all grid size. Implementing a variable grid size sort of eliminates the elegance of the algorithm. Maybe this is just a stage we have to go through; you come up with a simple idea, which grows ever more complicated as it's developed, until you end up with a simple solution - Sparky.
  • I'd propose a cell size proportional to the population density. I agree it undermines elegance and simplicity, but sometimes (most of the times?) simpler solutions are just not satisfactory. Besides, seeing the cell sizes changing along with the local population density has a coolness of its own. --80.36.82.38 14:35, 21 May 2008 (UTC)
The ecological implications of petrol burning aside, I just don't have a car ;-) But I agree with you. To allow actual meetings to happen, you need a big grid size, especially in the beginning. -Mucki
I don't have a car either, because I far prefer a motorcycle. I don't really see any options that would enable the use of public transportation without sacrificing too many other properties of the proposed system. I'm afraid the only thing you can do is wait until the meeting point is within a reachable distance. If you can travel 20km in any direction (by bicycle, for example), about 9.8% of the meeting points will be within your range. - Sparky
Perhaps the solution should include people developing their own cell sizes on the city pages to allow for better distribution based on population. For example I can see that there might well be demand enough for both a San Francisco city based cell (based entirely within the city limits) and a larger cell that would include the surrounding areas. At present the cell for SF includes the peninsula far more than it includes the city and has a rather large amount of water as well. Since we have such a high degree of population density, however, it would likely not be a problem to cause such splits. Clearly the issue that's really raised here is that a one-size-fits-all approach based on pure geography without regard for population is not necessarily the best one. - Belgand
  • I think the algorithm should include the base coordinates (the integer portion of your gps location, or maybe even some secret number?) in the hash, otherwise the meetup points form a regular grid, offset by a fixed value from the base coordinates. - Mucki
I don't really see why that's a problem. In fact, it has a couple benefits -- if one coord is all the way to the north of a graticule and you're at the south end, it means the coord in the graticule to the south will be close to you. --Xkcd 09:12, 21 May 2008 (UTC)xkcd
I'd agree; I don't see what's wrong with a regular grid of possible meeting locations, while the upside is that the nearest location is at most √2 * 111 / 2 ≈ 78.5km away (half the diagonal of a 111 * 111 square), assuming it is reachable, of course. - Sparky.
I just thought I should add something: this is, of course, only true on the equator, because the distance between the meridians reduces as one moves away from the equator. - Sparky
Well, it is not really a problem in itself, but I find that a very regular spacing of the points somehow diminishes the beauty it gets from being otherwise completely random and unpredictable. -Mucki
I would agree with that, however, this does guarantee that there is a meeting point relatively close to you (regardless of reachability).
Except if you live on the Greenwich meridian, and then occasionally the shortest distance will be an entire box-width west or east. And if you live in the N-S centre of a graticule, this maxes out as the diagonal of a 111 * 111/2 rectangle, or 124km. (sucks to live at 0,0... in the ocean south of Ghana...) In the UK, it's more like 67 * 111/2, an 87km diagonal.
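For anyone who wants to check the distance figures in this thread, here is a rough sketch in Python. It assumes a flat-earth approximation with 111km per degree of latitude and a graticule width that shrinks with the cosine of the latitude; the function names are made up for this example and are not part of any reference implementation.

```python
import math

KM_PER_DEGREE = 111.0  # approximate length of one degree of latitude (assumption)


def graticule_size_km(lat_deg):
    """Return (east-west width, north-south height) of a 1x1 degree graticule in km."""
    width = KM_PER_DEGREE * math.cos(math.radians(lat_deg))
    return width, KM_PER_DEGREE


def worst_case_nearest_km(lat_deg):
    """Half the graticule diagonal: the farthest the nearest point can be
    when the points form a regular grid."""
    width, height = graticule_size_km(lat_deg)
    return math.hypot(width, height) / 2


def worst_case_on_meridian_km(lat_deg):
    """Worst case on the prime meridian, where the points mirror each other:
    a full box width east or west plus half a box height north or south."""
    width, height = graticule_size_km(lat_deg)
    return math.hypot(width, height / 2)


def reachable_fraction(radius_km, lat_deg):
    """Share of days on which one graticule's point lands within radius_km of you,
    assuming the whole circle fits inside the regular grid spacing."""
    width, height = graticule_size_km(lat_deg)
    return math.pi * radius_km ** 2 / (width * height)


print(round(worst_case_nearest_km(0), 1))      # half-diagonal at the equator
print(round(worst_case_on_meridian_km(0)))     # meridian case at the equator
print(round(worst_case_on_meridian_km(53)))    # meridian case at roughly UK latitude
print(round(reachable_fraction(20, 0) * 100))  # percent of points within 20km
```

Under these assumptions the equator figures come out at about 78.5km and 124km, the UK meridian case at about 87km, and the 20km reach at roughly 10% of points, which is close to the figure quoted above. Away from the equator the graticules get narrower, so the reachable share goes up.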

Oddly, where I live (Boreham, Chelmsford, Essex, UK), apparently whatever day it is, I should be meeting in the exact same place in the Thames estuary (I've tried about 5 different dates). Is there something amiss? 91.107.26.104

It seems to be working fine. Different coordinate for every day this past week in the 51, 0 (east) graticule that contains Chelmsford. Any more details on the problem? --Xkcd 09:24, 21 May 2008 (UTC)xkcd

Hi there, about one third of my sector is covered with water. Are you going to do any sea/ocean detection?

Could I suggest adjusting the main wiki page to say if your meetup is covered by water, click on the next closest square away from the water? Might solve some problems --Phraedus 10:20, 21 May 2008 (UTC)

This may have been covered already, but I'm confused about how the center of mass calculation is working. No doubt a cost function of some kind, but the question is why land mass isn't taken into account, or so it would seem! For example: where I am (Southampton, Hampshire, UK) the center of mass seems to be somewhere in the ocean. Now I realise this is probably due to the Isle of Wight right below Southampton. Possible solution: if the area covered by the center of mass holds more than... I don't know... 40% water, adjust some variable to force a reduction in granularity. Mainly because the geohashes for the past 4 days have been in the middle of the sea, and the reason is clearly just the odds given the amount of water in Southampton's center of mass. Also, I've got a Python implementation of this app going, but I don't have access to the centers of mass, though this might be me just being silly. Are these available somewhere in a handy dandy XML format? Cheers - SinJax

The squares are the zones bounded by latitude and longitude lines, nothing more. This means the configurations are sub-optimal for some cities, and there will be various conventions for dealing with those.
Oh I see! Talk about overcomplicating things in my mind :). And the lat/long for each square is the center of the square, I assume? Or what? Also, is it cool if I use the Dow data available at http://irc.peeron.com/xkcd/map/data/ for my Python implementation directly? Or should I contact another web service somewhere? SinJax
The boundaries of the zones run along the exact longitude and latitude lines closest to your location. - Kira. (74.52.15.98 13:20, 21 May 2008 (UTC))

I live nearly on top of a confluence (northeast of Indianapolis). Would you say that I could choose one of four grid areas, so that I don't have to travel too far? Halcyone1024

Pittsburgh is annoyingly (or beneficially) split down the middle (the very middle: 40.441419, -79.977292)

People here have a point about population density. In some places in the world everyone has a car, but in other places a lot of people don't. I almost think that separate areas should have entirely separate rules or something. But again it all depends on how complicated it should be. It could be super simple like it is now and work okish, or be super complicated and maybe work better or maybe not. Personally I currently live in the middle of nowhere so there aren't enough people to make it work, even if we were willing to drive quite a ways and use large graticules. I have also lived in other places where car ownership is low and for most people on most days the meeting point would be out of reach. - forest

Polar coordinates?

I've played a bit with the map, and found out that my town, Freiburg, is right at the edge of a grid (48°000), and that the grid is fairly large. An alternative implementation could use city centers as fixed points and calculate polar coordinates (distance and angle) from the "random" seed. -- tillwe 132.230.104.57 10:02, 21 May 2008 (UTC)
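To make the polar idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name, the 30km maximum radius, the way the two hash fractions are mapped to distance and bearing, and the flat-earth conversion at 111km per degree. It is not part of the reference algorithm.

```python
import math

KM_PER_DEGREE = 111.0  # approximate length of one degree of latitude (assumption)


def polar_meetup(center_lat, center_lon, frac_a, frac_b, max_radius_km=30.0):
    """Turn two fractions in [0, 1) into a point around a city centre.

    frac_a picks the distance and frac_b picks the bearing (clockwise from north).
    max_radius_km is a hypothetical per-city setting, not something from the thread.
    """
    distance_km = frac_a * max_radius_km
    bearing = frac_b * 2 * math.pi
    # Convert the km offsets to degrees; longitude degrees shrink with latitude.
    dlat = distance_km * math.cos(bearing) / KM_PER_DEGREE
    dlon = distance_km * math.sin(bearing) / (
        KM_PER_DEGREE * math.cos(math.radians(center_lat))
    )
    return center_lat + dlat, center_lon + dlon


# Approximate centre of Freiburg, with fractions chosen by hand: 15km due north.
print(polar_meetup(48.0, 7.85, 0.5, 0.0))
```

One thing to note about this design: scaling the distance linearly with the fraction clusters the points near the city centre. Using the square root of frac_a instead would spread them evenly over the circle, if that is preferred.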

I think this issue is deeper, mate. It would seem that the grid implementation is remarkably unsatisfactory when it comes to places in Europe and especially England; however, a random distance from city centers wouldn't work very well in sparse areas like America. Some concept of centre of population is required here. SinJax

The problem with living near an edge is non-existent, I think. The maximum distance to the closest point is always half the length of the diagonal of a grid cell. - Sparky

I think we should try to preserve the simplicity and elegance of the algorithm, which, in its basic form, relies on only 3 pieces of information: the date, the random seed (stock exchange), and the integral part of your location. Adding information like databases of population density would completely ruin it, if you ask me. You might as well include information about computer use and the percentage of the population that can read English, as these are things that strongly influence the XKCD reader density. - Sparky

Maybe we could use a smaller grid size, and have a convention that in more sparsely populated areas, only the even numbered ones are used, or use the ones closest to city centers or something. We don't have to deal with all the cases in the calculation algorithm; we could have the user execute part of the "algorithm". However, the idea is that the user can't really choose where to go. If the user has any control, it should be limited (at the moment, the user can of course choose any of the ~40000 reachable points on the planet) - Sparky

Good call that sounds workable, fixes the problem in southampton anyways :) SinJax

Python-CGI-JSON-o-Rama implementation

Hey so i've made a python version of the reference implementation that returns a JSON parseable object containing the lat/long when given a date, xloc and yloc. It uses the XKCD cache of dow values. I'll put it online in the next 5 mins :D

Here it is, fairly simple actually, hope someone finds it useful :)

Update: I've tried to make it classify the region recommended for a particular day using the Euclidean distance of the colour of the image provided by Google Maps

The code is on the link above, tell me what you think :)


SinJax

Cool! I was thinking of making it act as a web service, but for reference implementation it's more important to be simple than useful :) Zigdon 14:53, 21 May 2008 (UTC)

I mostly cannibalised yours anyway! :) But yeah, here you go. The web service version returning AJAX from Python can be found: - AJAX geohashing - Example using the web service

The implementation can also tell you whether a particular area is water, park/forest or other. It does this by checking the colour on Google Maps and doing a thresholded Euclidean distance check against the expected colours for those things. If it's water, it also tells you it's probably not accessible. I want to try to add more features like this, such as rating locations based on food and letting you say how far you're willing to travel.
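The thresholded colour check described above could look something like the following Python sketch. This is not the code from the web service: the reference RGB values, the threshold and the function name are placeholders chosen for illustration, and real map tile colours would have to be sampled from actual images.

```python
import math

# Hypothetical reference colours (RGB); the real map colours would need sampling.
REFERENCE_COLOURS = {
    "water": (153, 179, 204),
    "park/forest": (202, 223, 170),
}


def classify_pixel(rgb, threshold=30.0):
    """Return the label of the closest reference colour within the threshold,
    or "other" when no reference colour is close enough."""
    best_label, best_dist = "other", threshold
    for label, ref in REFERENCE_COLOURS.items():
        dist = math.dist(rgb, ref)  # Euclidean distance in RGB space (Python 3.8+)
        if dist <= best_dist:
            best_label, best_dist = label, dist
    return best_label


print(classify_pixel((150, 180, 200)))  # near the assumed water colour
print(classify_pixel((255, 255, 255)))  # far from both reference colours
```

The threshold is what keeps roads, buildings and labels from being classified as the nearest of the known colours; anything that is not close to a reference colour simply falls through to "other".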

Thanks to User:Sigmund Fraud from the IRC channel for the space on his server with python installed - SinJax


I've added my own python implementation to the main page (only 5 lines) -- lilac

_very_ nice, I'm in awe! Though it must be said my focus was usability as a web service and the implementation of that imaging stuff. Still. Very nice stuff. I love Python :) - SinJax

Added a shell script implementation to the main page (slightly shorter than the Python version; let the flames begin) -- gnomon

deeply awesome; no flames from me -- lilac
${var:x:y} slicing isn't portable, so you should really call it a "bash script" and change the #!/bin/sh. I came up with nearly the same thing:
(echo 16i; echo -n "$(date +%F)-$(GET xrl.us/djiopen | col)" | md5 | sed 's/.\{16\}/0.&p/g' | tr a-f A-F) | dc
On Linux, it's md5sum, which prints a "-" for stdin, but dc happily eats that. Your web interface for getting the Dow data is much better/more flexible than that ganky little shortcut url to Y! Finance (they have historical data in .csv too, but at a different interface so I didn't include it), but consider using sed instead. You only have to fire up one dc at the end of the pipeline which is much cleaner IMO.
To make it print out my full co-ordinates, I created a graticule file:
echo -e '42\n-71' > graticule
and appended
| paste -d'\0' graticule -
to the pipeline. If you wanted to put it in a script file as you have and pass them as args, you could (untested):
| while read f; do echo $1$f; shift; done
I'll stop here before I get any farther offtopic :) --Decklin
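For comparison with the shell pipeline above, here is roughly the same calculation as a Python sketch. It assumes the usual recipe of hashing "date-dow" with md5 and reading each 16-digit half as a hexadecimal fraction; the function names are invented for this example, and the Dow value is passed in as a string rather than fetched from anywhere.

```python
import hashlib


def geohash_fractions(date_str, dow_open):
    """Hash "YYYY-MM-DD-dow" with md5 and read each half as a hex fraction in [0, 1)."""
    digest = hashlib.md5(f"{date_str}-{dow_open}".encode("ascii")).hexdigest()
    lat_frac = int(digest[:16], 16) / 16 ** 16
    lon_frac = int(digest[16:], 16) / 16 ** 16
    return lat_frac, lon_frac


def geohash_point(graticule_lat, graticule_lon, date_str, dow_open):
    """Append the fractions to the integer graticule, like the paste step above.

    Graticules are passed as strings so that "-0" keeps its sign.
    """
    def attach(graticule, frac):
        sign = -1 if graticule.startswith("-") else 1
        return sign * (abs(int(graticule)) + frac)

    lat_frac, lon_frac = geohash_fractions(date_str, dow_open)
    return attach(graticule_lat, lat_frac), attach(graticule_lon, lon_frac)


# The example date and Dow opening from the comic, in the 37, -122 graticule.
print(geohash_point("37", "-122", "2005-05-26", "10458.68"))
```

Passing the Dow opening as a string avoids any float formatting surprises (a trailing zero would change the hash), which is the same reason the shell version works on the raw text.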