Difference between revisions of "Talk:Known Issues"

From Geohashing
imported>Sparr
(Origin Bias)
imported>Ted
(added historic-pins note.)

Revision as of 20:40, 31 July 2008

Origin Bias

There is no bias towards the origin. The claim comes from a naive misunderstanding of the algorithm. Consider a simplified stand-in for MD5 that produces a 16-bit (4 hex digit) hash, ranging from 0000 to FFFF. Split this in half so that each coordinate is two hex digits, from 00 to FF. To make the mistake, one would convert each two-digit value to a decimal integer (0 to 255), then prepend a decimal point, giving 0.0 0.1 0.2 ... 0.9 0.10 0.11 0.12 ... 0.98 0.99 0.100 0.101 ... 0.254 0.255. This yields a distribution that is biased by 50% towards the bottom quarter of the graticule, and by 4% towards the 1/10th divisions of the graticule. This is not the correct implementation of the algorithm. What actually happens, as illustrated on the algorithm page, is that the hexadecimal number is, in effect, treated as an integer count of 1/256ths (in our small example; 1/18446744073709551616ths in the real algorithm), yielding an even distribution of .0000 .0039 .0078 ... .9883 .9922 .9961. Sparr 07:59, 31 July 2008 (UTC)
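
To make the difference concrete, here is a minimal Python sketch of the two-hex-digit toy example from the comment above. It is not the real geohashing implementation; the function names and the 0.25 cut-off for "bottom quarter" are assumptions chosen for illustration.

```python
def naive_fraction(byte_value: int) -> float:
    """Mistaken conversion: write the integer in decimal and prepend '0.'."""
    return float("0." + str(byte_value))


def correct_fraction(byte_value: int) -> float:
    """Conversion described in the comment: an integer count of 1/256ths."""
    return byte_value / 256


def count_below(values, limit):
    """Count how many values fall strictly below the given limit."""
    return sum(1 for v in values if v < limit)


# Every possible two-hex-digit value, 0x00 through 0xFF.
naive = [naive_fraction(b) for b in range(256)]
correct = [correct_fraction(b) for b in range(256)]

# Assumed cut-off: "bottom quarter" is taken to mean fractions below 0.25.
naive_low = count_below(naive, 0.25)
correct_low = count_below(correct, 0.25)

print(f"naive:   {naive_low} of 256 values below 0.25")
print(f"correct: {correct_low} of 256 values below 0.25")
```

In this toy model, 168 of the 256 naive values land below 0.25, against 64 of 256 for the correct conversion, which is exactly one quarter.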

  • OK, so I lied. There is a very, very tiny bias towards the origin, equal to 1/36893488147419103232 of one degree, because there is never a result at the 'top' of the graticule. In the example above, you can get a point at 0/256 but never at 256/256. This only applies when considering a lone graticule, since a point at 256/256 would simply be at the 'bottom' of the next graticule. Sparr 08:07, 31 July 2008 (UTC)
  • The sample size is still pretty low, but you can check out a visual representation of the distribution at http://www.manyfriends.com/xkcd/index.html?lat=36&long=-122. Btw, feel free to use this URL (altered for your Lat/Long, of course) on your home graticule page. One example is under "notable dates" on the Santa Cruz, California page.
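
The tiny bias figure in Sparr's follow-up can be checked with exact arithmetic. The sketch below assumes that "bias" means the distance between the graticule centre (1/2) and the mean of all possible fractions k/2^bits; that reading is an assumption, though it does reproduce the quoted number. The function name is hypothetical.

```python
from fractions import Fraction


def mean_offset_from_centre(bits: int) -> Fraction:
    """Distance between 1/2 and the mean of k / 2**bits for k = 0 .. 2**bits - 1."""
    n = 2 ** bits
    # The mean of 0/n, 1/n, ..., (n-1)/n is (n - 1) / (2n).
    mean = Fraction(n - 1, 2 * n)
    return Fraction(1, 2) - mean


# Toy example from the discussion: 8 bits per coordinate, checked by brute force.
brute_mean = sum(Fraction(k, 256) for k in range(256)) / 256
toy_offset = mean_offset_from_centre(8)

# Real algorithm as described above: 64 bits per coordinate.
real_offset = mean_offset_from_centre(64)

print(toy_offset)   # offset in the 8-bit toy example
print(real_offset)  # offset with 64 bits per coordinate
```

Under that assumption the offset is half of one step: 1/512 in the toy example and 1/36893488147419103232 with 64 bits per coordinate.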