Difference between revisions of "Talk:Known Issues"

From Geohashing
imported>Sparr
(Origin Bias)
imported>Sparr
(Origin Bias)
Line 1: Line 1:
 
== Origin Bias ==
 
== Origin Bias ==
  
There is no significant bias towards the origin.  This is a naive mistake involving a misunderstanding of the algorithm.  Consider a simplified MD5 algorithm that produces a 16 bit (4 digit) hash, ranging from 0000 to FFFF.  Split this in half so that each coordinate is two digits, from 00 to FF.  To encounter the mistake, one would convert that number to an integer, then prepend a decimal point, so that we get 0.0 0.1 0.2 ... 0.9 0.10 0.11 0.12 ... 0.98 0.99 0.100 0.101 ... 0.254 0.255.  This yields a distribution that is biased by 50% towards the bottom quarter of the graticule, and by 4% towards the 1/10th divisions of the graticule.  This is not the correct implementation of the algorithm.  What actually happens, as is illustrated on the [[algorithm]] page, is that the hexadecimal number is, in effect, converted to an integer that is a number of 1/256ths (in our small example, actually 1/18446744073709551616ths in the real algorithm), yielding an even distribution of .0000 .0039 .0078 ... .9883 .9922 .9961.  The final point worth making is my use of 'significant' at the beginning...  For a given [[graticule]] there is actually an average bias of 1/36893488147419103232th of one degree towards the origin because there is never a result at the 'top' of the origin, 256/256ths in our example, such a point would be 0/256ths in the next Graticule.  [[User:Sparr|Sparr]] 08:03, 31 July 2008 (UTC)
+
There is no bias towards the origin.  This is a naive mistake involving a misunderstanding of the algorithm.  Consider a simplified MD5 algorithm that produces a 16 bit (4 digit) hash, ranging from 0000 to FFFF.  Split this in half so that each coordinate is two digits, from 00 to FF.  To encounter the mistake, one would convert that number to an integer, then prepend a decimal point, so that we get 0.0 0.1 0.2 ... 0.9 0.10 0.11 0.12 ... 0.98 0.99 0.100 0.101 ... 0.254 0.255.  This yields a distribution that is biased by 50% towards the bottom quarter of the graticule, and by 4% towards the 1/10th divisions of the graticule.  This is not the correct implementation of the algorithm.  What actually happens, as is illustrated on the [[algorithm]] page, is that the hexadecimal number is, in effect, converted to an integer that is a number of 1/256ths (in our small example, actually 1/18446744073709551616ths in the real algorithm), yielding an even distribution of .0000 .0039 .0078 ... .9883 .9922 .9961.  [[User:Sparr|Sparr]] 07:59, 31 July 2008 (UTC)
 +
* OK, so I lied.  There is a very very tiny bias, equal to 1/36893488147419103232th of one degree towards the origin because there is never a result at the 'top' of the origin.  In the example above consider that you can get a point at 0/256ths but not at 256/256ths, because that would be 0/256ths in the next graticule [[User:Sparr|Sparr]] 08:05, 31 July 2008 (UTC)

Revision as of 08:05, 31 July 2008

Origin Bias

There is no bias towards the origin. This is a naive mistake involving a misunderstanding of the algorithm. Consider a simplified MD5 algorithm that produces a 16 bit (4 digit) hash, ranging from 0000 to FFFF. Split this in half so that each coordinate is two digits, from 00 to FF. To encounter the mistake, one would convert that number to an integer, then prepend a decimal point, so that we get 0.0 0.1 0.2 ... 0.9 0.10 0.11 0.12 ... 0.98 0.99 0.100 0.101 ... 0.254 0.255. This yields a distribution that is biased by 50% towards the bottom quarter of the graticule, and by 4% towards the 1/10th divisions of the graticule. This is not the correct implementation of the algorithm. What actually happens, as is illustrated on the algorithm page, is that the hexadecimal number is, in effect, converted to an integer that is a number of 1/256ths (in our small example, actually 1/18446744073709551616ths in the real algorithm), yielding an even distribution of .0000 .0039 .0078 ... .9883 .9922 .9961. Sparr 07:59, 31 July 2008 (UTC)

  • OK, so I lied. There is a very very tiny bias, equal to 1/36893488147419103232th of one degree towards the origin because there is never a result at the 'top' of the origin. In the example above consider that you can get a point at 0/256ths but not at 256/256ths, because that would be 0/256ths in the next graticule Sparr 08:05, 31 July 2008 (UTC)