Talk:Known Issues

From Geohashing
Revision as of 08:03, 31 July 2008 by imported>Sparr (Origin Bias)

Origin Bias

There is no significant bias towards the origin. The claim that there is one comes from a naive misreading of the algorithm. Consider a simplified MD5 algorithm that produces a 16-bit (4 hex digit) hash, ranging from 0000 to FFFF. Split this in half so that each coordinate is two digits, from 00 to FF. To make the mistake, one would convert that number to a decimal integer and then prepend a decimal point, giving 0.0 0.1 0.2 ... 0.9 0.10 0.11 0.12 ... 0.98 0.99 0.100 0.101 ... 0.254 0.255. This yields a distribution that is biased by 50% towards the bottom quarter of the graticule, and by 4% towards the 1/10th divisions of the graticule. This is not the correct implementation of the algorithm.

What actually happens, as illustrated on the algorithm page, is that the hexadecimal number is, in effect, converted to an integer that counts 1/256ths (in our small example; in the real algorithm it counts 1/18446744073709551616ths, i.e. 1/2^64ths), yielding an even distribution of .0000 .0039 .0078 ... .9883 .9922 .9961.

The final point worth making is my use of 'significant' at the beginning. For a given graticule there is actually an average bias of 1/36893488147419103232th (1/2^65) of one degree towards the origin, because there is never a result at the 'top' edge of the graticule (256/256ths in our example); such a point would be 0/256ths in the next graticule. Sparr 08:03, 31 July 2008 (UTC)
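
For anyone who wants to check the two readings numerically, here is a minimal Python sketch of the 8-bit toy example above. It is not the reference implementation; the function names (`naive_fraction`, `correct_fraction`, `mean_bias`) are made up for illustration, and exact fractions are used so that no floating-point rounding enters the comparison.

```python
from fractions import Fraction


def naive_fraction(value: int) -> Fraction:
    """The mistaken reading: write the integer in decimal and prepend '0.'."""
    return Fraction("0." + str(value))


def correct_fraction(value: int, bits: int = 8) -> Fraction:
    """The intended reading: the integer counts steps of 1/2**bits."""
    return Fraction(value, 2 ** bits)


def mean_bias(bits: int) -> Fraction:
    """How far the mean of k/2**bits (k = 0 .. 2**bits - 1) falls below 1/2."""
    mean = Fraction(2 ** bits - 1, 2 ** (bits + 1))
    return Fraction(1, 2) - mean


values = range(256)  # every possible two-hex-digit coordinate, 00 to FF
quarter = Fraction(1, 4)

# Count how many of the 256 results land in the bottom quarter.
naive_low = sum(1 for v in values if naive_fraction(v) < quarter)
correct_low = sum(1 for v in values if correct_fraction(v) < quarter)

print(naive_low, correct_low)  # 168 64
print(mean_bias(8))            # 1/512
print(mean_bias(64))           # 1/36893488147419103232
```

With the naive reading, 168 of the 256 possible values fall below 0.25, against the 64 an even spread would give. The last line reproduces the residual 1/2^65 average bias for the 64-bit case.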