[ExI] Anders on io9

Anders Sandberg anders at aleph.se
Fri Feb 7 21:38:04 UTC 2014


I like a smart audience; you have more or less deduced what I found.
Generic numbers follow the Benford distribution for their leading digits (with a related distribution for subsequent digits): the probability of a leading digit x is log10(1+1/x). Why this is so is a rather deep pure math thing (scale and base invariance), well worth pondering: http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1041&context=rgp_rsr But in any case, this means that if you replace a digit with a uniformly distributed digit, the expectation increases (btw, Benford's law has a very cute expectation - try deriving it in an arbitrary base b). The same is true for transpositions (but the effect is weaker). Deletions and additions work differently, but if they are equally likely, the x10 effect of additions dominates the x0.1 effect of deletions.
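To make the leading-digit argument concrete, here is a minimal Python sketch that tabulates log_b(1+1/x) and compares the Benford expectation with that of a uniformly chosen nonzero digit. The function names are my own, purely illustrative, and it only checks the numbers; it does not do the closed-form derivation for you.

```python
import math

def benford_pmf(base=10):
    """Leading-digit probabilities P(x) = log_base(1 + 1/x) for x = 1..base-1."""
    return {x: math.log(1 + 1 / x, base) for x in range(1, base)}

def expected_leading_digit(base=10):
    """Mean leading digit under Benford's law in the given base."""
    return sum(x * p for x, p in benford_pmf(base).items())

def uniform_leading_digit_mean(base=10):
    """Mean of a digit drawn uniformly from 1..base-1 (a leading digit is never 0)."""
    return base / 2

if __name__ == "__main__":
    for x, p in benford_pmf(10).items():
        print(f"P(leading digit = {x}) = {p:.4f}")
    print("Benford mean:", round(expected_leading_digit(10), 4))
    print("Uniform mean:", uniform_leading_digit_mean(10))
```

In base 10 the Benford mean comes out at about 3.44 against 5 for a uniform nonzero digit, which is why overwriting the leading digit with a uniform one pushes the expectation up.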
In practice much depends on which typos are likely, which errors are detected directly, and of course what kind of dataset you use. Spike, try running some real world data like http://www.macs.hw.ac.uk/~mcneil/data.html (perhaps turned into integers) and compare to the random numbers. So the finding of bias is not true for every error, error distribution, or dataset. It is just pretty likely.
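For anyone who wants to play with this, a rough simulation sketch in Python follows. It is not the code behind my numbers: the typo models, the function names, and the choice of log-uniform integers as stand-in "generic" data are all assumptions made for illustration. Swap in a real dataset for `benford_integers` to see how much the result depends on the data.

```python
import random

DIGITS = "0123456789"

def benford_integers(n, lo_exp=2, hi_exp=6, rng=random):
    """Log-uniform integers (assumed stand-in for 'generic' numbers).
    Log-uniform values over whole decades have Benford leading digits."""
    return [int(10 ** rng.uniform(lo_exp, hi_exp)) for _ in range(n)]

def substitute(s, rng):
    """Replace one random digit with a uniform digit (no leading zero)."""
    i = rng.randrange(len(s))
    pool = DIGITS[1:] if i == 0 else DIGITS
    return s[:i] + rng.choice(pool) + s[i + 1:]

def transpose(s, rng):
    """Swap two adjacent digits; a leading zero may result and is dropped by int()."""
    i = rng.randrange(len(s) - 1)
    return s[:i] + s[i + 1] + s[i] + s[i + 2:]

def insert(s, rng):
    """Insert a uniform random digit at a random position."""
    i = rng.randrange(len(s) + 1)
    return s[:i] + rng.choice(DIGITS) + s[i:]

def delete(s, rng):
    """Delete one random digit (inputs here always have at least 3 digits)."""
    i = rng.randrange(len(s))
    return s[:i] + s[i + 1:]

def mean_shift(values, typo, rng):
    """Average of (corrupted value - true value) when every value gets one typo."""
    return sum(int(typo(str(v), rng)) - v for v in values) / len(values)

if __name__ == "__main__":
    rng = random.Random(2014)  # fixed seed so runs are repeatable
    data = benford_integers(100_000, rng=rng)
    for typo in (substitute, transpose, insert, delete):
        print(f"{typo.__name__:>10}: mean shift = {mean_shift(data, typo, rng):.1f}")
```

A positive mean shift means that kind of typo inflates the numbers on average. Since every error here goes undetected and every position is equally likely, this is only one of many possible error distributions.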
Another cute fact: generic non-round numbers found on the Internet have roughly an exponential distribution of the number of digits.

Anders Sandberg, Future of Humanity Institute, Philosophy Faculty of Oxford University