We see Benford's Law in a lot of real-world data sets, with the general derivation being that if the data is spread smoothly across several orders of magnitude on a log scale, then the law holds. However, a few things aren't obvious to me:
- Why is it the case that most real-world data is distributed on a log scale rather than a linear one? What kinds of processes should we be aware of that produce these different kinds of data sets?
- Furthermore, we tend to assume as a prior (especially in machine learning) that normal distributions are over the linear scale. But if we analyze a data set and notice that Benford's Law holds, is it a better prior to assume the data is normally distributed on a log scale instead? Is it the case that people tend to model distributions as normal when they are really log-normal (which, as I understand it, means the same as normal over a log scale)? See the sketch after this list.
- Nassim Nicholas Taleb, in The Black Swan, discusses how people tend to assume a normal distribution when in reality many things follow log-scale or power-law distributions. Is assuming a normal instead of a log-normal distribution a good example of this?
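
To make the log-scale claim in the intro concrete, here is a minimal simulation sketch (Python with NumPy; the wide-sigma log-normal is just my illustrative choice of a distribution that spans several orders of magnitude, not anything canonical). It checks the observed first-digit frequencies against Benford's predicted ones:

```python
import numpy as np

# Benford's predicted first-digit probabilities: P(d) = log10(1 + 1/d)
benford = np.log10(1 + 1 / np.arange(1, 10))

# Illustrative data that is spread over several orders of magnitude
# on a log scale: a log-normal with a wide sigma (assumed parameters).
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=0.0, sigma=3.0, size=100_000)

# Extract the first significant digit of each sample.
first_digits = (samples / 10.0 ** np.floor(np.log10(samples))).astype(int)
observed = np.bincount(first_digits, minlength=10)[1:10] / len(samples)

for d, (obs, pred) in enumerate(zip(observed, benford), start=1):
    print(f"digit {d}: observed {obs:.3f}, Benford {pred:.3f}")
```

With a narrow sigma (data confined to one order of magnitude) the observed frequencies drift away from Benford's, which is the behaviour the derivation above suggests.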
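
And here is the sketch referenced in the second bullet, a rough illustration of the normal vs. log-normal prior question (Python with NumPy/SciPy; the parameters and the 99.9th-percentile comparison are just my choice of diagnostic, not a standard test). Fitting a normal on the linear scale understates the tail of log-normal data, while fitting a normal to the logs recovers it:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
data = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

# Fit a normal directly to the data, and a normal to log(data)
# (the latter is equivalent to a log-normal model on the original scale).
mu_lin, sd_lin = data.mean(), data.std()
mu_log, sd_log = np.log(data).mean(), np.log(data).std()

# Compare each model's 99.9th percentile with the empirical one.
emp_q = np.quantile(data, 0.999)
normal_q = norm.ppf(0.999, loc=mu_lin, scale=sd_lin)
lognormal_q = np.exp(norm.ppf(0.999, loc=mu_log, scale=sd_log))

print(f"empirical 99.9th percentile: {emp_q:.2f}")
print(f"normal fit (linear scale):   {normal_q:.2f}")
print(f"normal fit on logs:          {lognormal_q:.2f}")
```

The linear-scale normal fit misses the heavy right tail by a large margin, which seems to be exactly the kind of mistake the last bullet is asking about.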