Benford’s Law and Election Data
Why do first digits of votes diverge from Benford’s distribution?
--
There are claims of statistical “proof” of election fraud in the United States. These claims often rest on the leading digits of vote counts, whose distribution does not match the Newcomb-Benford distribution.
Does this failure to match imply “fraud” or “manipulation”? No.
Benford’s Law is not universal: a data set needs certain properties to follow it. Electoral counts do not have these properties, so we should not expect conformity.
What is Benford’s Law?
Numbers can be manipulated. To detect anomalies, we need an expected distribution for comparison.
In some data sets, the leading digit 1 appears much more often than the leading digit 9. That is, more numbers in the data set start with a 1 than a 9.
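The Newcomb-Benford distribution puts a number on this: the expected share of values with leading digit d is log10(1 + 1/d). Here is a minimal sketch of that formula; the function name `benford_expected` is an illustrative choice, not part of any library.

```python
import math

def benford_expected(digit: int) -> float:
    """Expected share of numbers whose leading digit is `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)

# Expected share for each possible leading digit.
expected = {d: benford_expected(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.1%}")
```

Under this formula about 30.1% of values start with a 1 and about 4.6% start with a 9, and the nine shares sum to 1.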
The astronomer Simon Newcomb first noticed this ‘law’ in 1881, while using logarithmic tables:
That the ten digits do not occur with equal frequency must be evident to any one making much use of logarithmic tables, and noticing how much faster the first pages wear out than the last ones.
Frank Benford, a physicist, observed the same pattern in 1938. Benford showed that a diverse range of data sets approximates this predicted distribution.
When does Benford’s Law apply?
Despite being a ‘law’, it is not universal. It is an observation about some types of data sets. William Goodman restated some guidelines for when a data set can be expected to conform to Benford’s Law:
- A large sample: In a small collection of numbers, chance deviations from the expected shares look large.
- A wide span of numerical values: The sample should include values across many orders of magnitude.
- Right-skewed distributions: Conforming sets often originate from multiplying or combining quantities.
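The span guideline can be illustrated with synthetic numbers. This is a minimal sketch, not an analysis of real returns. Both samples are made up for illustration: `wide` is spread evenly on a log scale over six orders of magnitude, and `narrow` is a hypothetical stand-in for counts that all fall within a single order of magnitude. The helper names are assumptions too.

```python
import math
import random
from collections import Counter

def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])

def first_digit_shares(values):
    """Share of values starting with each digit 1 to 9."""
    counts = Counter(leading_digit(v) for v in values if v > 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic sample spanning six orders of magnitude (1 up to 1,000,000).
wide = [int(10 ** rng.uniform(0, 6)) for _ in range(20_000)]

# Synthetic sample confined to one order of magnitude (300 to 900).
narrow = [rng.randint(300, 900) for _ in range(20_000)]

wide_shares = first_digit_shares(wide)
narrow_shares = first_digit_shares(narrow)

for d in range(1, 10):
    benford = math.log10(1 + 1 / d)
    print(f"{d}: Benford {benford:.3f}  wide {wide_shares[d]:.3f}  narrow {narrow_shares[d]:.3f}")
```

No value in the narrow sample can start with a 1 or a 2, so its first-digit shares cannot follow the Benford curve, even though nothing has been manipulated. The wide sample stays close to the curve.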