How to cheat with Frank Benford


Pick a number at random from the universe. Not just from inside your head. Open a page of the Financial Times or look up the size of a planet; convert your height to cubits or measure the weight of your favourite book. Something like that.

Don't actually do it; it's hypothetical. But ask yourself a question. What are the chances of that number starting with a 1? What are the chances of it starting with a 7? What are the chances of it starting with any particular one of the 9 possible starting digits (1, 2, 3, 4, 5, 6, 7, 8 or 9)? We're not counting 0, as in 0.5, because it's not the first significant digit.

Well you're choosing at random so the chance of your number starting with any one of those 9 digits must be 1 in 9. That's about 11%.

The surprising result of Frank Benford's work is that the number you just plucked from the universe is far more likely to start with a 1 (about 30.1%) and very unlikely to start with a 9 (about 4.6%). And there's a sliding scale for the digits in between.

You can test it yourself. Get a copy of the Financial Times. Write down every number you see on the front page. These numbers could be stock quotes, dates, ages, populations, profits. Anything (don't include telephone numbers though, because they are not proper numbers inasmuch as they don't express an amount of anything).

I did it on 6th March 2007. I'm telling you the date so you can fact-check!


These are the numbers I got:

22
2008
1.50
200
11
2
22
3
50
17
20
2
8
12
19
9.6
26
8
8
25
2009
6
5000
19
18.4
9.4
13
6.1
25
26
70
4.2
3.2
700
6
9
20
20
10
16
20
12284.30
2299.78
1342.53
1330.07
3778.21
5932.2
3037.38
4858.85
6904.85
13688.28
23623.00
1.15
1.17
1.29
69
53
65
64
96
7
2.84
14
41
1.481
1.963
755
107.28
210.73
82.50
1.619
675
509
1.325
159.02
95.50
101.50
2.146
97.77
130.34
100.05
100.17
97.20
101.14
3
2.15
4.38
5.62
3.77
4.75
3.99
1.48
4.55
3.33
2.94
2.19
4.36
5.62
15
5
5
10
5
6
5
1
36622
2.20

Now count how many of these numbers start with a 1, a 2, a 3 and so on. Here are the results I got plotted on a graph as percentages. The horizontal line shows my (and hopefully your) initial guess of 1 in 9 or about 11% for all digits:

[Graph: percentage of the Financial Times numbers starting with each digit from 1 to 9, with a horizontal line at about 11%]
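If you'd rather not count by hand, the tally is easy to script. What follows is only a minimal Python sketch, not the method used for the graph: `first_digit` and `tally` are names made up for this illustration, and the sample is just the first ten numbers from the list, typed in as strings.

```python
from collections import Counter

def first_digit(number):
    """Return the first significant digit of a number given as text or a float."""
    # Drop any sign, leading zeros and the decimal point, so "0.5" gives 5.
    stripped = str(number).lstrip("-+0.")
    if not stripped or not stripped[0].isdigit():
        raise ValueError(f"no significant digit in {number!r}")
    return int(stripped[0])

def tally(numbers):
    """Percentage of the numbers that start with each digit from 1 to 9."""
    counts = Counter(first_digit(n) for n in numbers)
    total = len(numbers)
    return {d: 100 * counts[d] / total for d in range(1, 10)}

# Illustration only: the first ten numbers from the list above.
sample = ["22", "2008", "1.50", "200", "11", "2", "22", "3", "50", "17"]
percentages = tally(sample)
for d in range(1, 10):
    print(d, f"{percentages[d]:.0f}%")
```

Ten numbers is far too few to say anything about Benford; feed in the whole list to reproduce the count.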

So it already looks like Benford might be right: 1s are appearing far more often than 9s. Benford's law predicts that if I keep going through the Financial Times, adding more numbers every day, the random fluctuations will be ironed out and the graph will start to look more and more like this:

[Graph: the percentages predicted by Benford's law for each first digit, falling from about 30.1% for 1 to about 4.6% for 9]

Is this just a quirk of the Financial Times? I did the same analysis for the population sizes of all the countries in the world and got this graph:

[Graph: first-digit percentages for the population sizes of the world's countries]

Analysing countries by land area in kilometres squared gives this graph:

[Graph: first-digit percentages for the land areas of the world's countries]

Benford himself analysed various groups of things like heights of buildings and areas of rivers with similar results.

What's going on? The thorough explanation involves something called scale invariance and is a bit complicated. But there's an easy way to think about it...

What if you were picking a raffle ticket from a raffle instead of a number from the universe? What are the chances of the number starting with a 1 in that case? Well it depends on the size of the raffle.

Suppose there are only two tickets in this raffle, numbered 1 and 2. Then the chances of picking a ticket starting with a 1 are 50:50. If there are three tickets in the raffle, the chances are 1 in 3, and so on. With 9 tickets in the raffle, numbered 1 to 9, the chance of you picking the only ticket starting with a 1 is now what we thought it might be intuitively: 1 in 9.

But now add one more ticket to the raffle. This will have "10" printed on it.

Now there are 2 tickets out of 10 that start with a 1, and our chances jump back up to 1 in 5!

And the chances just get better as more tickets are added, up to 19 tickets. Then they go back down again from 20 tickets all the way up to 99. Then the 100s improve matters again, and so on.

So the chance of picking a raffle ticket starting with a 1 fluctuates depending on the size of the raffle.

[Graph: the chance of picking a ticket starting with a 1, plotted against the size of the raffle]
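The rise and fall is easy to check by brute force. Here is a small Python sketch, assuming tickets are simply numbered 1, 2, 3 and so on up to the size of the raffle; the function name `chance_of_leading_one` is invented for this example.

```python
def chance_of_leading_one(raffle_size):
    """Fraction of tickets numbered 1..raffle_size whose number starts with a 1."""
    ones = sum(1 for ticket in range(1, raffle_size + 1) if str(ticket)[0] == "1")
    return ones / raffle_size

# Raffle sizes either side of each turning point.
for size in (2, 3, 9, 10, 19, 20, 99, 100, 199):
    print(size, f"{chance_of_leading_one(size):.3f}")
```

The chance peaks at 19 tickets (11 in 19), sinks back to 1 in 9 at 99 tickets (11 in 99), and starts climbing again at 100.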

But picking a number at random from the universe or from the Financial Times is like picking a ticket from a raffle you don't know the size of. If you don't know the size of the raffle, then to work out the chance of your ticket starting with a 1 you need to average the probability over all possible raffles. That's the horizontal line on this graph:

[Graph: the same chances, with a horizontal line marking the average]

That average turns out to be about 30.1%. There's a formula for it, which goes like this. The probability, P, of a number chosen at random from the universe starting with a particular digit, d, is:

P(d)=\log_{10}\left(1 + \dfrac{1}{d}\right)
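To see the whole sliding scale, you can evaluate the law for every digit. This is a short Python sketch using the standard statement of Benford's law, P(d) = log10(1 + 1/d); the function name `benford` is just a label chosen for this example.

```python
import math

def benford(d):
    """Benford's law: probability that the first significant digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)

for d in range(1, 10):
    print(d, f"{100 * benford(d):.1f}%")

# The nine digits cover every possible first digit, so the probabilities sum to 1.
total = sum(benford(d) for d in range(1, 10))
```

It prints 30.1% for a leading 1 and 4.6% for a leading 9, matching the figures quoted earlier.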

That's Benford's law, and it's used by forensic accountants to detect tax fraud. It can also be used to test the results of academic research and even to look for evidence of election rigging.

So if you're going to cheat, keep Benford's law to hand.
