H.G. Wells once remarked, “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.”
Wells’ observation was rooted in the significance, and the consequent prevalence, that statistics had started to gain in the early 20th century. While some people were enamored of statistical thinking, others looked at its painfully derived correlations with a skeptical eye.
Darrell Huff, in 1954, decided to equip ordinary folks with handy information about statistical pitfalls. ‘How to lie with Statistics’ is Huff’s compass for the layman, meant to help him keep number-wielding charlatans at bay. In Huff’s own words, “The crooks already know these tricks; honest men must learn them in self-defence.”
Almost six decades on from the first printing, and with no revised edition, the examples in the book sound outdated, the illustrations look like the below-par work of an average graphic artist, and even some of the well-intentioned advice sounds banal. Despite these deficiencies, the book seems to have stood the test of time.
Its relevance for anyone who wants an initial peek into the world of statistics to sharpen his critical faculties can’t be overstated. Let’s face it. The flurry of data-laden information coming our way has multiplied since the early ’50s. And, with so much data floating around, statistical chicanery can safely be assumed to have gone up several notches, too.
“How to lie with Statistics” adds no more bewilderment to an already perplexing world of numbers. The author’s advice is simple and easy to comprehend for professionals and greenhorns alike. He avoids the wonkish stuff and instead builds his work around the common traps in the statistical world. Sometimes these traps are deployed intentionally by vested interests; at other times they sneak into the picture inadvertently, doing even more damage.
One of the basic traps in statistics, according to Huff, is built-in sample bias (a sample is biased when it doesn’t adequately represent the population from which it is drawn). The outcome of a sampling study is only as good as the sample it is based on. Drawing a truly random sample is a strenuous task, yet almost everyone claims to have managed it.
Huff asserts that observers need to be highly skeptical each time they look at research data since there is always a bias hidden somewhere.
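To see how a biased sample distorts a result, here is a minimal sketch in Python. The population, the income figures and the cut-off of 70 are all invented for illustration; none of them come from Huff’s book.

```python
import random
from statistics import mean

# Hypothetical population: 100 people with incomes of 20..119 (in thousands).
population = list(range(20, 120))

# Biased sample: the survey only reaches people earning 70 or more,
# for example because it runs through a channel poorer people rarely use.
biased_sample = [income for income in population if income >= 70]

# Random sample: every person has the same chance of being picked.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 30)

population_mean = mean(population)
biased_mean = mean(biased_sample)
random_mean = mean(random_sample)

print(f"population mean:    {population_mean}")
print(f"biased sample mean: {biased_mean}")
print(f"random sample mean: {random_mean:.1f}")
```

The biased sample reports an average of 94.5 against a true figure of 69.5, although no individual number in it is false.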
Sample bias may well be unintentional, but an advertiser’s spiel about an ‘average’ should never be taken at face value. According to Huff, unless the type is specified, the average could be anything: a mean, a median or a mode.
If a business school publicizes an ‘average’ salary after the placement season without ever saying whether the figure is a mean or a median, it’s likely to be hiding something. The median (the central point in a data set) is generally the preferred average when it comes to salaries. Remember, one extraordinarily high figure can pull the mean salary up by a long way, whereas the median gives a much better picture of how the B-school has actually fared in placing its students.
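A quick sketch makes the gap between the two averages concrete. The ten salaries below are hypothetical, chosen so that a single outlier sits among otherwise ordinary offers.

```python
from statistics import mean, median

# Hypothetical placement salaries (in thousands) for a class of ten;
# invented for illustration, not taken from the book.
salaries = [40, 42, 45, 47, 50, 52, 55, 58, 60, 500]

mean_salary = mean(salaries)      # pulled up by the single 500 offer
median_salary = median(salaries)  # midpoint of the sorted list

print(f"mean salary:   {mean_salary}")
print(f"median salary: {median_salary}")
```

Here the mean comes out at 94.9 while the median is 51.0, and nine of the ten graduates earn well below the advertised ‘average’.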
Huff then goes on to describe several other tricks that advertisers and propagandists often use to flummox consumers and sway trends. The deliberate use of small samples to exaggerate outcomes and the intentional deployment of pictorial graphs that bear no resemblance to the real situation are some of the other con techniques that Huff discusses in the book.
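The small-sample trick is easy to check with a little arithmetic. The sketch below uses fair coin tosses as a stand-in for any study with a true rate of 50%; the helper `prob_at_least` and the 80% threshold are illustrative assumptions, not something taken from the book.

```python
from math import comb

def prob_at_least(min_heads: int, tosses: int) -> float:
    """Chance that a fair coin shows at least `min_heads` heads in `tosses` tosses."""
    favourable = sum(comb(tosses, k) for k in range(min_heads, tosses + 1))
    return favourable / 2 ** tosses

p_small = prob_at_least(8, 10)    # "80% or better" in a sample of 10
p_large = prob_at_least(80, 100)  # the same claim in a sample of 100

print(f"80%+ heads in 10 tosses:  {p_small:.4f}")
print(f"80%+ heads in 100 tosses: {p_large:.2e}")
```

With ten tosses, an impressive-looking 80% result turns up by pure chance a little over 5% of the time, so anyone willing to repeat a small trial often enough can expect to stumble on one. With a hundred tosses the same result is vanishingly unlikely.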
In the chapter titled ‘The Gee-Whiz Graph’, Huff illustrates how graphs are sometimes twisted and molded to convey an entirely opposite meaning without falsifying anything. Huff aptly quips later in the book, “There are often many ways of expressing any figure… The method is to choose the one that sounds best for the purpose at hand and trust that few who read it will recognize how imperfectly it reflects the situation.”
Let’s say you have been asked to make a presentation to a group of media newbies and have been instructed to include quarterly trends for the last year. The quarterly spends graph, with its dips in the July-Sept and Jan-Mar quarters, paints a rather grim picture (Graph 1), whereas your objective is to present an optimistic picture to your audience.
So, you put your creative cap on and manipulate the graph without falsifying the data. Instead of using quarterly spends, you plot the same data as cumulative media spends (Graph 2). Which picture looks rosier? You decide!
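The same switch can be sketched in a few lines of Python. The spend figures are hypothetical, picked only so that the July-Sept and Jan-Mar quarters dip as in the example above.

```python
from itertools import accumulate

quarters = ["Apr-Jun", "Jul-Sept", "Oct-Dec", "Jan-Mar"]

# Hypothetical quarterly media spends; the 2nd and 4th quarters dip.
quarterly_spends = [120, 90, 130, 95]

# Running total of the same numbers: it can only go up while spends are positive.
cumulative_spends = list(accumulate(quarterly_spends))

for quarter, spend, total in zip(quarters, quarterly_spends, cumulative_spends):
    print(f"{quarter:<9} quarterly: {spend:>4}  cumulative: {total:>4}")
```

The cumulative series climbs from 120 to 435 without a single dip, even though it is built from exactly the same four numbers.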
In the later chapters, Huff sheds light on the dubious world of percentages and fractions. Some of the gimmicks discussed there are quite commonplace these days, such as using the ‘small base’ effect to highlight a disproportionate gain, quoting extra significant figures to suggest precision, and switching between arithmetic and geometric means to serve the purpose at hand.
At the same time, certain ruses still escape public scrutiny, such as the difference between percentages and percentage points (if your profits on an investment have increased from 3% last year to 6% this year, that’s a jump of 3 percentage points, and if you want to impress someone, you can call it a 100% increase).
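The arithmetic behind the 3% to 6% example, as a short sketch (the variable names are only for illustration):

```python
last_year = 3.0  # profit rate last year, in percent
this_year = 6.0  # profit rate this year, in percent

# Change in percentage points: a plain difference of the two rates.
point_change = this_year - last_year

# Relative change in percent: the difference measured against the old rate.
relative_change = (this_year - last_year) / last_year * 100

print(f"{point_change} percentage points")
print(f"{relative_change}% increase")
```

Both figures describe the same change; only the base of comparison differs.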
In the last chapter of the book, “How to talk back to a Statistic”, Huff proposes five questions to test the validity of a given statistic:
- Who says so? Look for conscious and unconscious bias in the data. A magazine may bring up controversial research on changing sexual preferences to help jack up its tepid sales.
- How does he know? Is the selected sample large enough to permit any sort of reliable conclusion?
- What’s missing? Look out for pieces of information that are absent from the data and that, when provided, would make the given conclusions look lacklustre.
- Did somebody change the subject? One thing is all too often reported as another.
- Does it make sense? Always watch out for those significant numbers, confusing averages (mean, median or mode?), mindless extrapolations and unproven assumptions.
In short, switch on your critical faculties before you accept any research-based conclusion published in your favorite magazine or paper.
‘How to lie with Statistics’ is an old-fashioned, honest manual (despite its name) on statistical artifices that has withstood the knocks of time and is still as pertinent as it was 60 years ago.
The fact that it’s a recommended book in some undergraduate statistics courses around the world is in itself a grand testimony to its relevance. I strongly recommend this book to those who have a tangential knowledge of statistics and to those who work with data day in and day out and feel overwhelmed by all the graphs, trends and averages.
© BookJelly. All rights reserved