We see numbers everywhere in our daily lives. Be it the news or the latest COVID-related narrative to get vaccinated or not, we see arguments and narratives driven by numbers. Hidden behind such narratives is the seemingly flaky field of statistics. To be fair and more precise, the field of statistics is built fairly robustly (as most sciences strive to be), but the challenges come from flaky usage of the field. More often than not, it is people who wish to have their way at all costs who take shortcuts by using manipulative statistics. If you are not well-versed in the field, it is very easy to be misled by manipulative statistics.
This article first analyses five ways in which manipulative statistics can mislead us, and then looks at how one can guard against them.
1. Misleading by way of misinterpretation
Misleading by way of misinterpretation is something that most people are familiar with. This is the most common type of manipulation, and it is also the easiest to pull off. In this case, a statistician can take a few numbers from a study and interpret them in a way that suits his narrative. For instance, a statistician can take a study that finds a correlation between an action and a particular outcome, and interpret it as evidence that taking that action will cause the outcome. As an absurd example, let’s say that ice cream sales in a particular city are relatively high on days when traffic accidents are relatively high (which is known as positive correlation in the field). In this case, a manipulative statistician might claim that ice cream causes traffic accidents. In reality, both quantities are probably driven by a third variable such as warm weather: more people are out buying ice cream, and more people are on the roads. Notice how correlation and causality are very different things. Of course, in the real world, manipulative statistics by misinterpretation is not so easy to pick apart. A statistician can also take a study that finds no correlation, and claim that it shows there is no link between the action and the outcome. Either way, the goal is to peddle a convenient narrative that benefits the statistician or his organization in some way.
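The ice cream example can be simulated in a few lines. In this sketch, all the numbers are made up for illustration: a hidden variable (daily temperature) drives both ice cream sales and accident counts, and the two end up strongly correlated even though neither causes the other.

```python
import random

# Made-up simulation: a hidden variable (daily temperature) drives BOTH
# ice cream sales and traffic accidents; neither causes the other.
random.seed(42)
temps = [random.uniform(10, 35) for _ in range(365)]               # degrees C
ice_cream = [50 + 8 * t + random.gauss(0, 20) for t in temps]      # sales/day
accidents = [5 + 0.4 * t + random.gauss(0, 2) for t in temps]      # accidents/day

def pearson(x, y):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

r = pearson(ice_cream, accidents)
print(f"correlation between ice cream sales and accidents: r = {r:.2f}")
```

The correlation comes out strongly positive, yet banning ice cream would obviously do nothing to accident rates; only the weather matters.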
2. Misleading by way of framing
In this case, data can be presented in such a manner that makes them look better or worse than they actually are. For example, let us assume that you have a dataset which shows that 20 percent of people in a country were vaccinated last year and 40 percent of people in that country were vaccinated this year. If the statistician wants to show a dramatic increase in vaccination this year, he or she could present it as “100% more people have been vaccinated this year” (notice that the comparison baseline, the 20 percent figure from last year, is conveniently left out). This could be a ploy used by a vaccine manufacturer to boost its image among its stockholders, for instance.
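The arithmetic behind this framing trick is simple. Here is a quick sketch using the made-up figures from the example (20% last year, 40% this year):

```python
# Made-up figures from the example above (shares of the population).
vaccinated_last_year = 20   # percent
vaccinated_this_year = 40   # percent

# Honest framing: the vaccinated share rose by 20 percentage points.
point_change = vaccinated_this_year - vaccinated_last_year

# Dramatic framing: "100% more people were vaccinated this year!"
relative_change = 100 * (vaccinated_this_year - vaccinated_last_year) / vaccinated_last_year

print(f"+{point_change} percentage points, i.e. {relative_change:.0f}% more than last year")
```

Both statements describe the same data; the manipulative version simply drops the baseline so that the reader cannot judge the absolute scale.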
3. Misleading by way of ignoring context
This type of misleading is best seen when someone tries to make an argument about one number without considering other variables at play. It can also happen when someone uses statistical tools incorrectly. A good example would be failing to consider the control group when comparing vaccines against each other. Another example would be comparing apples with oranges (if you pardon the cliché); in this case, a person might compare the efficacy of two drugs without considering side effects, effectiveness in combination therapy, and so on.
4. Misleading by way of cherry-picking
Only detailing positive aspects while ignoring negative aspects is cherry-picking. It can be done with single numbers or even with whole datasets. For example, a statistician can pick figures showing that the price of consumer goods has increased, without mentioning the increased inflation of the transaction currency over the same period.
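To see why the omitted inflation figure matters, consider a small sketch with made-up numbers: the nominal price of a good rises by 10%, but once an assumed 12% currency inflation over the same period is factored in, the real price has actually fallen.

```python
# Made-up numbers: the nominal price rises by 10%, but the currency
# inflated by 12% (assumed) over the same period.
price_then, price_now = 100.0, 110.0
inflation = 0.12

nominal_change = (price_now - price_then) / price_then    # nominal change
real_price_now = price_now / (1 + inflation)              # deflate today's price
real_change = (real_price_now - price_then) / price_then  # real change

print(f"nominal: {nominal_change:+.1%}, real: {real_change:+.1%}")
```

Quoting only the nominal increase cherry-picks the half of the story that supports the narrative.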
A rather crude, fictitious day-to-day example would be a mom proudly claiming that her child placed second in her class. What she conveniently leaves out is that it was her child’s guitar class, and that there were just two children in it. In reality, her child placed last in a class of two. This is a crude example, but you get the picture.
5. Misleading by way of making up data
Data can also be made up entirely, in which case it is called data fabrication. In fact, there have been cases where researchers have been caught fabricating results. You may wonder why anyone would do such a thing but here’s the deal – researchers get funding based on their research output, and if they do not get many publications in high-ranking journals, they might lose their funding. This puts enormous pressure on researchers to publish often, which leads to fabrication and misrepresentation – unless we have systems that prevent such things from happening!
Conclusion
Long story short: don’t trust statistics, figures or studies you didn’t compile yourself. Question everything, and check whether there are specific interests (be they economic, ideological or other) behind them. It is often nearly impossible to verify the numbers and statistics claimed in news or publications, because you or I simply do not have access to the rest of the data required to check the validity of these claims. Given this, our default mode should be to reject statistical narratives, unless we can see that robust statistical methods were used exhaustively to arrive at the result.
I hope you found this article interesting and useful. If you’d like to get notified when interesting content gets published here, consider subscribing.
Further reading that might interest you: 3 Exploits of The Anchoring Effect