You find yourself in a corporate boardroom with the leadership team of a retail company, grappling with the findings of a consultant’s study on shoplifting. This company operates 1,000 stores across the country, half of them in urban settings and the other half in rural areas. The consultant’s presentation is projected on the wall, and his conclusion is clear in bold letters: “The branches with the highest theft rate are primarily in rural areas.” The CEO, after a brief moment of stunned silence, takes charge. “Let’s see those hillbillies steal from us then! From now on, we will install additional safety systems in all rural branches. Do we all agree?”
But something feels off. As you sit back, you realize this conclusion might be misleading. While seemingly logical, the consultant’s focus on location overlooks a critical factor that could change everything. You ask him to pull up a new list: the 100 branches with the lowest theft rates. And when the data appears, your suspicion is confirmed. Those stores are also primarily in rural areas. The CEO’s conclusion, based solely on location, is problematic. What’s truly driving the variation in theft rates is not geography but the size of the stores. Rural branches tend to be smaller, so their theft rates are more sensitive to random fluctuations and end up at both the top and the bottom of the ranking.
This perfectly demonstrates the law of small numbers—a concept that most of us struggle to grasp intuitively, yet it plays a powerful role in distorting our understanding of data. Let’s delve into why small samples so often lead us astray.
The Fallacy of Small Numbers: A Deceptive Perception of Patterns
The law of small numbers is one of the most commonly misunderstood concepts in statistics. It describes the tendency of small samples to produce results that do not represent the broader population, and our habit of trusting those results anyway. In other words, when we look at a small number of data points, even random variation can produce patterns or trends that look meaningful. This issue is exacerbated by human nature: we have an innate tendency to seek patterns and explanations, especially when presented with statistical findings.
When the consultant presented the finding that theft rates were higher in rural areas, the board assumed that location was the deciding factor. This was a knee-jerk reaction to a seemingly clear and actionable conclusion. However, the real problem was that each small branch is itself a small sample. In smaller branches, the theft rate can vary significantly because a single instance of shoplifting has a much larger impact in a small store than in a large one. This variability is more pronounced in smaller samples, and when we do not understand this, we risk overestimating the importance of one-off data points.
For example, if one rural store had a theft that amounted to a significant portion of its sales, the theft rate would appear much higher than it truly is. In comparison, a larger urban store, even if it experienced the same theft, would see a much smaller change in its theft rate due to the sheer volume of sales and transactions. These statistical distortions can lead people to draw hasty, faulty conclusions, resulting in bad business decisions, misallocated resources, and even public policy failures.
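The effect of store size on such a ranking can be reproduced with a small simulation. The sketch below is illustrative only: the visit counts, the 5% theft probability, and the function names are assumptions, not figures from the consultant’s study. Every simulated store has the same underlying theft probability; only the number of customer visits differs.

```python
import random

def simulate_theft_rate(rng, visits, theft_prob):
    """Observed theft rate of one store: thefts divided by customer visits."""
    thefts = sum(1 for _ in range(visits) if rng.random() < theft_prob)
    return thefts / visits

def rural_counts_at_extremes(seed=0, theft_prob=0.05, rural_visits=100,
                             urban_visits=2000, stores_per_kind=500, top_n=100):
    """Count rural stores among the top_n highest and top_n lowest theft rates.

    All numbers are hypothetical. Rural stores are assumed to be small
    (few visits) and urban stores large (many visits).
    """
    rng = random.Random(seed)
    stores = []
    for kind, visits in (("rural", rural_visits), ("urban", urban_visits)):
        for _ in range(stores_per_kind):
            stores.append((simulate_theft_rate(rng, visits, theft_prob), kind))

    stores.sort(key=lambda store: store[0])  # ascending by theft rate
    lowest, highest = stores[:top_n], stores[-top_n:]
    rural_high = sum(1 for _, kind in highest if kind == "rural")
    rural_low = sum(1 for _, kind in lowest if kind == "rural")
    return rural_high, rural_low

if __name__ == "__main__":
    high, low = rural_counts_at_extremes()
    print(f"Rural stores among the 100 highest theft rates: {high}")
    print(f"Rural stores among the 100 lowest theft rates:  {low}")
```

Under these assumptions, rural stores should make up the large majority of both lists, even though no store is riskier than any other by construction.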
A Perfect Illustration: The Weight of Employees
To clarify the concept of the law of small numbers, consider the example of measuring the average weight of employees in two stores, one large and one small. The average weight in a large store with 1,000 employees will likely reflect the general population’s average weight. It is unlikely that any single individual would have enough influence on the overall average to drastically alter the result. For instance, if the store hires a particularly heavy person, the other 999 employees will dilute that person’s effect, and the average will remain relatively stable.
Now, consider the small store, which only employs two individuals. The weight of just one employee could dramatically change the average for the entire store. If one employee is significantly heavier or lighter than the other, the average weight will shift, making the store seem either exceptionally heavy or light. This is because the small sample size amplifies the effect of any individual data point.
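The arithmetic behind the two stores can be made concrete. In this minimal sketch the weights (75 kg for a typical employee, 150 kg for one unusually heavy hire) and the function name are assumptions chosen purely for illustration.

```python
from statistics import mean

TYPICAL_KG = 75   # assumed weight of a typical employee
HEAVY_KG = 150    # assumed weight of one unusually heavy hire

def average_with_one_outlier(headcount):
    """Average weight when exactly one of `headcount` employees is the heavy hire."""
    weights = [TYPICAL_KG] * (headcount - 1) + [HEAVY_KG]
    return mean(weights)

large_store = average_with_one_outlier(1000)  # 75.075 kg
small_store = average_with_one_outlier(2)     # 112.5 kg
print(large_store, small_store)
```

The same person moves the large store’s average by less than a tenth of a kilogram and the small store’s average by 37.5 kg.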
This same principle applies to many other situations. For example, in small teams, an outlier can skew performance data. If one employee is exceptionally productive, their performance will inflate the overall average, making it seem like the team is performing better than it is. Conversely, if one employee is underperforming, that person can drag the team’s average down out of all proportion. In both cases, the small sample size distorts how we perceive the team’s effectiveness. Understanding this dynamic helps avoid drawing conclusions based on incomplete or misleading data.
The Impact on Business Decisions
In business, the law of small numbers has far-reaching consequences. When companies make decisions based on small or unrepresentative data, they risk making misguided or ineffective choices. In the case of the shoplifting study, the CEO’s decision to install extra security measures in all rural stores was based on a flawed interpretation of the data. By focusing only on the stores with the highest theft rates, the CEO failed to consider the impact of store size on the theft rate. Smaller stores, especially those in rural areas, tend to experience more dramatic variations in theft rates, which is a statistical quirk and not necessarily a trend to act on.
This misinterpretation could have resulted in unnecessary costs and resource allocation. Instead of investing in security measures for all rural stores, the company could have taken a more nuanced approach by considering store size and focusing efforts on branches that exhibited consistently high theft rates rather than acting on outliers. The same principle applies in many other business scenarios.
Consider a company that evaluates the performance of a new product by analyzing the sales numbers of a few test locations. If those locations are small or not representative of the broader market, the company may mistakenly assume that the product is a huge success or failure. Small data sets are prone to fluctuations, and using them as the basis for high-stakes business decisions is risky. Companies should aim to use larger and more representative data samples for better accuracy. This would ensure that the insights they gather are meaningful and can be used to guide long-term strategies rather than being driven by outliers.
A Statistical Misstep: Start-ups and the IQ Fallacy
Let’s now apply the law of small numbers to a different area: the hiring practices in start-ups. Imagine reading a headline that reads: “Start-ups Employ Smarter People.” This statement is based on a study that calculated the average IQ of employees in various companies and found that start-ups had higher average IQs. On the surface, this might sound like a powerful argument for the intelligence of those working in new businesses. However, it’s another classic case of the law of small numbers distorting the data.
Start-ups typically have smaller teams, and as a result, their average IQ scores are more likely to fluctuate dramatically. If a start-up happens to hire one highly intelligent person, that person’s IQ will pull the average up, making it seem like the entire company comprises exceptionally smart individuals. Conversely, if a start-up hires someone with a lower IQ, the average will drop just as sharply, because there are few other data points to balance it out.
These fluctuations are less likely to happen in larger companies, such as established corporations with hundreds or thousands of employees. The data from such a large sample is more stable and more accurately reflects the general population. Because each start-up contributes only a handful of employees, the study simply showcases the variability that comes with small sample sizes. A ranking of the companies with the lowest average IQ would most likely be full of start-ups too. In short, while the study might be statistically accurate, its conclusions are misleading and fail to provide any actionable insights about the nature of start-up hiring practices.
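A short simulation shows how far company averages can drift when headcount is small. This is a sketch under stated assumptions: every hire at every company is drawn from the same distribution (mean 100, standard deviation 15), and the company sizes, counts, and function names are hypothetical.

```python
import random
from statistics import fmean

def company_average_iqs(rng, headcount, companies):
    """Average IQ per simulated company; every hire is drawn from N(100, 15)."""
    averages = []
    for _ in range(companies):
        iqs = [rng.gauss(100, 15) for _ in range(headcount)]
        averages.append(fmean(iqs))
    return averages

def spread_by_size(seed=1, companies=200, startup_size=5, corporate_size=1000):
    """Lowest and highest company average for small and large companies."""
    rng = random.Random(seed)
    startups = company_average_iqs(rng, startup_size, companies)
    corporates = company_average_iqs(rng, corporate_size, companies)
    return {
        "startup_min": min(startups),
        "startup_max": max(startups),
        "corporate_min": min(corporates),
        "corporate_max": max(corporates),
    }

if __name__ == "__main__":
    for name, value in spread_by_size().items():
        print(f"{name}: {value:.1f}")
```

Although both groups hire from the same pool, the highest and the lowest company averages should both belong to the simulated start-ups.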
This is a classic example of how small sample sizes can lead to faulty conclusions, which may sound impressive but fail to hold up under scrutiny. When reading studies or news reports based on small samples, it’s important to consider the size and scope of the data before drawing any conclusions. Large corporations, for example, will generally offer a more stable and reliable dataset than start-ups, which are prone to the extremes inherent in small groups.
Applying the Law of Small Numbers to Everyday Life
The law of small numbers is not just a problem in business or research. We encounter it in various aspects of our everyday lives. For example, in social media, a single viral post can create the illusion that an individual or brand is more influential than they are. The success of a viral post can depend on a wide range of factors, including timing, audience engagement, and even luck. However, when people observe these viral successes, they may mistakenly believe that this success is the norm, leading them to misjudge a brand’s or individual’s overall effectiveness or popularity.
Similarly, when people make purchasing decisions based on a handful of online reviews, they might be influenced by the experiences of a few customers who may not represent the broader customer base. This is particularly problematic in niche markets or products with a small following. If a product receives a few overwhelmingly positive reviews, it may seem like a must-have, but these reviews might not indicate the product’s quality for a larger audience.
The same bias can apply to personal experiences. For example, if one person in your social circle has a particularly good experience with a service or product, you might be more likely to trust that product or service, even though their experience could result from random factors. While valuable, personal anecdotes often fail to account for the variability in larger populations. This is why it is important to look beyond individual experiences and consider broader data before making decisions.
The Importance of Large, Representative Samples
To counter the effects of the law of small numbers, it’s essential to rely on large, representative samples. In business, this means using a broad range of data to make decisions rather than focusing on a few data points. For example, when evaluating a product’s success, it’s important to collect feedback from a wide range of customers across different demographics and regions. This helps ensure that the data is representative of the broader market and reduces the risk of being misled by extreme cases.
Larger sample sizes tend to produce more reliable and stable results. As the number of data points increases, the impact of outliers diminishes, allowing us to gauge the overall trend more accurately. This is why large-scale studies, surveys, and experiments are preferred when making decisions that affect a large group of people or resources. In the case of the shoplifting study, an analysis that accounted for store size and drew on more data per branch would have provided a more accurate picture of theft rates, helping the CEO make better-informed decisions.
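The stabilizing effect of sample size can be stated with the standard error of the mean. The worked numbers below reuse the IQ example and assume independent draws with a standard deviation of 15; the two sample sizes are illustrative assumptions.

```latex
% Standard error of the sample mean for n independent observations
% with population standard deviation \sigma:
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}}

% Assumed values: \sigma = 15
n = 5:    \quad \mathrm{SE} = \frac{15}{\sqrt{5}}    \approx 6.7
n = 1000: \quad \mathrm{SE} = \frac{15}{\sqrt{1000}} \approx 0.47
```

Because the error shrinks with the square root of the sample size, cutting it in half requires four times as much data.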
Relying on larger samples when making decisions is also essential for individuals. Whether evaluating the success of a financial strategy, choosing a health plan, or deciding on a product to buy, it’s important to seek out diverse perspectives and large sets of data before jumping to conclusions. By considering the bigger picture, we can avoid the biases that come with small, unreliable samples and make better, more informed decisions.
The Law of Small Numbers in Practice: Avoiding Cognitive Bias
The law of small numbers teaches us an important lesson: our intuitions about patterns and averages are often wrong, especially when dealing with small groups. It’s easy to fall prey to cognitive biases that lead us to misinterpret data and draw faulty conclusions. One bias often at play here is the availability heuristic, which causes us to overemphasize the most readily available information, such as extreme cases or outliers. By recognizing the law of small numbers, we can counter this bias and ensure that we make decisions based on more representative, reliable data.
This awareness can help us become more rational in our decision-making and guard against errors in judgment. Whether we are analyzing business performance, reading research studies, or simply interpreting the world around us, it’s crucial to remember that small sample sizes can distort reality. By seeking out larger, more representative data, we can avoid the pitfalls of small numbers and make better decisions that reflect the true picture of the situation.
Conclusion: Embracing the Law of Large Numbers
The law of small numbers serves as a cautionary tale in a world filled with data. From business decisions to media reports to personal judgment, we are constantly presented with numbers and statistics that may be misleading if not viewed through the proper lens. Small samples can exaggerate trends and lead to misguided conclusions, so taking a step back and evaluating the larger context is crucial. Only by recognizing the inherent randomness in small data sets can we guard against the cognitive biases that lead us to misinterpret the world around us.
Ultimately, the law of small numbers is not just about statistics—it’s about how we think. It teaches us to be skeptical of simple narratives and to seek out the broader picture. By doing so, we can make better, more rational decisions grounded in the reality of large, reliable data.
This article is part of The Art of Thinking Clearly Series based on Rolf Dobelli’s book.