Sunday, February 16, 2025

The Tale of the Golden Curve: The Mystery of the Normal Distribution in Statistics Unfolds

 


Kolkata, A Lazy Afternoon

It was a misty winter afternoon in Kolkata, and Anirban Chatterjee, a statistics professor at Presidency University, sipped his cha (tea) from a clay bhar at College Street. The air smelled of old books and the faint scent of fried telebhaja from a roadside stall.

His student, Riddhiman Dutta, had been pestering him for weeks.

"Sir, you always talk about this ‘Normal Distribution’ as if it’s some magic spell. But how can a simple curve explain so much about life?"

Professor Anirban smiled. "You know, Riddhi, the normal distribution is more than just numbers. It’s hidden in the city around us—right here in Kolkata. Let me tell you a story."

The Lottery Ticket and the Bell Curve

It all started when Bijoy Karmakar, a local lottery seller near Esplanade, noticed something odd. Every day, thousands of people bought lottery tickets. Some bought just one, some bought ten, and a few—like the overenthusiastic Subhojit da, who ran a sweet shop in Garia—bought fifty at once.

One evening, Bijoy came to Anirban’s door. "Dada, ekta jinish bujhte parchhi na!" (Brother, I don’t understand one thing!) he said, scratching his head.

"What is it, Bijoy?" Anirban asked.

"Every day, if I count how many people buy how many tickets, I see a pattern! Most people buy around 5-10 tickets. Some buy fewer, some buy more. But very few buy 50 or 100! And this shape, when I plot it, looks like a big, curved hill!"

Anirban grinned. "Bijoy, you’ve just discovered the Normal Distribution!"

The Magic Formula Behind the Curve

"Let me explain with some math," said Anirban, pulling out his notebook.
"The normal distribution follows this beautiful formula:"

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

Riddhiman’s eyes widened.

"What do these symbols mean, Sir?"

"Good question!" Anirban explained.

  • $x$ is the variable we’re measuring (like the number of tickets bought).
  • $\mu$ (mu) is the mean, or the average number of tickets bought.
  • $\sigma$ (sigma) is the standard deviation, which tells us how spread out the values are.
  • $e$ is Euler’s number (approximately 2.718), which appears in many natural growth processes.

"This formula ensures that most people buy tickets close to the average, while fewer and fewer people buy extremely high or low amounts, forming a bell-shaped curve."
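For readers who want to try the formula themselves, here is a minimal Python sketch of it. The function name `normal_pdf` and the sample values (a mean of 8 tickets and a standard deviation of 3) are illustrative assumptions, not figures from Bijoy's sales.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x, for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Hypothetical values for illustration only: mean 8 tickets, standard deviation 3.
mu, sigma = 8.0, 3.0
for tickets in (2, 5, 8, 11, 14):
    print(tickets, round(normal_pdf(tickets, mu, sigma), 4))
```

The printed heights are largest at the mean and fall off symmetrically on both sides, which is the hill shape Bijoy noticed.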

    From Fish Markets to Cricket Scores

    Anirban looked at Bijoy. "You see, this curve isn’t just about lottery tickets. It’s everywhere."

    "Think about the weight of fish at Gariahat market. Most fish weigh around 1-2 kg. A few are much smaller, a few are much bigger, but the majority are around the average."

    He continued:
    "Even in cricket, if we plot the number of runs scored by Virat Kohli in every match, we’ll see that most scores are close to his average. A few times he gets out for a duck, a few times he scores a century, but most of the time, he scores somewhere in the middle. That’s the normal distribution!"

    Riddhiman was fascinated.
    "So… life is predictable after all?"

    Anirban laughed.
    "Not quite! Normal distribution helps us understand patterns, but it doesn’t predict the future with certainty."

    Understanding the Empirical Rule

    "Let me teach you another secret," Anirban said. "This curve follows something called the 68-95-99.7 Rule."

    "That means if you plot the heights of people in Kolkata, about 68% will be within one standard deviation of the city’s average height. About 95% will be within two standard deviations. And almost everyone, 99.7%, will be within three standard deviations of the average."

    Anirban took out a piece of paper and started sketching a bell curve to explain the 68-95-99.7 Rule, also known as the Empirical Rule.

    "Imagine we collect the heights of thousands of people across Kolkata—from school children at South Point to elderly gentlemen at Coffee House. If we plot all their heights on a graph, the distribution would form a bell-shaped curve—the normal distribution."

    "The mean, $\mu$, is the central value. Most people’s heights will be clustered around this mean, and only a few will be very tall or very short. But how do we quantify this spread? That’s where the standard deviation ($\sigma$) comes in."
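One way to see this clustering without measuring anyone is to simulate it. The sketch below draws made-up heights from a normal distribution and counts how many land within one, two, and three standard deviations of the mean. The mean of 165 cm and standard deviation of 7 cm match the professor's worked example; the sample size, the seed, and the helper name `share_within` are assumptions chosen for illustration.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

MU, SIGMA = 165.0, 7.0   # assumed mean and standard deviation, in cm
N = 100_000              # number of simulated people (arbitrary)

# Simulated heights, not real survey data.
heights = [random.gauss(MU, SIGMA) for _ in range(N)]

def share_within(values, mu, sigma, k):
    """Fraction of values lying within k standard deviations of mu."""
    inside = sum(1 for v in values if abs(v - mu) <= k * sigma)
    return inside / len(values)

for k in (1, 2, 3):
    share = share_within(heights, MU, SIGMA, k)
    print(f"within {k} standard deviation(s): {share:.3f}")
```

With a sample this large, the three shares should come out close to 0.68, 0.95, and 0.997, though the exact digits depend on the random draw.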



    Anirban drew a neat bell curve and marked the sections.

    1. 68% of Data Within 1 Standard Deviation ($\mu \pm 1\sigma$)

    "If the average height of a Kolkata man is 165 cm and the standard deviation is 7 cm, then 68% of men will have heights between:"

    $$165 - 7 = 158 \text{ cm} \quad \text{and} \quad 165 + 7 = 172 \text{ cm}$$

    "That means roughly 7 out of 10 men in Kolkata will have heights between 158 cm and 172 cm."

    2. 95% of Data Within 2 Standard Deviations ($\mu \pm 2\sigma$)

    "Expanding further, 95% of the men will fall within 2 standard deviations of the mean:"

    $$165 - (2 \times 7) = 151 \text{ cm} \quad \text{and} \quad 165 + (2 \times 7) = 179 \text{ cm}$$

    "Now, almost everyone (19 out of 20 people) will have a height between 151 cm and 179 cm."

    3. 99.7% of Data Within 3 Standard Deviations ($\mu \pm 3\sigma$)

    "Finally, nearly all men (99.7%) will have heights within 3 standard deviations of the mean:"

    $$165 - (3 \times 7) = 144 \text{ cm} \quad \text{and} \quad 165 + (3 \times 7) = 186 \text{ cm}$$

    "This means that only about 0.3% of people (roughly 3 in 1,000) will have heights below 144 cm or above 186 cm. These extreme values, very short or very tall individuals, are the rare cases that fall into the tails of the curve."

    Anirban now pointed to the bell curve he drew:

    • The central peak (mean, $\mu$) represents the most common heights.
    • The band from one $\sigma$ below the mean to one $\sigma$ above it covers about 68% of people.
    • Widening the band to $2\sigma$ on each side covers about 95%.
    • Going all the way to $3\sigma$ on each side captures nearly everyone (99.7%).

    "So, if you see a person in Kolkata who is 190 cm tall, they are an outlier, because 190 cm lies more than three standard deviations above the average and so falls outside the 99.7% range."
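How unusual is 190 cm, exactly? A common way to express this is the z-score, the number of standard deviations a value sits away from the mean. The sketch below computes it for the professor's figures (mean 165 cm, standard deviation 7 cm) and estimates the share of men who would be taller, assuming heights are exactly normal. The function names `z_score` and `upper_tail` are illustrative assumptions.

```python
import math

def z_score(x, mu, sigma):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z = z_score(190.0, 165.0, 7.0)
print(f"z-score of 190 cm: {z:.2f}")
print(f"share of men taller than 190 cm: {upper_tail(z):.6f}")
```

The z-score comes out at about 3.57, beyond the three-standard-deviation mark, so under this assumption a man that tall would be rarer than 1 in 5,000.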

    "This concept is not just about heights! It applies to many real-world scenarios," Anirban continued. "If the average salary of IT professionals in Kolkata is ₹8 lakhs per year with a standard deviation of ₹2 lakhs, then about 68% earn between ₹6L and ₹10L, about 95% earn between ₹4L and ₹12L, and about 99.7% earn between ₹2L and ₹14L."
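The same arithmetic works for any mean and standard deviation, so it can be wrapped in a small reusable helper. This is a sketch that assumes, as the salary example does, that the quantity is roughly normally distributed. The function name `empirical_ranges` is hypothetical.

```python
def empirical_ranges(mean, sd):
    """Return {coverage label: (low, high)} for 1, 2 and 3 standard deviations."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * sd, mean + k * sd) for k, label in coverage.items()}

# Salary example from the text: mean 8 lakhs, standard deviation 2 lakhs.
for label, (low, high) in empirical_ranges(8, 2).items():
    print(f"about {label} earn between ₹{low}L and ₹{high}L")
```

Calling `empirical_ranges(165, 7)` reproduces the height ranges worked out earlier: 158 to 172 cm, 151 to 179 cm, and 144 to 186 cm.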

    "Now you see why the normal distribution is so powerful," Anirban said, finishing his tea.
    "It helps us understand how most things in life cluster around an average, with fewer and fewer extreme cases. This allows businesses, researchers, and scientists to make predictions and take decisions based on probabilities."

    Moral of the Story: Life Follows a Pattern

    Riddhiman leaned back in his chair, staring at the streets of Kolkata. "Sir, the world really is normal… well, most of the time!"

    Bijoy chuckled. "Dada, next time I sell lottery tickets, I’ll check if my sales follow this curve!"

    Anirban smiled. "Good idea! But remember, in statistics, luck doesn’t follow a normal distribution—it’s always unpredictable!"

    And with that, the tram whistled past, carrying the knowledge of the Normal Distribution through the heart of Kolkata.



    Have you noticed the 68-95-99.7 Rule in your life? Share your thoughts below! 😃
