🔬 CLT Laboratory

Central Limit Theorem Laboratory

Watch the sample mean distribution converge to normal as sample size grows. The CLT is why actuaries can model aggregate claims with the normal distribution.

[Interactive lab: Population Distribution selector; Sampling Parameters (sample size n = 5; 1,000 draws); Results panel]

What Is the Central Limit Theorem?

The Central Limit Theorem is one of the most important results in all of probability and statistics. It says:

If X₁, X₂, ..., Xn are i.i.d. with E[Xᵢ] = μ and Var(Xᵢ) = σ² < ∞, then
as n → ∞:   √n · (x̄ − μ) / σ   →d   N(0, 1)

In plain language: the distribution of the sample mean (x̄ = (X₁+…+Xn)/n) approaches a normal distribution as the sample size grows, whatever the shape of the original population distribution, provided its variance is finite.
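The statement above can be checked with a small simulation. The sketch below is illustrative only: it assumes an Exponential population with rate 1 (so μ = σ = 1), a fixed seed, and a hypothetical helper name `sample_means`. None of these come from the lab itself.

```python
import random
import statistics

def sample_means(n, draws, rng):
    """Return `draws` sample means, each averaging n Exponential(rate=1) observations."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(draws)
    ]

rng = random.Random(42)  # fixed seed, only so the run is reproducible
mu, sigma = 1.0, 1.0     # mean and standard deviation of Exponential(1)

for n in (1, 5, 30):
    means = sample_means(n, 1000, rng)
    # Standardize each sample mean: z = sqrt(n) * (x_bar - mu) / sigma
    z = [(n ** 0.5) * (m - mu) / sigma for m in means]
    share_within_one = sum(abs(v) < 1 for v in z) / len(z)
    print(
        f"n={n:>3}  mean(z)={statistics.fmean(z):+.3f}  "
        f"sd(z)={statistics.stdev(z):.3f}  P(|z|<1)={share_within_one:.3f}"
    )
```

For a standard normal variable, P(|Z| < 1) is about 0.683, so the last column should drift toward that value as n grows, while mean(z) and sd(z) stay near 0 and 1.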

The Two Key Facts

[Figure: Distribution of x̄ for Three Sample Sizes. Panels: n = 1 (population), wide spread → n = 5, narrower → n = 30, nearly normal. Legend: Normal approx.]

The Convergence Journey: Step by Step

Step 1: Take one observation
x̄ is just a single observation, so it looks like the population.
n = 1: x̄ follows the population distribution

Step 2: Average n = 5 observations
Starts to look a bit more symmetric.
n = 5: starting to converge, but still rough

Step 3: Average n = 30 observations
Now it's starting to look bell-shaped.
n = 30: good Normal approximation (common rule of thumb)

Step 4: Average n = 100 observations
Tight bell; the variance is σ²/100.
n = 100: almost perfectly normal, much narrower

📊 Why Does the Variance Shrink?

Each observation adds independent information. When you average n i.i.d. variables:

Var(x̄) = Var( (X₁ + ... + Xn) / n )
   = (1/n²) · (Var(X₁) + ... + Var(Xn))   (independence)
   = (1/n²) · n · σ²   (identical variance)
   = σ² / n
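The derivation can also be checked empirically. This is a minimal sketch under stated assumptions: the population is Uniform(0, 1), whose variance is 1/12, and `variance_of_sample_mean` is a hypothetical helper name introduced here for illustration.

```python
import random
import statistics

def variance_of_sample_mean(n, draws, rng):
    """Empirical variance of x_bar across `draws` samples of n Uniform(0, 1) values."""
    means = [
        statistics.fmean(rng.random() for _ in range(n))
        for _ in range(draws)
    ]
    return statistics.variance(means)

rng = random.Random(0)  # fixed seed for reproducibility
sigma2 = 1 / 12         # variance of a single Uniform(0, 1) observation

for n in (1, 4, 16):
    observed = variance_of_sample_mean(n, 5000, rng)
    predicted = sigma2 / n
    print(f"n={n:>2}  observed Var(x_bar)={observed:.5f}  sigma^2/n={predicted:.5f}")
```

The observed and predicted columns should agree to within simulation noise, and each fourfold increase in n should cut the variance by roughly a factor of four.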

The standard deviation of x̄ is σ/√n. This means to halve the uncertainty, you need four times the sample size. Here's what it looks like:

[Plot: σ/√n (vertical axis) against sample size n (horizontal axis), marked at n = 1, 5, 10, 50, 100]

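Notice the steep drop between n = 1 and n = 10, where σ/√n falls by about 68%, followed by diminishing returns: going from n = 10 to n = 100 removes only about another 22% of the original σ. That's why on the exam you're usually told that n = 30 or more is "large enough" (for most distributions).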
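The diminishing returns are easy to tabulate. The sketch below assumes σ = 1 purely for illustration and uses a hypothetical helper, `standard_error`, that simply evaluates σ/√n.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean of n i.i.d. observations."""
    return sigma / math.sqrt(n)

sigma = 1.0  # assumed population standard deviation (illustrative value)

for n in (1, 5, 10, 50, 100):
    se = standard_error(sigma, n)
    # Express the remaining uncertainty as a share of the single-observation sigma
    print(f"n={n:>3}  sigma/sqrt(n)={se:.4f}  ({se / sigma:.0%} of sigma)")
```

Quadrupling n always halves the result, for example `standard_error(sigma, 100)` is half of `standard_error(sigma, 25)`.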

🎯 Why Actuaries Need the CLT

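Real insurance problems don't come with "the data is normally distributed" pre-attached. The CLT gives actuaries a superpower: they can approximate aggregate claims with a normal distribution for a wide range of underlying loss distributions, provided the losses are independent, have finite variance, and the portfolio is large enough.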

📋 Exam P Applications

  • A sum of i.i.d. variables is approximately normal, even if the individual variables are skewed (like claim amounts)
  • Approximate probabilities for total loss: S = X₁ + ... + Xn ≈ N(nμ, nσ²)
  • Normal approximation to the Binomial: when n is large, Bin(n,p) ≈ N(np, np(1−p))
  • Normal approximation to Poisson: Poisson(λ) ≈ N(λ, λ) for large λ
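To get a feel for how good the last approximation is, the sketch below compares an exact Poisson probability with the N(λ, λ) approximation. The values λ = 100 and k = 110 are made-up illustration inputs, the function names are hypothetical, and the 0.5 continuity correction is an added refinement rather than something stated in the list above.

```python
import math
from statistics import NormalDist

def poisson_cdf(k, lam):
    """Exact P(N <= k) for N ~ Poisson(lam), summing the pmf term by term."""
    term = math.exp(-lam)  # P(N = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(N = i) from P(N = i - 1)
        total += term
    return total

def poisson_normal_approx(k, lam):
    """P(N <= k) under N(lam, lam), with a 0.5 continuity correction."""
    return NormalDist(mu=lam, sigma=math.sqrt(lam)).cdf(k + 0.5)

lam, k = 100, 110  # illustrative values only
print(f"exact  P(N <= {k}) = {poisson_cdf(k, lam):.4f}")
print(f"normal P(N <= {k}) = {poisson_normal_approx(k, lam):.4f}")
```

Rerunning with a small λ (say 2 or 3) shows the approximation degrading, which is why the bullet restricts it to large λ.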

๐Ÿข Real-World Use

  • Pricing: Total premium = E[S] + safety loading based on standard deviation
  • Reserving: Estimate the 95th percentile of total claims
  • Reinsurance: Price stop-loss treaties using normal approximations
  • Solvency: Calculate probability of ruin / capital adequacy
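As a rough illustration of the pricing and reserving items, the sketch below computes a percentile of total claims under the normal approximation S ≈ N(nμ, nσ²). The portfolio figures (1,000 policies, mean loss 500, standard deviation 800) are invented for the example, and `aggregate_percentile` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def aggregate_percentile(n, mu, sigma, level):
    """Percentile of S = X1 + ... + Xn, treating S as N(n*mu, n*sigma^2)."""
    total = NormalDist(mu=n * mu, sigma=sigma * math.sqrt(n))
    return total.inv_cdf(level)

# Illustrative portfolio, not taken from any real data
n, mu, sigma = 1000, 500.0, 800.0

expected_total = n * mu
p95 = aggregate_percentile(n, mu, sigma, 0.95)
print(f"E[S]            = {expected_total:,.0f}")
print(f"95th percentile = {p95:,.0f}")
print(f"implied loading = {p95 - expected_total:,.0f}")
```

The gap between the percentile and E[S] is one simple way to express a safety loading as a multiple of the standard deviation; heavy-tailed loss distributions may need a more careful model than this sketch.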

⚠️ When Does the CLT Not Apply?

The CLT is powerful, but it has conditions. Here's when to be careful:

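  • n ≥ 30: rule of thumb for "large n"; works for most distributions
  • Symmetric distributions: often need only n ≈ 15-20
  • Heavy-tailed distributions: may need n ≫ 30 to converge, and if the variance is infinite the classical CLT does not apply at all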

Try it! Switch the lab to Exponential (heavily skewed) and compare the result at n = 30 with Bimodal. You'll see that the skewed population needs a larger n to fully converge.
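The same experiment can be sketched offline. For i.i.d. draws, the skewness of the sample mean equals the population skewness divided by √n, so an Exponential population (skewness 2) still has skewness of about 0.37 at n = 30. The code below estimates this by simulation; the rate-1 Exponential, the seed, and the helper names are assumptions made for the example.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the mean of the cubed standardized values."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def skew_of_sample_means(n, draws, rng):
    """Skewness of x_bar across `draws` samples of n Exponential(1) values."""
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(draws)
    ]
    return skewness(means)

rng = random.Random(123)  # fixed seed for reproducibility
for n in (1, 5, 30, 100):
    estimate = skew_of_sample_means(n, 2000, rng)
    print(f"n={n:>3}  simulated skew={estimate:.3f}  theory 2/sqrt(n)={2 / n ** 0.5:.3f}")
```

A symmetric population starts with zero skewness, which is one reason it tends to look normal at smaller n.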

Exam P: How This Gets Tested

The CLT appears on Exam P in predictable ways. Here are the three most common question patterns:

Pattern 1: Normal Approximation of Sums

Standard: You have n i.i.d. losses with mean μ and variance σ². Find P(total loss > c).

Let S = X₁ + ... + Xn. Then E[S] = nμ, Var(S) = nσ².
P(S > c) ≈ 1 − Φ( (c − nμ) / (σ√n) )
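A worked instance of Pattern 1, with made-up numbers: n = 100 losses, μ = 2, σ = 3 and threshold c = 250. Then E[S] = 200, the standard deviation of S is 30, and z = 50/30 ≈ 1.67. The helper name `prob_total_exceeds` is hypothetical.

```python
import math
from statistics import NormalDist

def prob_total_exceeds(c, n, mu, sigma):
    """Approximate P(S > c) for S = X1 + ... + Xn using the CLT."""
    z = (c - n * mu) / (sigma * math.sqrt(n))  # standardize the threshold
    return 1.0 - NormalDist().cdf(z)

# Illustrative inputs only
print(f"P(S > 250) is approximately {prob_total_exceeds(250, 100, 2.0, 3.0):.4f}")
```

The result is about 0.048. On the exam the same steps are done by hand with a normal table: compute nμ, compute σ√n, standardize, look up Φ.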

Pattern 2: Normal Approximation of Sample Mean

Common: Given a sample of size n from a population with mean μ and standard deviation σ, find P(x̄ is within δ of μ).

P(|x̄ − μ| < δ) ≈ Φ(δ√n/σ) − Φ(−δ√n/σ) = 2Φ(δ√n/σ) − 1
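A worked instance of Pattern 2, again with invented numbers: σ = 10, n = 64 and δ = 2, so δ√n/σ = 1.6. The helper name `prob_mean_within` is hypothetical.

```python
import math
from statistics import NormalDist

def prob_mean_within(delta, n, sigma):
    """Approximate P(|x_bar - mu| < delta) for a sample of size n using the CLT."""
    z = delta * math.sqrt(n) / sigma  # delta measured in standard errors
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)

# Illustrative inputs only
print(f"P(|x_bar - mu| < 2) is approximately {prob_mean_within(2.0, 64, 10.0):.4f}")
```

The result is about 0.89. Note that μ itself never enters the calculation; only δ, n and σ matter.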

Pattern 3: Normal Approximation to Binomial

Frequent: Large n binomial probability:

X ~ Bin(n, p)   →   approx N(np, np(1−p))
P(X ≤ k) ≈ Φ( (k + 0.5 − np) / √(np(1−p)) )   (continuity correction)
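The effect of the continuity correction can be seen by comparing against the exact binomial sum. This is a sketch with illustrative inputs (n = 100, p = 0.3, k = 35) and hypothetical function names.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def binomial_normal_approx(k, n, p, correction=0.5):
    """Normal approximation to P(X <= k); pass correction=0 to omit the adjustment."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return NormalDist().cdf((k + correction - mean) / sd)

n, p, k = 100, 0.3, 35  # illustrative values only
print(f"exact                 : {binomial_cdf(k, n, p):.4f}")
print(f"normal, with +0.5     : {binomial_normal_approx(k, n, p):.4f}")
print(f"normal, no correction : {binomial_normal_approx(k, n, p, correction=0.0):.4f}")
```

With these inputs the corrected approximation lands noticeably closer to the exact value than the uncorrected one, which is the usual reason exam solutions include the 0.5.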

Note: Exam P and CAS exam questions are multiple-choice, so they often test whether you can set up the CLT approximation correctly, not just compute the final z-score.