Central Limit Theorem Laboratory

Watch the sample mean distribution converge to normal as sample size grows. The CLT is why actuaries can model aggregate claims with the normal distribution.

📐 What Is the Central Limit Theorem?

The Central Limit Theorem is one of the most important results in all of probability and statistics. It says:

If X₁, X₂, ..., Xn are i.i.d. with E[Xᵢ] = μ and Var(Xᵢ) = σ² (finite and positive), then
as n → ∞: √n · (x̄ − μ) / σ →^d N(0, 1)

In plain language: the distribution of the sample mean (x̄ = (X₁+…+Xn)/n) approaches a normal distribution as the sample size grows, whatever the shape of the original population distribution, as long as its variance is finite.

The Two Key Facts

E[x̄] = μ — the sample mean is centered at the population mean
Var(x̄) = σ²/n — the variance shrinks as n increases
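Both facts can be checked by simulation. The sketch below is only an illustration: it assumes an Exponential(1) population (so μ = 1 and σ² = 1), and the function name `simulate_sample_means`, the seed, and the draw count are hypothetical choices, not part of the lab.

```python
import random
import statistics

def simulate_sample_means(n, draws, seed=0):
    """Return `draws` sample means, each computed from n i.i.d. Exponential(1) values."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(draws)
    ]

# Assumed population: Exponential(1), so mu = 1 and sigma^2 = 1.
n = 5
means = simulate_sample_means(n, draws=20_000)

# Theory says the first number should be close to mu = 1
# and the second close to sigma^2 / n = 0.2.
print("average of sample means: ", round(statistics.fmean(means), 3))
print("variance of sample means:", round(statistics.pvariance(means), 3))
```

Rerunning with a larger n should leave the first number near 1 while the second shrinks in proportion to 1/n.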

🔍 The Convergence Journey — Step by Step

n = 1 — x̄ follows the population distribution

n = 5 — starting to converge, but still rough

n = 30 — good Normal approximation (common rule of thumb)

n = 100 — almost perfectly normal, much narrower
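One way to put a number on this journey is to track the skewness of the simulated sample means, which is 0 for a normal distribution. The sketch below assumes an Exponential(1) population, for which the skewness of x̄ is 2/√n in theory; the helper names and the simulation settings are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def mean_skewness(n, draws=4000, seed=0):
    """Skewness of `draws` simulated sample means of n Exponential(1) values."""
    rng = random.Random(seed)
    means = [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(draws)
    ]
    return skewness(means)

for n in (1, 5, 30, 100):
    simulated = mean_skewness(n)
    theory = 2 / n ** 0.5  # skewness of the mean of n Exponential(1) variables
    print(f"n = {n:>3}: simulated skewness = {simulated:.2f}, theory = {theory:.2f}")
```

The simulated values are noisy, especially at n = 1, but the downward trend toward 0 is the convergence the four steps describe.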

📊 Why Does the Variance Shrink?

Each observation adds independent information. When you average n i.i.d. variables:

Var(x̄) = Var( (X₁ + ... + Xn) / n )
   = (1/n²) · (Var(X₁) + ... + Var(Xn)) (independence)
   = (1/n²) · n · σ² (identical variance)
   = σ² / n

The standard deviation of x̄ is σ/√n. This means that to halve the uncertainty, you need four times the sample size.

[Chart: standard deviation of x̄ (σ/√n) versus sample size n]

Notice the huge drop between n=1 and n=10 (from σ to about 0.32σ), then diminishing returns. On the exam, n ≥ 30 is usually treated as "large enough" for the normal approximation (for most distributions).
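The same pattern can be tabulated directly from σ/√n. This is a small sketch with σ = 1 chosen arbitrarily; `standard_error` is a hypothetical helper name.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean of n i.i.d. values with SD sigma."""
    return sigma / math.sqrt(n)

sigma = 1.0  # arbitrary population SD, chosen for illustration
for n in (1, 4, 10, 30, 100, 400):
    print(f"n = {n:>3}: SD of sample mean = {standard_error(sigma, n):.3f}")
```

Going from n = 1 to n = 4, or from n = 100 to n = 400, halves the value each time, which is the "four times the sample size" rule in action.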

🎯 Why Actuaries Need the CLT

Real insurance problems don't come with "the data is normally distributed" pre-attached. The CLT gives actuaries a superpower: with enough independent claims, they can apply normal approximations to aggregate claims from any underlying loss distribution that has finite variance.

📋 Exam P Applications

Sum of i.i.d. variables approximates normal — even if the individual variables are skewed (like claim amounts)
Approximate probabilities for total loss: S = X₁ + ... + Xn ≈ N(nμ, nσ²)
Normal approximation to the Binomial: when n is large, Bin(n,p) ≈ N(np, np(1-p))
Normal approximation to Poisson: Poisson(λ) ≈ N(λ, λ) for large λ
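To see how close these approximations can be, the sketch below compares an exact Poisson probability with its normal approximation. The values λ = 100 and k = 110 are hypothetical, and the continuity correction (evaluating the normal CDF at k + 0.5) is an added assumption carried over from the binomial case.

```python
import math
from statistics import NormalDist

def poisson_cdf(k, lam):
    """Exact P(N <= k) for N ~ Poisson(lam), summing the pmf term by term."""
    term = math.exp(-lam)  # P(N = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(N = i) from P(N = i - 1)
        total += term
    return total

def poisson_cdf_normal(k, lam):
    """Approximate P(N <= k) using N(lam, lam) with a continuity correction."""
    return NormalDist(mu=lam, sigma=math.sqrt(lam)).cdf(k + 0.5)

lam, k = 100, 110  # hypothetical values for illustration
print("exact:         ", round(poisson_cdf(k, lam), 4))
print("normal approx.:", round(poisson_cdf_normal(k, lam), 4))
```

Trying a small λ (say 2 or 3) shows the gap widening, which is why the approximation is stated for large λ.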

🏢 Real-World Use

Pricing: Total premium = E[S] + safety loading based on standard deviation
Reserving: Estimate the 95th percentile of total claims
Reinsurance: Price stop-loss treaties using normal approximations
Solvency: Calculate probability of ruin / capital adequacy
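A rough sketch of how these four uses look under the approximation S ≈ N(nμ, nσ²). Every number here (portfolio size, claim mean and SD, loading factor, capital level) is hypothetical, and real pricing and capital models are considerably more involved than this.

```python
import math
from statistics import NormalDist

def aggregate_normal(n, mu, sigma):
    """CLT approximation for the total of n i.i.d. claims: N(n*mu, n*sigma^2)."""
    return NormalDist(mu=n * mu, sigma=sigma * math.sqrt(n))

# Hypothetical portfolio: 1,000 claims, each with mean 500 and SD 1,200.
S = aggregate_normal(n=1000, mu=500.0, sigma=1200.0)

loading_factor = 0.5  # hypothetical multiple of the standard deviation
premium = S.mean + loading_factor * S.stdev    # pricing: E[S] + safety loading
p95 = S.inv_cdf(0.95)                          # reserving: 95th percentile
capital = 600_000.0                            # hypothetical capital level
prob_exceed = 1 - S.cdf(capital)               # solvency: P(S > capital)

print(f"premium with loading: {premium:,.0f}")
print(f"95th percentile:      {p95:,.0f}")
print(f"P(S > capital):       {prob_exceed:.4f}")
```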

⚠️ When Does the CLT Not Apply?

The CLT is powerful, but it has conditions. Here's when to be careful:

n ≥ 30: rule of thumb for "large n"; works for most distributions
Symmetric distributions: need n ≈ 15-20
Heavy-tailed distributions: may need n ≫ 30 to converge

Requires i.i.d. variables — independently drawn from the same distribution
Requires finite variance: the Cauchy distribution has no defined mean or variance, so the CLT does not apply to it
Convergence speed varies: symmetric unimodal distributions converge fast, skewed/heavy-tailed ones slow
The CLT is a statement about the limit n → ∞ and says nothing about how good the approximation is at any finite n; practical convergence depends on the underlying distribution
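The finite-variance condition can be seen failing in a simulation. The sketch below draws standard Cauchy values by the inverse-CDF method and measures the spread of the sample means with the interquartile range (IQR), since the variance does not exist. The helper names and settings are hypothetical; in theory the mean of n standard Cauchy values is again standard Cauchy, so the IQR should stay near 2 however large n gets.

```python
import math
import random
import statistics

def cauchy_sample_means(n, draws, seed=0):
    """Return `draws` sample means, each from n i.i.d. standard Cauchy values."""
    rng = random.Random(seed)

    def cauchy():
        # Inverse-CDF method: tan(pi * (U - 1/2)) is standard Cauchy for U ~ Uniform(0, 1).
        return math.tan(math.pi * (rng.random() - 0.5))

    return [sum(cauchy() for _ in range(n)) / n for _ in range(draws)]

def iqr(values):
    """Interquartile range: a spread measure that exists even without a variance."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

for n in (1, 10, 100):
    print(f"n = {n:>3}: IQR of sample means = {iqr(cauchy_sample_means(n, 2000)):.2f}")
```

Unlike the σ/√n shrinkage seen for finite-variance populations, averaging more Cauchy observations does not narrow the distribution of x̄.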

Try it! Switch the lab to Exponential (heavily skewed), set n = 30, and compare the result with Bimodal at the same n. You'll see that the skewed population needs a larger n to fully converge.

📝 Exam P — How This Gets Tested

The CLT appears on Exam P in predictable ways. Here are the three most common question patterns:

Pattern 1: Normal Approximation of Sums

Standard: You have n i.i.d. losses with mean μ and variance σ². Find P(total loss > c).

Let S = X₁ + ... + Xn. Then E[S] = nμ, Var(S) = nσ².
P(S > c) ≈ 1 − Φ( (c − nμ) / (σ√n) )
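As a worked instance of this formula, here is a minimal sketch. The function name `prob_total_exceeds` and the numbers (100 losses with mean 200 and standard deviation 50, threshold 21,000) are hypothetical.

```python
import math
from statistics import NormalDist

def prob_total_exceeds(c, n, mu, sigma):
    """Approximate P(S > c) for S = X1 + ... + Xn using S ~ N(n*mu, n*sigma^2)."""
    z = (c - n * mu) / (sigma * math.sqrt(n))
    return 1 - NormalDist().cdf(z)

# Hypothetical numbers: E[S] = 100 * 200 = 20,000 and SD(S) = 50 * sqrt(100) = 500,
# so c = 21,000 sits z = 2 standard deviations above the mean.
print(round(prob_total_exceeds(21_000, n=100, mu=200.0, sigma=50.0), 4))
```

The step worth practising is the setup: scale the mean by n, but scale the standard deviation by √n.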

Pattern 2: Normal Approximation of Sample Mean

Common: Given a sample from a population, find P(x̄ is within δ of μ).

P(|x̄ − μ| < δ) ≈ Φ(δ√n/σ) − Φ(−δ√n/σ)
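A minimal sketch of this calculation, with a hypothetical function name and hypothetical inputs (δ = 1, n = 64, σ = 4). By symmetry of the normal distribution, the difference Φ(z) − Φ(−z) equals 2Φ(z) − 1.

```python
import math
from statistics import NormalDist

def prob_mean_within(delta, n, sigma):
    """Approximate P(|xbar - mu| < delta) when xbar ~ N(mu, sigma^2 / n)."""
    z = delta * math.sqrt(n) / sigma
    phi = NormalDist().cdf
    return phi(z) - phi(-z)  # same value as 2 * phi(z) - 1

# Hypothetical numbers: z = 1 * sqrt(64) / 4 = 2.
print(round(prob_mean_within(delta=1.0, n=64, sigma=4.0), 4))
```

Note that μ itself drops out: only δ, n, and σ matter for this probability.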

Pattern 3: Normal Approximation to Binomial

Frequent: Find a binomial probability when n is large:

X ~ Bin(n, p) → approx N(np, np(1−p))
P(X ≤ k) ≈ Φ( (k + 0.5 − np) / √(np(1−p)) ) (continuity correction)
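To see what the continuity correction buys, the sketch below compares the exact binomial CDF with the normal approximation, with and without the +0.5. The function names and the example values (n = 100, p = 0.3, k = 35) are hypothetical.

```python
import math
from statistics import NormalDist

def binom_cdf_exact(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def binom_cdf_normal(k, n, p, continuity=True):
    """Normal approximation to P(X <= k), optionally with the continuity correction."""
    shift = 0.5 if continuity else 0.0
    z = (k + shift - n * p) / math.sqrt(n * p * (1 - p))
    return NormalDist().cdf(z)

n, p, k = 100, 0.3, 35  # hypothetical values for illustration
print("exact:             ", round(binom_cdf_exact(k, n, p), 4))
print("with correction:   ", round(binom_cdf_normal(k, n, p), 4))
print("without correction:", round(binom_cdf_normal(k, n, p, continuity=False), 4))
```

For these values the corrected approximation lands noticeably closer to the exact answer than the uncorrected one.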

Note: Exam P and CAS exam questions are multiple-choice, so they often test whether you can set up the CLT approximation correctly, not just compute the final z-score.