Central limit theorem

The central limit theorem, together with the Normal distribution, is essential for understanding statistical inference.

Central Limit Theorem

Consider a sample of size $n$ drawn from a large population with average $\mu$ and finite variance $\sigma^2$. If the sample size $n$ is large enough, the sample average $\bar{X}$ approximately follows a Normal distribution with average $\mu$ and variance $\sigma^2/n$.
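In symbols, the statement can be sketched as follows. This assumes the observations $X_1, \dots, X_n$ are independent draws from the population, a condition the text above does not spell out; the arrow denotes convergence in distribution.

$$
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{\;d\;} N(0, 1) \quad (n \to \infty)
$$

Equivalently, for large $n$ the distribution of $\bar{X}$ is close to $N(\mu, \sigma^2/n)$.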

This theorem has two important points.

1. Even if the population does not follow a Normal distribution, the sample average approximately follows one. However, if the population distribution has a fat tail (e.g. a power-law distribution) so that its average and/or standard deviation is infinite, the theorem does not hold.

2. The standard deviation of the sample average $\bar{X}$ is $\sigma/\sqrt{n}$, that is, $1/\sqrt{n}$ times the population standard deviation $\sigma$. Increasing the sample size $n$ therefore reduces the typical deviation between the sample average $\bar{X}$ and the population average $\mu$. (A larger sample gives a more accurate estimate: quadrupling $n$ halves the standard deviation.)
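Both points can be checked with a small simulation. The sketch below is only an illustration under chosen assumptions: the population is taken to be Exponential(1), which is clearly not Normal and has $\mu = 1$ and $\sigma = 1$, and the function name `sample_mean_spread`, the number of trials, and the seed are arbitrary choices made for this example.

```python
import numpy as np

def sample_mean_spread(n, trials=10000, seed=0):
    """Return (mean, std) of the sample average over many samples of size n.

    Assumed population: Exponential(1), so mu = 1 and sigma = 1.
    """
    rng = np.random.default_rng(seed)
    # Each row is one sample of size n drawn from the population.
    samples = rng.exponential(scale=1.0, size=(trials, n))
    xbar = samples.mean(axis=1)          # one sample average per row
    return xbar.mean(), xbar.std(ddof=1)

if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        m, s = sample_mean_spread(n)
        # Compare the observed spread with the predicted sigma / sqrt(n).
        print(f"n={n:4d}  mean={m:.3f}  std={s:.3f}  predicted={1 / np.sqrt(n):.3f}")
```

The observed standard deviation of $\bar{X}$ should track the predicted $\sigma/\sqrt{n}$ column, and a histogram of the sample averages should look increasingly bell-shaped as $n$ grows.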

If the population has an infinite average or variance, the central limit theorem does not hold. In this case, a generalized theorem involving stable distributions applies instead. When does a population distribution have an infinite average or variance? A typical example is a distribution whose tail follows a power law, also called a fat-tailed distribution.

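Consider a population distribution whose tail follows a power law $f(k)\propto k^{-\gamma}$ with exponent $\gamma$. The $m$-th moment involves the integral $\int^{\infty} k^{m} k^{-\gamma}\,dk$, which converges only if $\gamma > m+1$, so the average is finite only for $\gamma > 2$ and the variance only for $\gamma > 3$.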

(1) When the exponent $\gamma > 3$, the population average and variance are both finite and the central limit theorem holds.

(2) When the exponent $3 > \gamma > 2$, the population average is finite but the variance is infinite. In this case, the central limit theorem cannot be applied.

(3) When the exponent $2 > \gamma > 1$, both the population average and variance are infinite, and the central limit theorem cannot be applied either.
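The three cases can be explored numerically. The following is a rough sketch, not a definitive test: it assumes a pure power-law density $f(k)\propto k^{-\gamma}$ on $k \ge 1$ (sampled by inverting its cumulative distribution), and it measures the spread of the sample average by the interquartile range, because the standard deviation is infinite in cases (2) and (3). The helper names, sample sizes, and example exponents 4.5, 2.5, and 1.5 are choices made for this illustration.

```python
import numpy as np

def power_law_sample(gamma, size, rng):
    """Draw from the density f(k) proportional to k**(-gamma) on k >= 1 (gamma > 1)."""
    u = 1.0 - rng.random(size)           # uniform on (0, 1], avoids division by zero
    return u ** (-1.0 / (gamma - 1.0))   # inverse of the CDF F(k) = 1 - k**(1 - gamma)

def iqr_of_sample_mean(gamma, n, trials=2000, seed=0):
    """Interquartile range of the sample average over many samples of size n."""
    rng = np.random.default_rng(seed)
    xbar = power_law_sample(gamma, (trials, n), rng).mean(axis=1)
    q25, q75 = np.percentile(xbar, [25, 75])
    return q75 - q25

def shrink_factor(gamma, n_small=100, n_large=1600):
    """How much the spread of the sample average shrinks when n grows 16-fold."""
    return iqr_of_sample_mean(gamma, n_small) / iqr_of_sample_mean(gamma, n_large, seed=1)

if __name__ == "__main__":
    for gamma in (4.5, 2.5, 1.5):
        print(f"gamma={gamma}: spread shrinks by a factor of {shrink_factor(gamma):.2f}")
```

If the central limit theorem applies, a 16-fold increase in $n$ should shrink the spread by a factor close to $\sqrt{16} = 4$. For $3 > \gamma > 2$ the shrinkage is expected to be noticeably weaker, and for $2 > \gamma > 1$ the spread of the sample average grows with $n$ instead of shrinking, so the factor falls below 1.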