
3.2 Review of Probability and Statistics

3.2.5 The Central Limit Theorem

Measurable Outcome 3.7, Measurable Outcome 3.9

In the first four units, we introduced the concept of a random variable, the basic axioms of probability, and probability distribution functions. We have seen that, just as deterministic variables are characterized by their values, random variables are characterized by their probability distributions. We have also seen that random variables can be functions of other random variables, just as deterministic variables can be functions of other deterministic variables.

Sums of Random Variables

It is natural to ask whether there are more parallel properties that random variables and deterministic variables share. For example, we know that the sum of two deterministic variables is another deterministic variable. What can we say about the sum of two random variables? Quite obviously, the sum of two random variables \(X\) and \(Y\) is a new random variable,

\[Z=Z(X,Y)=X+Y\] (3.17)

When we add two deterministic variables, we simply add their values. What does it mean to add two random variables?

Since \(Z\) is a random variable, it is characterized by its probability distribution, and this probability distribution has to depend on the distributions of \(X\) and \(Y\). Suppose \(X\) and \(Y\) are independent, continuous random variables, and have densities \(f_X\) and \(f_Y\). Then the density \(f_Z\) is given by the convolution,

\[f_Z(z) = \int \limits_{-\infty}^{+\infty} f_{X}(x)f_{Y}(z-x) \, dx\] (3.18)

Thus, it is indeed possible to think about adding random variables, and the result of such an addition gives us the density of a new random variable via a convolution.
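To make Eq. (3.18) concrete, here is a minimal numerical sketch (assuming NumPy; the grid spacing, the truncation of the domain, and the choice of exponential densities are ours, purely for illustration): discretize the two densities on a grid and approximate the convolution integral by a discrete convolution.

```python
import numpy as np

# Sketch of Eq. (3.18): approximate f_Z by a discrete convolution of
# sampled densities.  We use X, Y ~ Exp(1) as an arbitrary test case,
# since the exact answer f_Z(z) = z * exp(-z) is known in closed form.
dx = 0.01
x = np.arange(0.0, 20.0, dx)   # truncate the infinite domain at 20
f_X = np.exp(-x)               # Exp(1) density on x >= 0
f_Y = np.exp(-x)

# The dx factor plays the role of the integration measure in Eq. (3.18).
f_Z = np.convolve(f_X, f_Y)[:len(x)] * dx

# Compare with the exact density of the sum of two Exp(1) variables.
print(np.max(np.abs(f_Z - x * np.exp(-x))))   # small discretization error
```

The same routine works for any pair of densities sampled on a common grid, which is often the easiest way to evaluate Eq. (3.18) when no closed form is available.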

Examples of Sums of Two Random Variables

As a first example, suppose \(X_1 \sim \mathcal{U}(0,1)\) and \(X_2 \sim \mathcal{U}(0,1)\) are independent. Then the convolution above results in the density of a triangularly distributed random variable.

\[f_{Z}(z) = \left\{ \begin{array}{ll} z & \mbox{for } 0 \leq z \leq 1, \\[0.1in] 2 - z & \mbox{for } 1 \leq z \leq 2, \\[0.1in] 0 & \mbox{otherwise } \end{array}\right.\] (3.19)
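To see where this piecewise form comes from, note that the integrand in Eq. (3.18) equals 1 exactly when \(0 \leq x \leq 1\) and \(0 \leq z - x \leq 1\), that is, when \(\max(0,\, z-1) \leq x \leq \min(1,\, z)\). Hence

\[f_{Z}(z) = \int \limits_{\max(0,\, z-1)}^{\min(1,\, z)} dx = \min(1, z) - \max(0, z-1),\]

which gives \(z\) on \(0 \leq z \leq 1\) and \(2 - z\) on \(1 \leq z \leq 2\).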

The sum of these two uniformly distributed random variables \(X_1\) and \(X_2\) is a triangularly distributed random variable. Now suppose we add another random variable \(X_3 \sim \mathcal{U}(0,1)\) to this sum. We again get a new random variable, but calculating its density analytically using the convolution is not easy; sampling, as in the sketch below, often gives a quicker picture.
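Here is a minimal sampling sketch (assuming NumPy; the seed, sample size, and bin count are arbitrary) that histograms the sum of three independent \(\mathcal{U}(0,1)\) variables:

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed for reproducibility
n = 1_000_000

# Z = X1 + X2 + X3 with each Xi ~ U(0,1), independent.
z = rng.uniform(0.0, 1.0, (3, n)).sum(axis=0)

# Crude text histogram of the density: already noticeably bell-shaped,
# peaking near the mean 3 * 0.5 = 1.5.
hist, edges = np.histogram(z, bins=30, range=(0.0, 3.0), density=True)
for c, h in zip((edges[:-1] + edges[1:]) / 2, hist):
    print(f"{c:4.2f} {'#' * int(40 * h)}")
```

The three-fold sum is already close to a bell curve, which foreshadows the result of the next section.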

However, there is an important exception. Suppose \(X_1 \sim \mathcal{N}(1,1)\) and \(X_2 \sim \mathcal{N}(1,1)\) are independent; then the convolution above results in the density of a normally distributed random variable, \(Z \sim \mathcal{N}(2,2)\). So, the sum of two independent, normally distributed random variables is again a normal random variable, and the mean and variance of this sum are the sums of the means and variances of the normals we added.

Further, if we average this sum we have,

\[\frac{Z}{2} = \frac{X_1 + X_2}{2} \sim \mathcal{N}(1,\frac{1}{2})\] (3.20)

If we now add \(X_3 \sim \mathcal{N}(1,1)\) and average over all three, we again get a normally distributed random variable, with distribution \(\mathcal{N}(1,\frac{1}{3})\), and so on. Averaging \(N\) independent, normal random variables, each with distribution \(\mathcal{N}(1,1)\), we obtain,

\[\frac{X_1 + X_2 + X_3 + \cdots + X_{N-1} + X_N}{N} \sim \mathcal{N}(1, \frac{1}{N})\] (3.21)

Since this is true no matter how large we make \(N\), it holds as \(N \to \infty\).
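As a quick numerical sanity check of Eq. (3.21), one can average \(N\) independent \(\mathcal{N}(1,1)\) samples many times and inspect the sample moments; a minimal sketch assuming NumPy (the trial count, \(N\), and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, N = 200_000, 10

# Each row holds N independent N(1,1) draws; average across each row.
avg = rng.normal(loc=1.0, scale=1.0, size=(n_trials, N)).mean(axis=1)

# By Eq. (3.21) the averages behave like N(1, 1/N) draws.
print(avg.mean())   # close to 1
print(avg.var())    # close to 1/N = 0.1
```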

The Central Limit Theorem

We saw above that for the particular case of independent, identically and normally distributed variables, the asymptotic average of these variables remains normally distributed. The mean of this asymptotic average is the average of the means of the individual variables, and its variance tends to zero as \(\frac{1}{N}\). Remarkably, this property holds for the asymptotic average of any independent and identically distributed random variables, even if they are not normally distributed!

This is the basic result of the Central Limit Theorem, which tells us that if \(X_1\), \(X_2\), \(\ldots\), \(X_N\) are independent, identically distributed random variables, with \(E(X_1) = E(X_2) = \ldots = E(X_N) = \mu\) and \(Var(X_1) = Var(X_2) = \ldots = Var(X_N) = \sigma^2\), then we have

\[\frac{\sum \limits_{i=1}^{N} X_i}{N} - \mu \to \mathcal{N}(0, \frac{\sigma^{2}}{N})\] (3.22)

as \(N \to \infty\). So instead of averaging normals, if we kept averaging the uniform random variables we added earlier, the resulting random variable is eventually guaranteed to have a normal distribution! Not only that: its mean will equal the mean of the uniform random variables we averaged, and its variance will tend to zero.
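The same experiment with uniform random variables illustrates the theorem; here is a minimal sketch assuming NumPy (again, the trial count, \(N\), and seed are arbitrary). For \(\mathcal{U}(0,1)\), \(\mu = \frac{1}{2}\) and \(\sigma^2 = \frac{1}{12}\):

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials, N = 100_000, 50
mu, sigma2 = 0.5, 1.0 / 12.0     # mean and variance of U(0,1)

# Average N i.i.d. U(0,1) variables, n_trials times.
avg = rng.uniform(0.0, 1.0, (n_trials, N)).mean(axis=1)

# Standardize; by Eq. (3.22) this should be approximately N(0, 1).
std = (avg - mu) / np.sqrt(sigma2 / N)
print(std.mean(), std.var())        # ~0 and ~1
print(np.mean(np.abs(std) < 1.96))  # ~0.95, as for a standard normal
```

Even though the underlying variables are uniform, the standardized average already looks normal for moderate \(N\).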

Exercise 1

Let \(X_1\), \(X_2\), \(X_3\), \(\ldots\), \(X_N\) be i.i.d. random variables with \(E(X_1)=E(X_2)=\ldots=E(X_N)=\mu\). What are the means of the random variables \(X_1-\mu\), \(X_2-\mu\), \(X_3-\mu\), \(\ldots\), \(X_N-\mu\)?

Answer:

\(E(X_1 - \mu) = \int_{-\infty}^{+\infty} (x - \mu) f_X(x)\, dx = \int_{-\infty}^{+\infty} x f_X(x)\, dx - \int_{-\infty}^{+\infty} \mu f_X(x)\, dx = \mu - \mu \int_{-\infty}^{+\infty} f_X(x)\, dx = \mu - \mu = 0\), and likewise for each \(X_i\).

Exercise 2

Further, let \(Var(X_1)=Var(X_2)=\ldots=Var(X_N)=\sigma^2\). What can we say about the distribution of \(\frac{\sum \limits_{i=1}^{N} X_i}{N} - \mu\) as \(N \to \infty\)?

Answer:

Direct application of the Central Limit Theorem: by Eq. (3.22), \(\frac{\sum \limits_{i=1}^{N} X_i}{N} - \mu \to \mathcal{N}(0, \frac{\sigma^{2}}{N})\).