Student’s t-distribution


In this section, we introduce Student’s t-distribution. It is used for interval estimation of the population mean when the population variance is unknown and the sample size is small (fewer than 30 observations).

The probability density function of Student’s t-distribution is defined by

\displaystyle f(t)=\frac{\Gamma(\frac{m+1}{2})}{\sqrt{m\pi}\Gamma(\frac{m}{2})}\Bigl(1+\frac{t^2}{m}\Bigr)^{-\frac{m+1}{2}}

Here, \Gamma(x) denotes the gamma function and m is the number of degrees of freedom.
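As a quick check on the formula, the density can be evaluated directly with Python's standard library. This is only a minimal sketch: the function name `t_pdf` is our own choice for illustration, and the log-gamma form is used simply to avoid overflow for large m.

```python
import math

def t_pdf(t, m):
    """Density f(t) of Student's t-distribution with m degrees of freedom.

    Hypothetical helper written for this example, not a library function.
    """
    # log of Gamma((m+1)/2) / (sqrt(m*pi) * Gamma(m/2))
    log_coeff = (math.lgamma((m + 1) / 2)
                 - math.lgamma(m / 2)
                 - 0.5 * math.log(m * math.pi))
    # log of (1 + t^2/m)^(-(m+1)/2)
    log_kernel = -(m + 1) / 2 * math.log1p(t * t / m)
    return math.exp(log_coeff + log_kernel)

# With m = 1 the formula reduces to 1 / (pi * (1 + t^2)), so f(0) = 1/pi.
print(t_pdf(0.0, 1), 1 / math.pi)
# The density is symmetric about t = 0.
print(t_pdf(1.5, 5), t_pdf(-1.5, 5))
```

The two numbers on each printed line should agree, which confirms that the code matches the definition at least for these simple cases.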

As we will see later, the number of degrees of freedom m is determined by the sample size n in interval estimation. Although this formula looks complicated, we will rarely use it directly. Instead, we will use a table of the t-distribution or statistical software such as Excel.
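To see where the numbers in a t-table come from, the sketch below integrates the density numerically (Simpson's rule) and searches for the upper 2.5% point by bisection. All helper names are our own, and the search range of 0 to 100 is an assumption that suits common table entries; in practice a library routine such as `scipy.stats.t.ppf` would normally be preferred.

```python
import math

def t_pdf(t, m):
    """Density of Student's t-distribution with m degrees of freedom."""
    log_coeff = (math.lgamma((m + 1) / 2)
                 - math.lgamma(m / 2)
                 - 0.5 * math.log(m * math.pi))
    return math.exp(log_coeff - (m + 1) / 2 * math.log1p(t * t / m))

def t_cdf(x, m, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, m) + t_pdf(a, m)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, m)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(p, m, lo=0.0, hi=100.0):
    """Find x with P(T <= x) = p by bisection.

    Assumes 0.5 <= p < 1 and that the answer lies below hi.
    """
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, m) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Upper 2.5% point for m = 10 degrees of freedom.
print(round(t_critical(0.975, 10), 3))
```

The printed value should land close to the tabulated entry 2.228 for m=10, which is the value a two-sided 95% interval would use.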

Student’s t-distribution approaches the standard normal distribution as the number of degrees of freedom m increases. In fact, when m is sufficiently large (m>30), the t-distribution is very close to the standard normal distribution. This is the main difference between pattern 2 and pattern 3 of interval estimation introduced previously. Conversely, as m decreases, both the right (positive) and left (negative) tails of the t-distribution become heavier than those of the normal distribution.
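Both effects can be seen numerically by comparing the two densities at the center (t=0) and in the tail (t=3). The following is an illustrative sketch only; `t_pdf` and `normal_pdf` are names we made up for this example.

```python
import math

def t_pdf(t, m):
    """Density of Student's t-distribution with m degrees of freedom."""
    log_coeff = (math.lgamma((m + 1) / 2)
                 - math.lgamma(m / 2)
                 - 0.5 * math.log(m * math.pi))
    return math.exp(log_coeff - (m + 1) / 2 * math.log1p(t * t / m))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Density at the center (t = 0) and in the tail (t = 3) for growing m.
for m in (1, 5, 30, 1000):
    print(f"m={m:5d}  f(0)={t_pdf(0.0, m):.5f}  f(3)={t_pdf(3.0, m):.5f}")
print(f"normal   f(0)={normal_pdf(0.0):.5f}  f(3)={normal_pdf(3.0):.5f}")
```

As m grows, both columns move toward the normal values, and for every finite m the tail value f(3) stays above the normal one.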

In what follows, we will explain an important theorem for the t-distribution.

Let \mu be the population mean, \bar{X} the sample mean, s the sample standard deviation (computed with the divisor n-1), and n the sample size, where the sample is drawn from a normally distributed population. Then we can prove that the quantity defined by

\displaystyle t=\frac{\bar{X}-\mu}{\frac{s}{\sqrt{n}}}

follows the t-distribution with m=n-1 degrees of freedom.
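The theorem can be checked empirically with a small simulation. The sketch below draws many samples from a normal population, computes t for each, and counts how often |t| exceeds 1.96, the cutoff that a standard normal variable exceeds 5% of the time. The population parameters (mean 10, standard deviation 2), the number of trials, and the function names are arbitrary choices made for illustration.

```python
import math
import random

def t_statistic(sample, mu):
    """t = (x_bar - mu) / (s / sqrt(n)), with s using the n-1 divisor."""
    n = len(sample)
    x_bar = sum(sample) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    return (x_bar - mu) / (s / math.sqrt(n))

def tail_fraction(n, trials=20000, cutoff=1.96, seed=0):
    """Fraction of simulated samples of size n with |t| > cutoff."""
    rng = random.Random(seed)
    mu, sigma = 10.0, 2.0  # assumed population parameters, for illustration
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(t_statistic(sample, mu)) > cutoff:
            hits += 1
    return hits / trials

print("n = 5: ", tail_fraction(5))
print("n = 50:", tail_fraction(50))
```

For n=5 (m=4) the fraction should come out at roughly 12%, well above 5%, because the t-distribution with few degrees of freedom has heavy tails. For n=50 it should be much closer to 5%.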

This theorem is applied in pattern 3 (i.e., interval estimation of the population mean when the population variance is unknown and the sample size is small).

As noted above, when the sample size n is large (more than 30), the distribution of t is close to the standard normal distribution. This is the case of pattern 2 (i.e., interval estimation of the population mean when the population variance is unknown and the sample size is large).

On the next page, we will explain pattern 3 using this theorem and then work through a concrete numerical example to show how this pattern is used for interval estimation in practice.
