
Chi-Squared Distribution Proof (Elementary)

Here I'll provide a (potentially flawed) proof of the pdf of the chi-squared distribution. The proof will be elementary, i.e. it doesn't use sophisticated machinery or deep results unless truly necessary. Hence the proof might be rather long.

Also, I don't use standard notation, so it might be painful to some eyes. The notation might not even be consistent, so it might also cause motion sickness.

Definition of Chi Squared Distribution:

Given \( X_1, X_2, \cdots, X_k  \) drawn independently from the standard normal distribution \( N(0, 1) \), the random variable \( Y = X_1^2 + X_2^2 + \cdots + X_k^2 \) follows the chi-squared distribution with \(k\) degrees of freedom.

This distribution arises naturally when we want to study the distribution of the sample variance. It will also be useful later on for defining the t-distribution... Ok, I'm not good at explaining what motivates the use of something, but I guess the Wikipedia page does a better job.

The Wikipedia page (adapted here) claims that the pdf of the chi-squared distribution with \(k\) degrees of freedom is

\[ f(x) = \frac{x^{k/2 - 1} e^{-x/2} } {2^{k/2} \Gamma(k/2)}  \text{  for x > 0. } \]

(For other values of \( x \), \( f(x) = 0 \)).
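Before diving into the proof, here's a quick numerical sanity check (my own, not part of the proof; all function names are mine) that the claimed pdf agrees with simulation: a midpoint Riemann sum of the pdf should match a Monte Carlo estimate of \( P(X_1^2 + \cdots + X_k^2 < y) \).

```python
import math
import random

def chi2_pdf(x, k):
    """The claimed chi-squared pdf with k degrees of freedom (0 for x <= 0)."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def chi2_cdf(y, k, steps=20_000):
    """P(Y < y) via a midpoint Riemann sum of the pdf over (0, y)."""
    h = y / steps
    return sum(chi2_pdf((i + 0.5) * h, k) * h for i in range(steps))

def simulate_cdf(y, k, n=100_000, seed=0):
    """Empirical P(X_1^2 + ... + X_k^2 < y) from standard normal draws."""
    rng = random.Random(seed)
    hits = sum(
        sum(rng.gauss(0, 1) ** 2 for _ in range(k)) < y
        for _ in range(n)
    )
    return hits / n
```

For example, with \(k = 3\) and \(y = 2\), both numbers come out near 0.43.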

First reaction when I saw it: holy shit.

Second reaction, which may be quite common: how do I prove it?

Fortunately, I found some ways, sort of. Spoiler alert: we're gonna prove it by induction.

Proof for k = 1

This one is easier and follows from studying the cdf of \( Y = X^2 \), \(X \sim N(0, 1) \).

$$\begin{align} P(Y < y) &= P(X^2 < y) \\   &= P(-\sqrt{y} < X < \sqrt{y})  \\    &=  2\int_{0}^{\sqrt{y}} \frac{1}{\sqrt{2 \pi }} e^{-x^2/2} dx  \end{align}$$

Substituting \( u = x^2 \), and noting that \( du = 2x \, dx \), or equivalently \( du = 2 \sqrt{u} \, dx \) (since \(u, x > 0 \)):

$$\begin{align}2\int_{0}^{\sqrt{y}} \frac{1}{\sqrt{2 \pi }} e^{-x^2/2} dx   &=  \int_{0}^y \frac{1}{\sqrt{2 \pi }} u^{-1/2} e^{-u/2} du \end{align}$$

Hence we can conclude that the pdf of \(Y = X^2\) is

\[ \frac{1}{\sqrt{2 \pi }} x^{-1/2} e^{-x/2} \].

Since \( \Gamma(1/2) = \sqrt{\pi} \), we can rewrite the pdf as

\[ \frac{x^{-1/2} e^{-x/2}}{2^{1/2} \Gamma(1/2) }  \]

as required. \( \square \)
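As a sanity check on the \(k = 1\) case (again my own, not part of the proof), we can integrate the derived pdf numerically and compare against \( P(-\sqrt{y} < X < \sqrt{y}) \), which for a standard normal equals \( \operatorname{erf}(\sqrt{y/2}) \):

```python
import math

def chi2_1_pdf(x):
    """The pdf derived above for k = 1: x^(-1/2) e^(-x/2) / sqrt(2*pi)."""
    return x ** (-0.5) * math.exp(-x / 2) / math.sqrt(2 * math.pi)

def cdf_from_pdf(y, steps=200_000):
    """P(Y < y) by a midpoint Riemann sum (the x^(-1/2) singularity at 0 is integrable)."""
    h = y / steps
    return sum(chi2_1_pdf((i + 0.5) * h) * h for i in range(steps))

def cdf_from_normal(y):
    """P(-sqrt(y) < X < sqrt(y)) for X ~ N(0, 1), via the error function."""
    return math.erf(math.sqrt(y / 2))
```

At \(y = 2\) both give \( \operatorname{erf}(1) \approx 0.8427 \).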

Proof for k > 1

This is the induction step. Suppose we have proven that the pdf of \(Y_k = X_1^2 + X_2^2 + \cdots + X_k^2 \) is
\[ \frac{x^{k/2 - 1} e^{-x/2} } {2^{k/2} \Gamma(k/2)}. \]
We want to appeal to the Induction God to show that the pdf of \( Y_{k+1} = X_1^2 + X_2^2 + \cdots + X_{k+1}^2 \) is
\[ \frac{x^{(k+1)/2 - 1} e^{-x/2} } {2^{(k+1)/2} \Gamma((k+1)/2)}. \]

Before we proceed, let me make 2 bold assertions here.
1. that I can rewrite \(Y_{k+1} = Y_k + X_{k+1}^2 \), with \(Y_k\) and \(X_{k+1}^2\) independent (\(Y_k\) depends only on \(X_1, \ldots, X_k\), which are independent of \(X_{k+1}\)). If this isn't allowed, then you can stop reading here, because whatever I write after this point is just bullshit.
2. that we can apply the convolution of pdfs to evaluate the pdf of the sum of two independent random variables, i.e. that \( f_{X+Y}(z) = \int_{-\infty}^{\infty} f_Y(z-x) f_X(x)  dx \).

Side track: WTF is convolution

About the second assertion regarding convolution: while it sounds like something sophisticated, it can actually be derived by an elementary argument as follows. Suppose \(X\) and \(Y\) are independent random variables and \( Z = X + Y \). Then
$$\begin{align} P(Z < z) &= P(X + Y < z)  \\   &=  \int_{-\infty}^{\infty} \int_{-\infty}^{z - x} f_Y(y) dy \, f_X(x) dx \end{align}$$.
The inner integral \( \int_{-\infty}^{z - x} f_Y(y) dy \) has \(x\) fixed, so treating \(x\) as a constant and substituting \( t = y + x \) (i.e. \( y = t - x \)), we get \( \int_{-\infty}^{z} f_Y(t - x) dt \). So:
$$\begin{align} P(Z < z)   &=  \int_{-\infty}^{\infty} \int_{-\infty}^{z} f_Y(t-x) f_X(x) dt dx  \\   &=   \int_{-\infty}^{z}\int_{-\infty}^{\infty} f_Y(t-x) f_X(x) dx dt   \end{align}$$.
This means the pdf of \(Z\) is given by \(f_Z(t) = \int_{-\infty}^{\infty} f_Y(t-x) f_X(x) dx \).
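The convolution formula can also be checked numerically. The sketch below (function names are mine) evaluates the convolution of the chi-squared pdfs with 1 and \(k\) degrees of freedom at a point, and compares it against the chi-squared pdf with \(k+1\) degrees of freedom, anticipating the result we're about to prove:

```python
import math

def chi2_pdf(x, k):
    """Chi-squared pdf with k degrees of freedom (0 for x <= 0)."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def convolve_at(z, k, steps=200_000):
    """(f_{chi2(1)} * f_{chi2(k)})(z) = int_0^z f_1(z - x) f_k(x) dx, midpoint rule."""
    h = z / steps
    return sum(
        chi2_pdf(z - (i + 0.5) * h, 1) * chi2_pdf((i + 0.5) * h, k) * h
        for i in range(steps)
    )
```

For instance, `convolve_at(2.0, 3)` should agree with `chi2_pdf(2.0, 4)` up to discretization error.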

Back to Induction argument

If you are convinced, then we may proceed as follows: let \(X = Y_k \) and  \( Y = X_{k+1}^2 \), then the pdf of \(Z = Y + X \) is given by:
\[\begin{align} f_Z(z)  &= \int_{-\infty}^{\infty} f_Y(z-x) f_X(x) dx   \\   &= \int_{-\infty}^{0} f_Y(z-x) f_X(x) dx  +  \int_{0}^{z} f_Y(z-x) f_X(x) dx +   \int_{z}^{\infty} f_Y(z-x) f_X(x) dx  \\  &=  0 + \int_{0}^{z} f_Y(z-x) f_X(x) dx + 0   \\  &= \int_{0}^{z} f_Y(z-x) f_X(x) dx     \end{align}\]
(The first and third integrals vanish because \(f_X(x) = 0\) for \(x < 0\) and \(f_Y(z-x) = 0\) for \(x > z\), since both \(X\) and \(Y\) are nonnegative random variables.)

\(Y\) follows the chi-squared distribution with 1 degree of freedom, while \(X\) follows the chi-squared distribution with \(k\) degrees of freedom. Hence we can substitute in the pdf for \(Y\) from the base case and the pdf for \(X\) from the induction hypothesis to get:
\[\begin{align} f_Z(z) &=   \int_{0}^{z} f_Y(z-x) f_X(x) dx  \\ &=  \int_{0}^{z} \frac{x^{k/2-1}e^{-x/2}}{2^{k/2}\Gamma(k/2)} \frac{(z-x)^{-1/2}e^{-(z-x)/2}}{2^{1/2}\Gamma(1/2)} dx     \\   &= \frac{e^{-z/2}}{2^{(k+1)/2}\Gamma(k/2)\sqrt{\pi}} \int_{0}^{z}  \frac{x^{k/2-1}}{\sqrt{z-x}} dx \end{align}\]
Substituting \(x = zu\) (so \(dx = z \, du\)) we get
\[\begin{align} f_Z(z)   &= \frac{e^{-z/2}}{2^{(k+1)/2}\Gamma(k/2)\sqrt{\pi}} \int_{0}^{1}  \frac{(zu)^{k/2-1}}{\sqrt{z}\sqrt{1-u}} z \, du   \\    &= \frac{z^{(k+1)/2 - 1}e^{-z/2}}{2^{(k+1)/2}\Gamma(k/2)\sqrt{\pi}} \int_{0}^{1}  \frac{u^{k/2-1}}{\sqrt{1-u}}\, du     \end{align}\]

Let's now focus on the integral
\[\int_{0}^{1}  \frac{u^{k/2-1}}{\sqrt{1-u}}\, du\]

Substituting \(u = \sin^2{\theta}\) (so \(du = 2\sin\theta\cos\theta \, d\theta\) and \(\sqrt{1-u} = \cos\theta\)) we have
\[\begin{align}  \int_{0}^{1}  \frac{u^{k/2-1}}{\sqrt{1-u}}\, du   &=  2\int_{0}^{\pi/2} \sin^{k-1}\theta\, d\theta  \\  &= 2I_{k-1} \end{align}\]

where \(I_k = \int_{0}^{\pi/2} \sin^k\theta \, d\theta \).

\(I_k\) reminds me a lot of some A-level integration exercise. By doing integration by parts on \(I_k\), writing \(\sin^k\theta = \sin\theta \cdot \sin^{k-1}\theta\) and setting \(u' = \sin\theta \) and \(v = \sin^{k-1} \theta\), you can come to the conclusion that
\[I_k = \frac{k-1}{k} I_{k-2}. \]

Then, noting that \(I_1 = 1\) and \(I_0 = \pi/2\), we make the following 2 observations:
1. If \(k\) is odd,
\[I_k = \frac{k-1}{k}\frac{k-3}{k-2}\cdots\frac{2}{3}.\]
2. If \(k\) is even,
\[I_k = \frac{k-1}{k}\frac{k-3}{k-2}\cdots\frac{1}{2}\frac{\pi}{2}.\]
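These closed forms are easy to check numerically against a direct midpoint-rule evaluation of the integral (a quick sketch of mine, not part of the proof):

```python
import math

def I(k, steps=100_000):
    """I_k = int_0^{pi/2} sin^k(theta) d(theta), by the midpoint rule."""
    h = (math.pi / 2) / steps
    return sum(math.sin((i + 0.5) * h) ** k * h for i in range(steps))

def I_closed(k):
    """Closed form from the reduction I_k = (k-1)/k * I_{k-2}, with I_0 = pi/2, I_1 = 1."""
    result = math.pi / 2 if k % 2 == 0 else 1.0
    while k >= 2:
        result *= (k - 1) / k
        k -= 2
    return result
```

For example, `I_closed(3)` gives \(2/3\) and `I_closed(4)` gives \(3\pi/16\), matching the direct integrals.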

So now we are interested in proving the following lemma, which will complete the proof of the pdf:
\[\frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+1}{2})} = I_{k-1}\frac{2}{\sqrt{\pi}}\]

Proof:
Using the fact that \(\Gamma(x+1) = x\Gamma(x)\), we have:
\[ \frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+1}{2})} = \frac{k-2}{k-1}\frac{\Gamma(\frac{k-2}{2})}{\Gamma(\frac{k-1}{2})} \]

Also as a reminder, \( \Gamma(1) = 1 \) and \(\Gamma(1/2) = \sqrt{\pi}\).

We have 2 cases:
1. If \(k\) is odd, we have
\[\begin{align} \frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+1}{2})} &= \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{1}{2} \frac{\Gamma(\frac{1}{2})}{\Gamma(\frac{2}{2})}   \\ &=   \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{1}{2} \sqrt{\pi}  \\  &=   \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{1}{2} \frac{\pi}{2} \frac{2}{\sqrt{\pi}}  \\ &=  I_{k-1} \frac{2}{\sqrt{\pi}} \end{align}\]
2. If \(k\) is even, we have
\[\begin{align} \frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+1}{2})} &= \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{2}{3} \frac{\Gamma(\frac{2}{2})}{\Gamma(\frac{3}{2})}   \\ &=   \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{2}{3} \frac{1}{\Gamma(\frac{1}{2})\frac{1}{2}}  \\  &=   \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{2}{3} \frac{2}{\sqrt{\pi}}  \\ &=  I_{k-1} \frac{2}{\sqrt{\pi}} \end{align}\]
as desired. \( \square\)
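The lemma is also easy to spot-check numerically (my own sketch) using `math.gamma` and the closed forms for \(I_k\) derived above:

```python
import math

def I_closed(k):
    """I_k from the reduction formula, with I_0 = pi/2 and I_1 = 1."""
    result = math.pi / 2 if k % 2 == 0 else 1.0
    while k >= 2:
        result *= (k - 1) / k
        k -= 2
    return result

def gamma_ratio(k):
    """Left-hand side of the lemma: Gamma(k/2) / Gamma((k+1)/2)."""
    return math.gamma(k / 2) / math.gamma((k + 1) / 2)

def lemma_rhs(k):
    """Right-hand side of the lemma: I_{k-1} * 2 / sqrt(pi)."""
    return I_closed(k - 1) * 2 / math.sqrt(math.pi)
```

Both sides agree for odd and even \(k\) alike (e.g. \(k = 1\) gives \(\sqrt{\pi}\) on both sides).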

So finally,
\[\begin{align} f_Z(z) &= \frac{z^{(k+1)/2-1}e^{-z/2}}{2^{(k+1)/2}\Gamma(k/2)\sqrt{\pi}}\, 2I_{k-1} = \frac{z^{(k+1)/2-1}e^{-z/2}}{2^{(k+1)/2}\Gamma((k+1)/2)} \end{align}\]
and we are done! phew! 
Having proven this, I can now use this pdf in peace.
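For extra peace of mind, one last numerical check (function names are mine): integrating the pdf against \(x^p\) confirms it has total mass 1, mean \(k\), and second moment \(2k + k^2\) (equivalently variance \(2k\)), the known chi-squared moments.

```python
import math

def chi2_pdf(x, k):
    """Chi-squared pdf with k degrees of freedom (0 for x <= 0)."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def moment(k, p, upper=200.0, steps=200_000):
    """int_0^upper x^p f(x) dx by midpoint rule; upper is large enough for the tail."""
    h = upper / steps
    return sum(
        ((i + 0.5) * h) ** p * chi2_pdf((i + 0.5) * h, k) * h
        for i in range(steps)
    )
```

For \(k = 4\): the zeroth moment is \(\approx 1\), the first \(\approx 4\), and the second \(\approx 24\).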

