Here I'll provide a (potentially wrong) proof for the pdf of chi-squared Distribution. The proof will be elementary, i.e. it doesn't use sophisticated mechanisms or deep results unless truly necessary. Hence the proof might be rather long.
Also, I don't use standard notations, so it might be painful to some eyes. Additionally, notations might not even be consistent, so it might also cause motion sickness.
Definition of Chi Squared Distribution:
Given \( X_1, X_2, \cdots, X_k \) drawn independently from standard normal distribution \( N(0, 1) \), then the random variable \( Y = X_1^2 + X_2^2 + \cdots + X_k^2 \) follows the chi-squared distribution with k degrees of freedom.
This distribution naturally arises when we want to study the distribution of the sample variance. The distribution itself is going to be useful later on for defining t-distribution... Ok, I'm not good at explaining what motivates the use of something, but I guess the Wikipedia page does a better job.
From the Wikipedia page, adapted here, it is claimed that the pdf of chi-squared distribution with k degrees of freedom is
\[ f(x) = \frac{x^{k/2 - 1} e^{-x/2} } {2^{k/2} \Gamma(k/2)} \text{ for x > 0. } \]
(For other values of \( x \), \( f(x) = 0 \)).
First reaction when I saw it: holyshit.
Second reaction, which maybe quite common: how do I prove it?
Fortunately, I found some ways, sort of. Spoiler alert: we gonna prove by induction.
Proof for k = 1
This one is easier and follows from studying the cdf of \( Y = X^2 \), \(X \sim N(0, 1) \).
$$\begin{align} P(Y < y) &= P(X^2 < y) \\ &= P(\sqrt{y} < X < \sqrt{y}) \\ &= 2\int_{0}^{\sqrt{y}} \frac{1}{\sqrt{2 \pi }} e^{-x^2/2} dx \end{align}$$
Substitute \( u = x^2 \), and noting that \( du = 2x dx \) or equivalently \( du = 2 \sqrt{u} dx \) (since \(u, x > 0 \) ):
$$\begin{align}2\int_{0}^{\sqrt{y}} \frac{1}{\sqrt{2 \pi }} e^{-x^2/2} dx &= \int_{0}^y \frac{1}{\sqrt{2 \pi }} u^{-1/2} e^{-u/2} du \end{align}$$
Hence we can conclude that the pdf of \(Y = X^2\) is
\[ \frac{1}{\sqrt{2 \pi }} x^{-1/2} e^{-x/2} \].
Since \( \Gamma(1/2) = \sqrt{\pi} \), we can rewrite the pdf as
\[ \frac{x^{-1/2} e^{-x/2}}{2^{1/2} \Gamma(1/2) } \]
as required. \( \square \)
Proof for k > 1
This is the induction step. Suppose up to this point we have proven up to \(k \) that the pdf of \(Y_k = X_1^2 + X_2^2 + \cdots + X_k^2 \) is
\[ \frac{x^{k/2 - 1} e^{-x/2} } {2^{k/2} \Gamma(k/2)} \].
We want to appeal to the Induction God that indeed it the pdf for \( Y_{k+1} = X_1^2 + X_2^2 + \cdots + X_{k+1}^2 \) is
\[ \frac{x^{(k+1)/2 - 1} e^{-x/2} } {2^{(k+1)/2} \Gamma((k+1)/2)} \].
Before we proceed, let me make 2 bold assertions here.
1. that I can rewrite \(Y_{k+1} = Y_k + X_{k+1}^2 \). I'm not too sure if this is allowed, but if this isn't, then you can stop reading here because whatever I write after this point is then just bullshit.
2. that we can apply the convolution of pdfs to evaluate the pdf of the sum of random variables, i.e. that \( f_{X+Y}(z) = \int_{-\infty}^{\infty} f_Y(z-x) f_X(x) dx \).
Side track: WTF is convolution
About the second assertion regarding convolution, while it sounds like something sophisticated, it actually can be derived using elementary argument as follows. Suppose \( Z = X + Y \) are random variables, and that \(X\) and \(Y\) are independent. Then
$$\begin{align} P(Z < z) &= P(X + Y < z) \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{z - x} f_Y(y) dy \, f_X(x) dx \end{align}$$.
The inner integral \( \int_{-\infty}^{z - x} f_Y(y) dy \) has \(x\) fixed, so if we treat it like a constant and substitute \( y = t - x \), we get \( \int_{-\infty}^{z} f_Y(t - x) dt \). So:
$$\begin{align} P(Z < z) &= \int_{-\infty}^{\infty} \int_{-\infty}^{z} f_Y(t-x) f_X(x) dt dx \\ &= \int_{-\infty}^{z}\int_{-\infty}^{\infty} f_Y(t-x) f_X(x) dx dt \end{align}$$.
Which means the pdf of \(Z\) is given by \(f_Z(t) = \int_{-\infty}^{\infty} f_Y(t-x) f_X(x) dx \).
Back to Induction argument
If you are convinced, then we may proceed as follows: let \(X = Y_k \) and \( Y = X_{k+1}^2 \), then the pdf of \(Z = Y + X \) is given by:
\[\begin{align} f_Z(z) &= \int_{-\infty}^{\infty} f_Y(z-x) f_X(x) dx \\ &= \int_{-\infty}^{0} f_Y(z-x) f_X(x) dx + \int_{0}^{z} f_Y(z-x) f_X(x) dx + \int_{z}^{\infty} f_Y(z-x) f_X(x) dx \\ &= 0 + \int_{0}^{z} f_Y(z-x) f_X(x) dx + 0 \\ &= \int_{0}^{z} f_Y(z-x) f_X(x) dx \end{align}\]
\(Y\) is the chi-squared distribution with 1 degree of freedom, while \(X\) is the chi-squared distribution with k degrees of freedom. Hence we can substitute in the pdfs by the induction hypothesis above to get:
\[\begin{align} f_Z(z) &= \int_{0}^{z} f_Y(z-x) f_X(x) dx \\ &= \int_{0}^{z} \frac{x^{k/2-1}e^{-x/2}}{2^{k/2}\Gamma(k/2)} \frac{(z-x)^{1/2}e^{-(z-x)/2}}{2^{1/2}\Gamma(1/2)} dx \\ &= \frac{e^{-z/2}}{2^{(k+1)/2}\Gamma(k/2)\sqrt{\pi}} \int_{0}^{z} \frac{x^{k/2-1}}{\sqrt{z-x}} dx \end{align}\]
Substitute \(x = zu\) we get
\[\begin{align} f_Z(z) &= \frac{e^{-z/2}}{2^{(k+1)/2}\Gamma(k/2)\sqrt{\pi}} \int_{0}^{1} \frac{(zu)^{k/2-1}}{\sqrt{z}\sqrt{1-u}} z \, du \\ &= \frac{z^{(k+1)/2}e^{-z/2}}{2^{(k+1)/2}\Gamma(k/2)\sqrt{\pi}} \int_{0}^{1} \frac{u^{k/2-1}}{\sqrt{1-u}}\, du \end{align}\]
Let's now focus on the integral
\[\int_{0}^{1} \frac{u^{k/2-1}}{\sqrt{1-u}}\, du\]
Substitute \(u = \sin^2{\theta}\) we have
\[\begin{align} \int_{0}^{1} \frac{u^{k/2-1}}{\sqrt{1-u}}\, du &= 2\int_{0}^{\pi/2} \sin^{k-1}\, d\theta \\ &= 2I_{k-1} \end{align}\]
where \(I_k = \int_{0}^{\pi/2} \sin^k \, d\theta \).
\(I_k\) is reminds me a lot of some A-level integration exercise. By doing integration by parts on \(I_k\) where we set \(u' = \sin\theta \) and \(v = \sin^{k-1} \theta\), you probably can also come to the conclusion that
\[I_k = \frac{k-1}{k} I_{k-2} \].
Then we make the following 2 observations:
1. If \(k\) is odd,
\[I_k = \frac{k-1}{k}\frac{k-3}{k-2}\cdots\frac{2}{3}\].
2. If \(k\) is even,
\[I_k = \frac{k-1}{k}\frac{k-3}{k-2}\cdots\frac{1}{2}\frac{\pi}{2}\].
So now we are interested to prove the following lemma, which will complete the proof for the pdf:
\[\frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+1}{2})} = I_{k-1}\frac{2}{\sqrt{\pi}}\]
Proof:
Using the fact that \(\Gamma(x+1) = x\Gamma(x)\), we have:
\[ \frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+1}{2})} = \frac{k-2}{k-1}\frac{\Gamma(\frac{k-2}{2})}{\Gamma(\frac{k-1}{2})} \]
Also as a reminder, \( \Gamma(1) = 1 \) and \(\Gamma(1/2) = \sqrt(\pi)\).
We have 2 cases:
1. If \(k\) is odd, we have
\[\begin{align} \frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+1}{2})} &= \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{1}{2} \frac{\Gamma(\frac{1}{2})}{\Gamma(\frac{2}{2})} \\ &= \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{1}{2} \sqrt{\pi} \\ &= \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{1}{2} \frac{\pi}{2} \frac{2}{\sqrt{\pi}} \\ &= I_{k-1} \frac{2}{\sqrt{\pi}} \end{align}\]
2. If \(k\) is even, we have
\[\begin{align} \frac{\Gamma(\frac{k}{2})}{\Gamma(\frac{k+1}{2})} &= \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{2}{3} \frac{\Gamma(\frac{2}{2})}{\Gamma(\frac{3}{2})} \\ &= \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{2}{3} \frac{1}{\Gamma(\frac{1}{2})\frac{1}{2}} \\ &= \frac{k-2}{k-1} \frac{k-4}{k-3} \cdots \frac{2}{3} \frac{2}{\sqrt{\pi}} \\ &= I_{k-1} \frac{2}{\sqrt{\pi}} \end{align}\]
as desired. \( \square\)
So finally,
\[\begin{align} f_Z(z) &= \frac{e^{-z/2}}{2^{(k+1)/2}\Gamma(k/2)\sqrt{\pi}} 2I_{k-1} = \frac{e^{-z/2}}{2^{(k+1)/2}\Gamma((k+1)/2)} \end{align}\]
and we are done! phew!
Having proven this, I can now use this pdf in peace.
Comments
Post a Comment