
Continuous Distributions 3

3. $\chi^2$ distributions

3.1 Definition

Let $X$ be a continuous random variable.

By definition, $X$ is said to follow the $\chi^2$ distribution with $k$ degrees of freedom if

$$X\sim \Gamma\left(\frac{k}{2},\frac{1}{2}\right)=\chi^2_k$$
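As a quick numerical sanity check (not part of the definition), the sketch below compares the $\Gamma(\tfrac{k}{2},\tfrac{1}{2})$ density with the $\chi^2_k$ density, assuming NumPy and SciPy are available; note that SciPy's gamma distribution is parameterised by a *scale*, so a rate of $\tfrac{1}{2}$ corresponds to `scale=2`, and the choice $k=5$ is arbitrary.

```python
import numpy as np
from scipy import stats

k = 5
x = np.linspace(0.01, 20, 200)
# Gamma(k/2, rate 1/2) expressed with SciPy's scale convention (scale = 1/rate = 2)
gamma_pdf = stats.gamma.pdf(x, a=k / 2, scale=2)
# chi-squared density with k degrees of freedom
chi2_pdf = stats.chi2.pdf(x, df=k)
print(np.max(np.abs(gamma_pdf - chi2_pdf)))  # ~1e-16: the two densities coincide
```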

3.2 Significance

For $k\in\mathbb{N}^*$, the $\chi^2_k$ distribution is the distribution of the sum of squares of $k$ independent standard normal random variables.

The chi-squared distribution is a special case of the gamma distribution and is one of the most widely used probability distributions in inferential statistics, notably in hypothesis testing and in construction of confidence intervals.

The chi-squared distribution is used in the common chi-squared tests for goodness of fit of an observed distribution to a theoretical one, the independence of two criteria of classification of qualitative data, and in confidence interval estimation for a population standard deviation of a normal distribution from a sample standard deviation. Many other statistical tests also use this distribution, such as Friedman's analysis of variance by ranks.

3.3 Square of a standard normal random variable

Let $X\sim\mathcal{N}(0,1)$, and let $Y=X^2$.

$$\begin{align*} \forall x\in\mathbb{R}_+^*,\quad F_Y(x)&=\mathcal{P}(X^2<x)\\ &=\mathcal{P}(-\sqrt{x}<X<\sqrt{x})\\ &=\frac{1}{\sqrt{2\pi}}\int_{-\sqrt{x}}^{\sqrt{x}}e^{-\frac{t^2}{2}}\,\text{dt}\\ &=\frac{\sqrt{2}}{\sqrt{\pi}}\int_{0}^{\sqrt{x}}e^{-\frac{t^2}{2}}\,\text{dt}\\ \implies\forall x\in\mathbb{R}_+^*,\quad f_Y(x)&=F_Y'(x)\\ &=\frac{1}{2\sqrt{x}}\cdot\left(\frac{\sqrt{2}}{\sqrt{\pi}}e^{-\frac{(\sqrt x)^2}{2}}\right)\\ &=\frac{1}{\sqrt{2\pi x}}e^{-\frac{x}{2}} \end{align*}$$

$\forall x\leq0$, it is trivial that $F_Y(x)=0$, so consequently $\forall x\leq 0,\ f_Y(x)=0$.

Since $\Gamma(\tfrac{1}{2})=\sqrt{\pi}$, we can rewrite $\frac{1}{\sqrt{2\pi x}}e^{-\frac{x}{2}}=\frac{(\frac{1}{2})^{\frac{1}{2}}}{\Gamma(\frac{1}{2})}x^{\frac{1}{2}-1}e^{-\frac{x}{2}}$, which is the $\Gamma(\tfrac{1}{2},\tfrac{1}{2})$ density. So we can conclude that:

$$\boxed{X^2\sim\Gamma\left(\frac{1}{2},\frac{1}{2}\right)=\chi^2_1}$$
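A minimal simulation sketch of this result, assuming NumPy and SciPy (the sample size and seed are arbitrary): squaring standard normal draws and testing the squares against $\chi^2_1$ with a Kolmogorov–Smirnov test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.standard_normal(100_000) ** 2  # squares of standard normal draws
# KS test against the chi-squared distribution with 1 degree of freedom
print(stats.kstest(y, "chi2", args=(1,)))  # large p-value: consistent with chi^2_1
```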

3.4 Sum of squares of independent standard normal random variables

  • Let $n\in\mathbb{N}^*$
  • Let $X_1,\dots,X_n \sim \mathcal{N}(0,1)$ be independent standard normal random variables
$$\boxed{\sum_{i=1}^nX_i^2\sim \Gamma\left(\frac{n}{2},\frac{1}{2}\right)=\chi^2_n}$$

This follows immediately from the additivity of independent gamma distributions with the same rate parameter, applied to the $\chi^2_1$ variables $X_i^2$.
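The same kind of simulation sketch applies here, assuming NumPy and SciPy, with $n=4$ chosen arbitrarily: sum the squares row-wise and test against $\chi^2_n$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 4
# each row holds n independent standard normal draws; sum the squares per row
s = (rng.standard_normal((100_000, n)) ** 2).sum(axis=1)
print(stats.kstest(s, "chi2", args=(n,)))  # consistent with chi^2_n
```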

3.5 Sum of chi-square distributions

  • Let $n\in\mathbb{N}^*$
  • Let $d_1,\dots,d_n\in\mathbb{N}^*$, and let $r=\sum_{i=1}^nd_i$
  • Let $X_1\sim \chi^2_{d_1},\dots,X_n\sim \chi^2_{d_n}$ be independent:
$$\boxed{\sum_{i=1}^nX_i\sim \chi^2_{r}}$$

This also follows immediately from the additivity of independent gamma distributions with the same rate parameter.
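A short sketch of this additivity, assuming NumPy and SciPy, with the degrees of freedom $3$ and $4$ picked arbitrarily: the sum of independent $\chi^2_3$ and $\chi^2_4$ draws should follow $\chi^2_7$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# independent chi^2_3 and chi^2_4 samples, added componentwise
s = rng.chisquare(3, 100_000) + rng.chisquare(4, 100_000)
print(stats.kstest(s, "chi2", args=(7,)))  # consistent with chi^2_{3+4}
```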

3.6 Moments

Let $X\sim\chi_n^2$.

As a $\chi^2$-distribution is a $\Gamma$-distribution, the calculation of the moments is the same as for the gamma distribution.

We list here only the expected value and the variance.

3.6.1 Expected Value

$$\mathbb{E}[X]=\frac{\frac{n}{2}}{\frac{1}{2}}=n$$

3.6.2 Variance

$$\mathbb{V}[X]=\frac{\frac{n}{2}}{\frac{1}{2^2}}=2n$$
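A quick empirical check of these two moments, assuming NumPy; the degrees of freedom $n=6$ and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
x = rng.chisquare(n, 1_000_000)
# sample mean should be close to n, sample variance close to 2n
print(x.mean(), x.var())  # ~6 and ~12
```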

4. $\mathcal{F}$-distributions

4.1 Definition

  • Let $d_1,d_2 \in\mathbb{N}^*$
  • Let $X$ be a continuous random variable

By definition, we say that $X$ follows the $\mathcal{F}$ distribution with parameters $(d_1,d_2)$ if there exist $X_1\sim\chi^2_{d_1}$ and $X_2\sim \chi^2_{d_2}$ such that $X_1,X_2$ are independent and:

$$X=\frac{\tfrac{X_1}{d_1}}{\tfrac{X_2}{d_2}}$$

By definition:

$$\boxed{\forall d_1,d_2\in\mathbb{N}^*,\quad\forall X_1\sim\chi^2_{d_1},\forall X_2\sim\chi^2_{d_2} \text{ independent}:\quad \frac{\tfrac{X_1}{d_1}}{\tfrac{X_2}{d_2}}\sim\mathcal{F}(d_1,d_2)}$$
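A simulation sketch of this definition, assuming NumPy and SciPy; the parameters $d_1=5$, $d_2=8$ and the sample size are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
d1, d2 = 5, 8
# ratio of two independent chi-squared samples, each divided by its degrees of freedom
x = (rng.chisquare(d1, 100_000) / d1) / (rng.chisquare(d2, 100_000) / d2)
print(stats.kstest(x, "f", args=(d1, d2)))  # consistent with F(d1, d2)
```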

4.2 Significance

The $\mathcal{F}$-distribution arises frequently as the null distribution of a test statistic, most notably in the analysis of variance (ANOVA) and other $F$-tests.

A random variate of the $F$-distribution with parameters $d_1$ and $d_2$ arises as the ratio of two appropriately scaled chi-squared variates with respective degrees of freedom $d_1$ and $d_2$.

4.3 Probability Density Function

Let $d_1,d_2\in\mathbb{N}^*$.

We have $\chi_{d_1}^2,\chi_{d_2}^2> 0$, so we can use the ratio density formula $f_{U/V}(x)=\int_{\mathbb{R}_+^*}t\,f_U(xt)f_V(t)\,\text{dt}$ for independent positive random variables $U,V$:

$$\begin{align*} \forall x \in \mathbb{R}_+^*,\quad f_{\mathcal{F}(d_1,d_2)}(x)&=\int_{\mathbb{R}_+^*}t\,f_{\chi^2_{d_1}/d_1}(xt)\,f_{\chi^2_{d_2}/d_2}(t)\,\text{dt}\\ &=\int_{\mathbb{R}_+^*}d_1d_2\,t\,f_{\chi^2_{d_1}}(xd_1t)\,f_{\chi^2_{d_2}}(d_2t)\,\text{dt}\\ &=d_1d_2\int_{\mathbb{R}_+^*}t\,f_{\chi^2_{d_1}}(xd_1t)\,f_{\chi^2_{d_2}}(d_2t)\,\text{dt}\\ &=d_1d_2\int_{\mathbb{R}_+^*}t\,\frac{d_1^{\tfrac{d_1}{2}-1}t^{\tfrac{d_1}{2}-1}x^{\tfrac{d_1}{2}-1}e^{-\tfrac{d_1x}{2}t}\,d_2^{\tfrac{d_2}{2}-1}t^{\tfrac{d_2}{2}-1}e^{-\tfrac{d_2}{2}t}}{2^{\tfrac{d_1+d_2}{2}}\Gamma(\tfrac{d_1}{2})\Gamma(\tfrac{d_2}{2})}\,\text{dt}\\ &=\frac{d_1^{\tfrac{d_1}{2}}d_2^{\tfrac{d_2}{2}}x^{\tfrac{d_1}{2}-1}}{2^{\tfrac{d_1+d_2}{2}}\Gamma(\tfrac{d_1}{2})\Gamma(\tfrac{d_2}{2})}\int_{\mathbb{R}_+^*}t^{\tfrac{d_1+d_2}{2}-1}e^{-\tfrac{d_1x+d_2}{2}t}\,\text{dt}\\ &=\frac{d_1^{\tfrac{d_1}{2}}d_2^{\tfrac{d_2}{2}}x^{\tfrac{d_1}{2}-1}}{2^{\tfrac{d_1+d_2}{2}}\Gamma(\tfrac{d_1}{2})\Gamma(\tfrac{d_2}{2})}\int_{\mathbb{R}_+^*}\left(\frac{2}{d_1x+d_2}\right)^{\tfrac{d_1+d_2}{2}}u^{\tfrac{d_1+d_2}{2}-1}e^{-u}\,\text{du} \quad\text{with }u=\frac{d_1x+d_2}{2}t\\ &=\frac{d_1^{\tfrac{d_1}{2}}d_2^{\tfrac{d_2}{2}}x^{\tfrac{d_1}{2}-1}}{\left(d_1x+d_2\right)^{\tfrac{d_1+d_2}{2}}\Gamma(\tfrac{d_1}{2})\Gamma(\tfrac{d_2}{2})}\int_{\mathbb{R}_+^*}u^{\tfrac{d_1+d_2}{2}-1}e^{-u}\,\text{du}\\ &=\frac{d_1^{\tfrac{d_1}{2}}d_2^{\tfrac{d_2}{2}}x^{\tfrac{d_1}{2}-1}}{\left(d_1x+d_2\right)^{\tfrac{d_1+d_2}{2}}\Gamma(\tfrac{d_1}{2})\Gamma(\tfrac{d_2}{2})}\Gamma\left(\frac{d_1+d_2}{2}\right)\\ &=\frac{d_1^{\tfrac{d_1}{2}}d_2^{\tfrac{d_2}{2}}x^{\tfrac{d_1}{2}-1}}{\left(d_1x+d_2\right)^{\tfrac{d_1+d_2}{2}}\,\mathrm{B}(\tfrac{d_1}{2},\tfrac{d_2}{2})}\\ &=\frac{1}{x\,\mathrm{B}(\tfrac{d_1}{2},\tfrac{d_2}{2})}\sqrt{\frac{(d_1x)^{d_1}d_2^{d_2}}{(d_1x+d_2)^{d_1+d_2}}} \end{align*}$$
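To check the final closed form numerically, the sketch below evaluates it directly and compares it with SciPy's F density, assuming NumPy and SciPy; the values $d_1=5$, $d_2=8$ and the evaluation grid are arbitrary.

```python
import numpy as np
from scipy import stats
from scipy.special import beta  # the Beta function B(a, b)

d1, d2 = 5, 8
x = np.linspace(0.05, 10, 200)
# f(x) = sqrt((d1 x)^d1 d2^d2 / (d1 x + d2)^(d1+d2)) / (x B(d1/2, d2/2))
derived = np.sqrt((d1 * x) ** d1 * d2 ** d2 / (d1 * x + d2) ** (d1 + d2)) / (x * beta(d1 / 2, d2 / 2))
print(np.max(np.abs(derived - stats.f.pdf(x, d1, d2))))  # ~1e-16
```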

5. Student's $t$-distribution

5.1 Definition

In probability and statistics, Student's tt-distribution (or simply the tt-distribution) is any member of a family of continuous probability distributions that arise when estimating the mean of a normally distributed population in situations where the sample size is small and the population's standard deviation is unknown. It was developed by English statistician William Sealy Gosset under the pseudonym "Student".

  • Let $\nu \in\mathbb{N}^*$
  • Let $X$ be a continuous random variable

By definition, we say that $X$ follows the $t$ distribution with $\nu$ degrees of freedom if there exist $P\sim\mathcal{N}(0,1)$ and $S\sim \chi^2_{\nu}$ such that $P,S$ are independent and:

$$X=\frac{P}{\sqrt{\tfrac{S}{\nu}}}$$

By definition:

$$\boxed{\forall \nu\in\mathbb{N}^*,\quad\forall P\sim\mathcal{N}(0,1),\forall S\sim\chi^2_{\nu} \text{ independent}:\quad \frac{P}{\sqrt{\tfrac{S}{\nu}}}\sim\mathcal{T}(\nu)}$$
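A simulation sketch of this definition, assuming NumPy and SciPy; $\nu=7$, the seed, and the sample size are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
nu = 7
# standard normal draws divided by sqrt of independent chi^2_nu draws over nu
x = rng.standard_normal(100_000) / np.sqrt(rng.chisquare(nu, 100_000) / nu)
print(stats.kstest(x, "t", args=(nu,)))  # consistent with t(nu)
```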

5.2 Significance

The tt-distribution plays a role in a number of widely used statistical analyses, including Student's t-test for assessing the statistical significance of the difference between two sample means, the construction of confidence intervals for the difference between two population means, and in linear regression analysis. Student's t-distribution also arises in the Bayesian analysis of data from a normal family.

If we take a sample of nn observations from a normal distribution, then the tt-distribution with ν=n1\nu=n-1 degrees of freedom can be defined as the distribution of the location of the sample mean relative to the true mean, divided by the sample standard deviation, after multiplying by the standardizing term n\sqrt{n}. In this way, the t-distribution can be used to construct a confidence interval for the true mean.
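The paragraph above corresponds to the statistic $\sqrt{n}\,(\bar{X}-\mu)/s \sim t_{n-1}$, where $s$ is the sample standard deviation. Here is a hedged simulation sketch of that statement, assuming NumPy and SciPy; the values of $\mu$, $\sigma$, $n$, and the number of replications are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
mu, sigma, n = 3.0, 2.0, 10
samples = rng.normal(mu, sigma, size=(50_000, n))  # 50_000 samples of size n
# sqrt(n) * (sample mean - true mean) / sample std (ddof=1 for the unbiased estimator)
t_stat = np.sqrt(n) * (samples.mean(axis=1) - mu) / samples.std(axis=1, ddof=1)
print(stats.kstest(t_stat, "t", args=(n - 1,)))  # consistent with t(n-1)
```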

The tt-distribution is symmetric and bell-shaped, like the normal distribution. However, the tt-distribution has heavier tails, meaning that it is more prone to producing values that fall far from its mean. This makes it useful for understanding the statistical behavior of certain types of ratios of random quantities, in which variation in the denominator is amplified and may produce outlying values when the denominator of the ratio falls close to zero. The Student's tt-distribution is a special case of the generalised hyperbolic distribution.

5.3 Probability Density Function

$$\forall x\in\mathbb{R},\quad f_{\mathcal{T}(\nu)}(x)=\frac{\Gamma(\tfrac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\tfrac{\nu}{2})}\left(1+\frac{x^2}{\nu}\right)^{-\tfrac{\nu+1}{2}}$$
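As with the F density, this closed form can be checked numerically against SciPy, assuming NumPy and SciPy; $\nu=7$ and the evaluation grid are arbitrary, and the gamma factors are computed in log space for stability.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln  # log of the Gamma function

nu = 7
x = np.linspace(-6, 6, 200)
# log of the constant Gamma((nu+1)/2) / (sqrt(nu*pi) * Gamma(nu/2))
log_c = gammaln((nu + 1) / 2) - gammaln(nu / 2) - 0.5 * np.log(nu * np.pi)
derived = np.exp(log_c) * (1 + x ** 2 / nu) ** (-(nu + 1) / 2)
print(np.max(np.abs(derived - stats.t.pdf(x, nu))))  # ~1e-16
```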