

3 RESULTS IN CLOSED FORM

Well Known Result 3.1   Considered from the $ \Phi $ point of view, $ m$ and $ m_{2}$ are random variables and we have :

$\displaystyle E_{\Phi }\left(m\right)=\mu ,\quad var_{\Phi }\left(m\right)=\frac{1}{n}\,\mu_{2}$ (3)
$\displaystyle E_{\Phi }\left(m_{2}\right)=\mu_{2},\quad var_{\Phi }\left(m_{2}\right)=\frac{1}{n}\,\left(\mu_{4}-\mu_{2}^{2}+\frac{2}{n-1}\,\mu_{2}^{2}\right)$ (4)

Remark   Formula (4) is attributed to [fisher:moments29] by [weatherburn:course], and to [student:error-mean] by [fisher:moments29] himself. Many proofs can be given, among them Algorithm 5.2.
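Formulas (3) and (4) can be checked empirically. The following is a minimal sketch, not the author's method: it assumes a uniform population on $ \left[-a,\,+a\right]$ (so that $ \mu=0$, $ \mu_{2}=a^{2}/3$, $ \mu_{4}=a^{4}/5$), takes $ m_{2}$ with the $ n-1$ denominator, and uses hypothetical helper names (`formula_var_m2`, `simulate`) and an arbitrary replication count.

```python
import random


def formula_var_m2(n, mu2, mu4):
    """Right-hand side of (4): variance of the sample variance m2."""
    return (mu4 - mu2 ** 2 + 2.0 * mu2 ** 2 / (n - 1)) / n


def mean_and_variance(values):
    """Empirical mean and (unbiased) variance of a list of numbers."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, var


def simulate(n, draw, replications, seed=0):
    """Draw many samples of size n and collect the moments of m and m2."""
    rng = random.Random(seed)
    ms, m2s = [], []
    for _ in range(replications):
        xs = [draw(rng) for _ in range(n)]
        m = sum(xs) / n
        m2 = sum((x - m) ** 2 for x in xs) / (n - 1)  # n-1 denominator
        ms.append(m)
        m2s.append(m2)
    return mean_and_variance(ms), mean_and_variance(m2s)


# Assumed test case: uniform population on [-a, a], sample size n = 3.
a, n = 1.0, 3
mu2, mu4 = a ** 2 / 3, a ** 4 / 5
(mean_m, var_m), (mean_m2, var_m2) = simulate(
    n, lambda rng: rng.uniform(-a, a), replications=50_000
)

print("E(m)    simulated", mean_m, " expected", 0.0)
print("var(m)  simulated", var_m, " expected", mu2 / n)
print("E(m2)   simulated", mean_m2, " expected", mu2)
print("var(m2) simulated", var_m2, " expected", formula_var_m2(n, mu2, mu4))
```

The simulated values only approach the closed forms up to Monte Carlo noise, so the comparison should be read with a tolerance rather than as an equality.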

1 Closed Forms when $ n<4$

Theorem 3.2   Let $ \varphi $ be the p.d.f. of $ \xi\in\Omega $. Then, for $ n=2,\,3$, we have the following closed forms for the p.d.f. of $ m_{2}$ :
$\displaystyle pd\! f_{2}\left(m_{2}\right)$ $\displaystyle =$ $\displaystyle \sqrt{\frac{2}{m_{2}}}\,\int_{\mathbb{R}}\varphi \left(t\right)\,\varphi \left(t+\sqrt{2\,m_{2}}\right)\,\mathrm{d}t$  
$\displaystyle pd\! f_{3}\left(m_{2}\right)$ $\displaystyle =$ $\displaystyle \int_{t=0}^{t=s}\frac{4\sqrt{3}}{\sqrt{m_{2}-t^{2}}}\int_{\mathbb{R}}\varphi \left(u-t\right)\varphi \left(u+t\right)\varphi \left(u+\sqrt{3m_{2}-3t^{2}}\right)\,\mathrm{d}u\,\mathrm{d}t$  

Proof. Concerning $ n=2$, start from $ 1=\iint\varphi \left(x\right)\varphi \left(y\right)\,\mathrm{d}x\,\mathrm{d}y$ and use $ t=x,\,m_{2}=\left(x^{2}-2xy+y^{2}\right)/2$ whose Jacobian is $ J=y-x$. Choose the branch $ y=t+\sqrt{2\,m_{2}}$, so that $ J=\sqrt{2\,m_{2}}$, and compute $ \iint\varphi \left(t\right)\varphi \left(t+\sqrt{2\,m_{2}}\right)/J\:\mathrm{d}t\,\mathrm{d}m_{2}$. Since both branches contribute equally for a given $ m_{2}$, $ \iint=1/2$ and $ pd\! f_{2}$ follows. Concerning $ n=3$, start from $ 1=\iiint\varphi \left(x\right)\varphi \left(y\right)\varphi \left(z\right)\,\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z$ and use $ t=(x-y)/2,\, u=\left(x+y\right)/2,\,m_{2}=\left(x^{2}-xy+y^{2}-xz+z^{2}-yz\right)/3$ whose Jacobian is $ J=\left(2z-x-y\right)/6$ (the factor $ \left(2z-x-y\right)/3$ from $ \partial m_{2}/\partial z$ times the factor $ 1/2$ from $ \left(x,\,y\right)\mapsto\left(t,\,u\right)$). Choose the branch $ z=u+\sqrt{3\,m_{2}-3t^{2}}$, so that $ J=\sqrt{m_{2}-t^{2}}/\sqrt{3}$, and compute $ \iiint\varphi \left(u-t\right)\varphi \left(u+t\right)\varphi \left(u+\sqrt{3\,m_{2}-3t^{2}}\right)/J\:\mathrm{d}u\,\mathrm{d}t\,\mathrm{d}m_{2}$. Here again, a factor $ 2$ appears to take both branches into account, and an extra factor $ 2$ appears when using symmetry to restrict the integration domain to $ x\geq y$, i.e. to $ t\geq0$. $ \qedsymbol$

Remark   It can be checked that, applied to a normal variable, Theorem 3.2 leads to a $ \chi^{2}$ distribution (special cases of Well Known Result 3.4).
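This check can be sketched numerically for a standard normal $ \varphi $ (so $ \mu_{2}=1$). The code below is an illustration under stated assumptions, not the author's algorithm: it uses a plain midpoint rule, truncates $ \mathbb{R}$ to $ \left[-10,\,10\right]$, and, as a convenience of this sketch, substitutes $ t=s\,\sin\theta $ with $ s=\sqrt{m_{2}}$ so that the factor $ 1/\sqrt{m_{2}-t^{2}}$ disappears from the outer integral. The reference densities follow from $ \left(n-1\right)m_{2}\sim\chi_{n-1}^{2}$. All function names are hypothetical.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def midpoint(f, lo, hi, steps):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def pdf2(m2, density=phi, lim=10.0, steps=2000):
    """Theorem 3.2, n = 2 (the real line is truncated to [-lim, lim])."""
    c = math.sqrt(2.0 * m2)
    integral = midpoint(lambda t: density(t) * density(t + c), -lim, lim, steps)
    return math.sqrt(2.0 / m2) * integral


def pdf3(m2, density=phi, lim=10.0, steps=400, angle_steps=50):
    """Theorem 3.2, n = 3, after the substitution t = s*sin(theta)."""
    s = math.sqrt(m2)

    def inner(theta):
        t = s * math.sin(theta)
        w = math.sqrt(3.0) * s * math.cos(theta)  # w = sqrt(3*m2 - 3*t^2)
        return midpoint(
            lambda u: density(u - t) * density(u + t) * density(u + w),
            -lim, lim, steps,
        )

    # dt / sqrt(m2 - t^2) becomes d(theta), with theta in [0, pi/2].
    return 4.0 * math.sqrt(3.0) * midpoint(inner, 0.0, math.pi / 2.0, angle_steps)


def ref_n2(m2):
    """Density of m2 when m2 ~ chi-square with 1 degree of freedom."""
    return math.exp(-0.5 * m2) / math.sqrt(2.0 * math.pi * m2)


def ref_n3(m2):
    """Density of m2 when 2*m2 ~ chi-square with 2 degrees of freedom."""
    return math.exp(-m2)


for value in (0.3, 1.0, 2.5):
    print(value, pdf2(value), ref_n2(value), pdf3(value), ref_n3(value))
```

Passing another `density` gives the same closed forms for a non-normal population, provided the truncation bound `lim` covers its support well enough.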

Theorem 3.3   Let $ \xi\in\left[-a,\,+a\right]$ be a uniform (continuous) random variable and $ n=3$ (the sample size). Then $ s^{2}=m_{2}\in\left[0,\,4a^{2}/3\right]$ with the following $ pd\! f$ :

\begin{displaymath}\begin{cases}pd\! f\left(m_{2}\right)=\frac{3\sqrt{3}}{a^{2}}...
...{s}{2\, a}+\sqrt{\frac{s^{2}}{a^{2}}-1}\right) & a<s\end{cases}\end{displaymath} (5)

Proof. When integrating over $ u\in\mathbb{R}$ in Theorem 3.2, the product of the three factors vanishes unless $ u_{1}\leq u\leq u_{2}$, where
$\displaystyle u_{1}$ $\displaystyle =$ $\displaystyle \max\left(-a-t,\,-a+t,\,-a-W\right)=t-a$  
$\displaystyle u_{2}$ $\displaystyle =$ $\displaystyle \min\left(a-t,\, a+t,\, a-W\right)=\min\left(a-t,\, a-W\right)$  

and $ \sqrt{3m_{2}-3t^{2}}$ has been shortened into $ W$. In Figure 1, the discussion is drawn in the $ \left(m_{2},\, t\right)$ plane. Zone A is characterized by $ u_{2}=a-W$ and zone B by $ u_{2}=a-t$, separated by the line $ t=W$ i.e. $ m_{2}=4\, t^{2}/3$. In order to enforce condition $ u_{1}\leq u_{2}$, zone B is bounded by $ t=a$ and zone A by $ m_{2}=a^{2}+(2\, t-a)^{2}/3.$

The inner integral evaluates to $ 3\left(2a-t-W\right)/\left(2W\, a^{3}\right)$ when $ \left(m_{2},\, t\right)\in A$, to $ 3\left(a-t\right)/\left(W\, a^{3}\right)$ when $ \left(m_{2},\, t\right)\in B$ and to 0 otherwise. Therefore, the outer integral has to be split into $ t\in\left[0,\,s\,\sqrt{3}/2\right]$ and $ t\in\left[s\,\sqrt{3}/2,\,s\right]$ when $ s<a$ (the left dotted line) and split into $ t\in\left[0,\, t_{1}\right]$, $ t\in\left[t_{2},\,s\,\sqrt{3}/2\right]$ and $ t\in\left[s\,\sqrt{3}/2,\, a\right]$ when $ s>a$ (the right dotted line), where $ t_{1}\leq t_{2}$ are the two roots in $ t$ of $ m_{2}=a^{2}+(2\, t-a)^{2}/3$. The rest of the computation is straightforward. It can be checked that (5) leads to $ E\left(1\right)=1$, $ E_{\Phi }\left(m_{2}\right)=\mu_{2}=a^{2}/3$ and $ var_{\Phi }\left(m_{2}\right)=a^{4}/15$ as given by (4). $ \qedsymbol$
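The three checks at the end of the proof can be sketched numerically without the closed form (5), using only the bounds $ u_{1}$, $ u_{2}$ and Theorem 3.2. The sketch below makes several assumptions of its own: it uses a midpoint rule, integrates over $ s=\sqrt{m_{2}}$ instead of $ m_{2}$, and substitutes $ t=s\,\sin\theta $, under which $ W=\sqrt{3}\,s\,\cos\theta $ and the density becomes $ \frac{\sqrt{3}}{2a^{3}}\int_{0}^{\pi/2}\max\left(0,\,u_{2}-u_{1}\right)\mathrm{d}\theta $. The function names are hypothetical.

```python
import math


def pdf_uniform_n3(m2, a=1.0, angle_steps=400):
    """Density of m2 for n = 3 and a uniform population on [-a, a].

    Evaluates the double integral of Theorem 3.2 with t = s*sin(theta);
    the inner integral over u is the length of [u1, u2] divided by 8*a^3.
    """
    s = math.sqrt(m2)
    h = (math.pi / 2.0) / angle_steps
    total = 0.0
    for i in range(angle_steps):
        theta = (i + 0.5) * h
        t = s * math.sin(theta)
        w = math.sqrt(3.0) * s * math.cos(theta)  # W in the text
        u1 = t - a
        u2 = min(a - t, a - w)
        total += max(0.0, u2 - u1)  # empty interval outside zones A and B
    return math.sqrt(3.0) / (2.0 * a ** 3) * total * h


def moments(a=1.0, steps=400):
    """Total mass, mean and variance of m2, integrating over s = sqrt(m2)."""
    s_max = 2.0 * a / math.sqrt(3.0)  # m2 ranges over [0, 4*a^2/3]
    h = s_max / steps
    mass = first = second = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        m2 = s * s
        weight = pdf_uniform_n3(m2, a) * 2.0 * s * h  # dm2 = 2*s*ds
        mass += weight
        first += weight * m2
        second += weight * m2 * m2
    return mass, first, second - first ** 2


a = 1.0
mass, mean, variance = moments(a)
print("total mass", mass, " expected", 1.0)
print("mean      ", mean, " expected", a ** 2 / 3)
print("variance  ", variance, " expected", a ** 4 / 15)
```

The agreement is limited by the quadrature step, so the three values should match their targets only to a few decimal places.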

Figure 1: Graphical discussion of Theorem 3.3

Remark   The fact that $ pd\! f\left(m_{2}\right)$ has such a complicated form, even for $ n=3$ and such a simple $ \varphi $, is another indication of the complexity of the problem to be solved.

2 Normal Distribution

Well Known Result 3.4 ([lukas:charac42])   Random variates $ m$ and $ m_{2}$ are independent if and only if the sampled population $ \Omega $ is normal. In such a case, $ \left(n-1\right)m_{2}/\mu_{2}$ is $ \chi_{n-1}^{2}$ distributed.

Remark 3.5   Most of the time, Well Known Result 3.4 appears in the "Gaussian distribution" chapter of statistics books and is not recalled in the "$ \chi^{2}$" chapter. It should be emphasized that, when dealing with the sample variance, the Gaussian distribution is not the paradigm but the exception: it is the only distribution for which the sample mean and the sample variance are independent. Therefore, the $ \chi^{2}$ model cannot even be applied as an approximate model for the sample variance of a non-Gaussian distribution.

Remark 3.6   In the rest of the paper, non-normal distributions of $ \xi $ are considered. To facilitate comparisons between the induced distributions of the sample variance, it is of interest to compare their scaled squared coefficients of variation (sscv). From Well Known Result 3.4, the reference value of the sscv is:

$\displaystyle sscv_{norm}=\frac{var_{\Phi }\left(s^{2}\right)}{\sigma ^{2}}\times\frac{n-1}{\sigma ^{2}}=\frac{var\left(\chi^{2}\right)}{E\left(\chi^{2}\right)}=2$ (6)
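As an illustration, the sscv can be evaluated exactly from formula (4). The sketch below assumes that, for a non-normal population, the sscv is obtained by applying the same expression $ var_{\Phi }\left(s^{2}\right)\left(n-1\right)/\sigma ^{4}$ with $ \sigma ^{2}=\mu_{2}$; this reading, like the function names, is an assumption of the example rather than a definition taken from the paper.

```python
from fractions import Fraction


def var_m2(n, mu2, mu4):
    """Formula (4): variance of the sample variance, in exact arithmetic."""
    return (mu4 - mu2 ** 2 + Fraction(2, n - 1) * mu2 ** 2) / n


def sscv(n, mu2, mu4):
    """Scaled squared coefficient of variation: var(s^2) * (n-1) / mu2^2."""
    return var_m2(n, mu2, mu4) * (n - 1) / mu2 ** 2


# Normal population with unit variance: mu2 = 1, mu4 = 3.
for n in (2, 3, 10, 100):
    print("normal, n =", n, "sscv =", sscv(n, Fraction(1), Fraction(3)))

# Uniform population on [-1, 1]: mu2 = 1/3, mu4 = 1/5, sample size n = 3.
print("uniform, n = 3, sscv =", sscv(3, Fraction(1, 3), Fraction(1, 5)))
```

With `Fraction` inputs the normal case returns the reference value of (6) for every sample size, which makes deviations for other populations easy to read off.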




douillet@ensait.fr
2009-09-09