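Let $X_1, \dots, X_n$ be independent instantiations of a random variable with variance $\sigma^2$ and fourth centered moment $\mu_4$. Define the new r.v. $S^2$ (aka the variance estimator obtained from the given sample) by
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar X\right)^2, \qquad \bar X = \frac{1}{n}\sum_{i=1}^{n} X_i .$$
Then, taken over the set of all possible samples of a given size $n$, we have:
$$E(S^2) = \sigma^2, \qquad \mathrm{var}(S^2) = \frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\,\sigma^4\right).$$
A key point is that $S^2$ is invariant under the translation $X_i \mapsto X_i + c$, where $c$ denotes some constant:
$$S^2(X_1 + c, \dots, X_n + c) = S^2(X_1, \dots, X_n).$$
Therefore, nothing is changed if the r.v. $X_i$ are assumed to be centered.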
Now, using the theory of homogeneous symmetric polynomials, it can be shown that $S^2$ can be expressed not only as a polynomial in the elementary polynomials $\sum_i X_i$ and $\sum_{i<j} X_i X_j$, but also as a linear combination of $\sum_i X_i^2$ and $\sum_{i \neq j} X_i X_j$, i.e. quantities whose expectations are obvious, namely $n\sigma^2$ and $0$. Therefore, we have
,
where
and
is some polynomial whose degree is
at most
. Such a polynomial can be found by identification.
A direct computation shows that
takes values
when
takes values
. Therefore
is proven for all
.
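The identification can be cross-checked by expanding the estimator directly. The following is only a sketch under two assumptions: the estimator is normalized by $1/(n-1)$, and the $X_i$ are centered (which the translation invariance permits). The symbols $S^2$, $X_i$ and $\bar X$ are the notation assumed here.

```latex
\begin{align*}
S^2 &= \frac{1}{n-1}\Bigl(\sum_i X_i^2 - \frac{1}{n}\Bigl(\sum_i X_i\Bigr)^2\Bigr)
     = \frac{1}{n}\sum_i X_i^2 \;-\; \frac{1}{n(n-1)}\sum_{i \neq j} X_i X_j, \\
E(S^2) &= \frac{1}{n}\cdot n\sigma^2 \;-\; \frac{1}{n(n-1)}\cdot 0 \;=\; \sigma^2 .
\end{align*}
```

The cross terms vanish in expectation because the $X_i$ are independent and centered, so $E(X_i X_j) = 0$ for $i \neq j$.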
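The second formula can be obtained by the same method. The quantity $(S^2)^2$ is a degree $4$ homogeneous symmetric polynomial in the $X_i$'s, and therefore a linear combination of $\sum X_i^4$, $\sum X_i^2 X_j^2$, $\sum X_i^3 X_j$, $\sum X_i^2 X_j X_k$ and $\sum X_i X_j X_k X_l$ (each sum running over distinct indices). Thus $E\bigl((S^2)^2\bigr)$ is a linear combination of the expectations of $\sum X_i^4$ and $\sum X_i^2 X_j^2$, the other three having $0$ as expectation. Thus
$$E\bigl((S^2)^2\bigr) = a\,\mu_4 + b\,\sigma^4 ,$$
the coefficients $a$ and $b$ being rational fractions in $n$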
having the form
where the degree of
is at most
.
Here again, these coefficients can be found by identification. A direct
computation with
gives
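Both moment formulas can be checked exactly on a small discrete law by enumerating every possible sample. The sketch below assumes the $1/(n-1)$ normalization of the estimator and the classical closed form $\mathrm{var}(S^2) = \frac{1}{n}\bigl(\mu_4 - \frac{n-3}{n-1}\sigma^4\bigr)$. The function names and the test distribution are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import product


def population_moments(values):
    """Variance and fourth centered moment of the uniform law on `values`."""
    k = len(values)
    mu = Fraction(sum(values), k)
    sigma2 = sum((v - mu) ** 2 for v in values) / k
    mu4 = sum((v - mu) ** 4 for v in values) / k
    return sigma2, mu4


def exact_moments_of_s2(values, n):
    """Enumerate all len(values)**n equally likely samples of size n.

    Returns (E[S^2], var[S^2]) as exact fractions, with S^2 normalized
    by 1/(n-1) (an assumption of this sketch).
    """
    weight = Fraction(1, len(values) ** n)
    first = second = Fraction(0)
    for sample in product(values, repeat=n):
        mean = Fraction(sum(sample), n)
        s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
        first += weight * s2
        second += weight * s2 * s2
    return first, second - first * first


def predicted_variance(sigma2, mu4, n):
    """Closed form (mu4 - (n-3)/(n-1) * sigma^4) / n."""
    return (mu4 - Fraction(n - 3, n - 1) * sigma2 ** 2) / n


if __name__ == "__main__":
    values = [0, 1, 3]  # arbitrary asymmetric support, chosen for illustration
    sigma2, mu4 = population_moments(values)
    for n in (2, 3, 4):
        mean_s2, var_s2 = exact_moments_of_s2(values, n)
        print(n, mean_s2 == sigma2, var_s2 == predicted_variance(sigma2, mu4, n))
```

Exact rational arithmetic avoids any Monte Carlo tolerance: the enumerated and predicted values either agree as fractions or they do not.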
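The pdf of $\chi^2_n = \sum_{i=1}^{n} X_i^2$, where the $X_i$ are independent standard Gaussian random variables, is
$$f_n(x) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\; x^{n/2-1}\, e^{-x/2}, \qquad x > 0 .$$
For $n = 1$, this result comes from
$$f_1(x)\,dx = 2 \times \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt, \qquad x = t^2,\ t > 0,$$
where the factor $2$ is required since the correspondence $x \mapsto t$ is not single valued. For greater values of $n$, this result comes from the convolution formula $f_{n+1} = f_n * f_1$. We have, up to a multiplicative constant,
$$f_{n+1}(x) = \int_0^x f_n(t)\, f_1(x-t)\,dt \;\propto\; e^{-x/2}\int_0^x t^{n/2-1}\,(x-t)^{-1/2}\,dt .$$
Changing $t$ into $x u$ where $0 \le u \le 1$, we obtain
$$f_{n+1}(x) \;\propto\; x^{(n+1)/2-1}\, e^{-x/2}\int_0^1 u^{n/2-1}\,(1-u)^{-1/2}\,du ,$$
as required.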
The value of
comes from the very definition of the
Gamma function, and we have
.
Therefore
,
leading to
. Moreover, the modal value of a $\chi^2_n$ variable is the solution of
$$\frac{d}{dx}\left(x^{n/2-1}\, e^{-x/2}\right) = 0 ,$$
namely $x = n - 2$ (for $n \ge 2$).
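The density, its normalization and the modal value can be checked numerically. The sketch below assumes the standard chi-square density with $n$ degrees of freedom, $x^{n/2-1} e^{-x/2} / \bigl(2^{n/2}\Gamma(n/2)\bigr)$, and uses a plain midpoint rule. The helper names, the integration bound and the grid size are illustrative choices.

```python
import math


def chi2_pdf(x, n):
    """Chi-square density with n degrees of freedom, for x > 0."""
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))


def midpoint_summary(n, upper=100.0, steps=100_000):
    """Return (total mass, mean, variance, mode) estimated on a midpoint grid.

    The integral is truncated at `upper`; the neglected tail is tiny
    for moderate n because of the exp(-x/2) factor.
    """
    h = upper / steps
    mass = first = second = 0.0
    mode_x, mode_f = 0.0, 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        f = chi2_pdf(x, n)
        mass += f * h
        first += x * f * h
        second += x * x * f * h
        if f > mode_f:  # track the grid point with the largest density
            mode_x, mode_f = x, f
    return mass, first, second - first ** 2, mode_x


if __name__ == "__main__":
    n = 5
    mass, mean, var, mode = midpoint_summary(n)
    print(f"mass={mass:.6f} mean={mean:.4f} var={var:.4f} mode={mode:.4f}")
```

For a given $n$, the mass should be close to $1$, the mean close to $n$, the variance close to $2n$, and the grid mode close to $n - 2$, up to discretization error.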
The Pearson $\chi^2$ statistic
has been defined in Eq. (eq:def_pearson).
Exemplifying with
, we obtain
In this expression, quantities
and
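are to be emphasized since they have an effective meaning. Obtaining a multiset $(n_1, \dots, n_k)$ according to the multinomial law, i.e. with probability
$$\frac{n!}{n_1! \cdots n_k!}\; p_1^{n_1} \cdots p_k^{n_k},$$
can be done by successive binomial trials. At the beginning, $n$ decisions are to be taken, and the probability to go to state $1$ is $p_1$. Therefore, $n_1$ can be sampled according to the binomial law $B(n,\ p_1)$.
After that, $n_1$ "decisions" have been taken and $n - n_1$ "lacks of decision" are to be replayed. Since state $1$ is now excluded, the probability of state $2$ shifts to $p_2 / (1 - p_1)$, and $n_2$ can be sampled according to the binomial law $B\bigl(n - n_1,\ p_2/(1-p_1)\bigr)$.
And so on, leading to:
$$n_j \sim B\!\left(n - \sum_{i<j} n_i,\ \frac{p_j}{1 - \sum_{i<j} p_i}\right), \qquad j = 1, \dots, k .$$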
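The successive binomial trials described above can be sketched in a few lines. This is a minimal illustration using only the standard library. The function names are hypothetical, and the binomial draw is done by counting Bernoulli successes, which is simple rather than efficient.

```python
import random


def sample_binomial(rng, trials, p):
    """Draw from B(trials, p) by counting successes among Bernoulli trials."""
    return sum(1 for _ in range(trials) if rng.random() < p)


def sample_multinomial(rng, n, probs):
    """Draw category counts for n decisions via successive binomial trials.

    At each step the remaining decisions are replayed, with the current
    state's probability rescaled by the mass of the states not yet excluded.
    """
    counts = []
    remaining_trials = n
    remaining_mass = 1.0
    for p in probs[:-1]:
        # Conditional probability of this state given earlier ones are excluded.
        cond = min(1.0, p / remaining_mass) if remaining_mass > 0 else 0.0
        c = sample_binomial(rng, remaining_trials, cond)
        counts.append(c)
        remaining_trials -= c
        remaining_mass -= p
    # The last state receives whatever decisions are left.
    counts.append(remaining_trials)
    return counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    print(sample_multinomial(rng, 100, [0.2, 0.3, 0.5]))
```

The last count is not sampled: once every other state is excluded, its conditional probability is $1$, so it simply collects the remaining decisions.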
When
, the quantity
defined by :
The exact result
The following formula holds exactly :