This section is devoted to experimental results. To allow comparisons, we start with a Gaussian example.
We have simulated
samples from a Gaussian distribution,
with sample size
and parameters
. In
Figure 2, we have plotted the experimental histogram
of the sample variance (circles) together with the theoretical
(scaled) distribution (solid lines). The goodness of fit, as measured
by
, i.e.
is excellent.
A Gaussian curve, even with the required parameters, would not be
the right model (dotted line) since
is far from infinity. Additionally,
the experimental skewness of
is
, i.e.
very close to the theoretical value,
.
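The Gaussian check above can be reproduced along the following lines. This is only a sketch under stated assumptions: the sample size `n`, the number of batches `n_batches`, and the parameters are illustrative placeholders, not the values used for Figure 2, and the sample variance is assumed to be the unbiased one (divisor n - 1). It relies on the standard fact that, for Gaussian data, (n - 1)S²/σ² follows a chi-square distribution with n - 1 degrees of freedom, whose skewness is sqrt(8/(n - 1)).

```python
import numpy as np

def sample_variance_skewness(n, n_batches, sigma=1.0, seed=0):
    """Empirical skewness of the sample variance over n_batches Gaussian samples of size n."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(loc=0.0, scale=sigma, size=(n_batches, n))
    s2 = samples.var(axis=1, ddof=1)          # unbiased sample variance (assumption)
    centered = s2 - s2.mean()
    # Moment-based skewness: third central moment over variance^(3/2)
    return np.mean(centered**3) / np.mean(centered**2) ** 1.5

n = 10                                         # illustrative value, not the paper's
empirical = sample_variance_skewness(n, n_batches=200_000)
theoretical = np.sqrt(8.0 / (n - 1))           # skewness of a chi-square with n - 1 d.o.f.
print(f"empirical skewness  : {empirical:.3f}")
print(f"theoretical skewness: {theoretical:.3f}")
```

Since skewness is invariant under scaling, the comparison does not depend on the value chosen for `sigma`.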
When
is the discrete uniform distribution over the integer
range
, the distribution of
remains coarse,
whatever the size
of the simulation. From
together with
, no more than
different values of
can occur. Moreover, this upper bound
is not tight: when
and
the actual number of occurring
values is
, not
. The fact that not every integer is
a square modulo
is one of the reasons for this drastic reduction.
As a result, a batch involving
samples leads to a very
coarse distribution, as shown in Figure 3(a).
The corresponding histogram remains "rugged" as
shown in Figure 3(b). Moreover, it appears that
neither the normalized
(dotted line) nor the adapted
normal curve (solid line) provides even a rough approximation of the
distribution.
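The coarseness of the discrete case can be checked by exhaustive enumeration for small sizes. The sketch below is an illustration, not the counting argument of the text: it assumes the unbiased sample variance, uses the integer-valued quantity n·Σx² − (Σx)² = n(n − 1)S², and compares the number of values actually reached with a naive count of all integers between 0 and the largest one. The function name `scaled_variance_values` and the values of `n` and `m` are hypothetical; since the variance is shift-invariant, the range {1, ..., m} can stand for any run of m consecutive integers.

```python
from itertools import combinations_with_replacement

def scaled_variance_values(n, m):
    """Distinct values of n*sum(x^2) - (sum x)^2 over samples of size n drawn from {1, ..., m}."""
    values = set()
    # The statistic is symmetric in the sample, so multisets are enough.
    for sample in combinations_with_replacement(range(1, m + 1), n):
        total = sum(sample)
        total_sq = sum(x * x for x in sample)
        values.add(n * total_sq - total * total)   # equals n*(n-1)*S^2
    return sorted(values)

n, m = 4, 6                                        # illustrative values only
values = scaled_variance_values(n, m)
naive_count = values[-1] + 1                       # every integer in [0, max]
print(f"{len(values)} distinct values out of {naive_count} candidate integers")
```

For n = 2 the quantity reduces to (x₁ − x₂)², so only perfect squares occur, which shows in the simplest case how arithmetic constraints thin out the attainable values.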
Using a continuous uniform distribution leads to better-looking experimental
curves, as shown in Figure 4
(here again,
). But the departure from
remains
in Figure 4(a) where
while a nearly normal curve
is obtained in Figure 4(b) where
.
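The effect of the sample size in the continuous uniform case can be illustrated by tracking the skewness of the sample variance, which vanishes for an exactly normal curve. The following is a minimal sketch: the two sample sizes, the number of batches, and the interval [0, 1] are assumptions chosen for illustration and are not the settings of Figure 4.

```python
import numpy as np

def variance_skewness_uniform(n, n_batches=50_000, seed=0):
    """Empirical skewness of the sample variance for uniform(0, 1) samples of size n."""
    rng = np.random.default_rng(seed)
    samples = rng.uniform(0.0, 1.0, size=(n_batches, n))
    s2 = samples.var(axis=1, ddof=1)           # unbiased sample variance (assumption)
    centered = s2 - s2.mean()
    return np.mean(centered**3) / np.mean(centered**2) ** 1.5

skew_small = variance_skewness_uniform(n=5)    # small sample size: visibly skewed
skew_large = variance_skewness_uniform(n=100)  # larger sample size: close to symmetric
print(f"skewness for n = 5  : {skew_small:.3f}")
print(f"skewness for n = 100: {skew_large:.3f}")
```

A skewness near zero is only a necessary condition for normality; a histogram or a formal goodness-of-fit measure is still needed for a full comparison.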
"Many statistics tend to be normally distributed as the data from which they are calculated are increased indefinitely; and this I suggest is the genuine reason for the importance which is universally attached to the normal curve" [fisher:toronto24].
In fact, the most surprising curve is the nearly Gaussian curve associated
with the uniform distribution. This can be related to the following
fact. The intersection of the hyperplane
and the hypercube
is a hyper-polygon. The farther
is from
,
the more this hyper-polygon shortens, leading to small values of
.
Conversely,
is as large as possible.
[Figure panels: (a) From a uniform population; (b) From a lognormal population]