In this paper, no questions about uniform random generators will be
asked. The quality of such generators is the subject
of a huge amount of literature [2,3] and, for what
we are doing, a congruential generator like Maple's, described in
ALG. 1, can be considered safe. As it
should be, the modulus
is prime, while the multiplier
is a primitive root for that modulus. To shorten notations,
rand(1..n)() will denote a generator providing integers uniformly
distributed inside
,
implemented as
, and
randu() will denote a uniform generator over
,
implemented as
.
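A minimal Python sketch of such a generator may help fix ideas. The modulus 10^12 - 11 and the multiplier 427419669081 are the constants usually quoted for Maple's legacy congruential generator and are assumed here to match ALG. 1. The class name `Lcg`, the method names, and the way the raw output is reduced to an integer range or to the unit interval are illustrative assumptions, not the paper's definitions.

```python
class Lcg:
    """Multiplicative congruential generator (sketch, assumed constants)."""

    MODULUS = 10**12 - 11          # assumed: prime modulus of Maple's legacy rand()
    MULTIPLIER = 427419669081      # assumed: primitive root for that modulus

    def __init__(self, seed=1):
        # The state must be non-zero modulo MODULUS, otherwise it stays at 0.
        self.state = seed % self.MODULUS
        if self.state == 0:
            raise ValueError("seed must not be a multiple of the modulus")

    def rand(self):
        """Next raw integer in 1 .. MODULUS-1."""
        self.state = (self.MULTIPLIER * self.state) % self.MODULUS
        return self.state

    def rand_int(self, n):
        """Integer in 1 .. n (hypothetical reduction: remainder plus one)."""
        return self.rand() % n + 1

    def randu(self):
        """Float strictly between 0 and 1 (hypothetical reduction)."""
        return self.rand() / self.MODULUS
```

Because the modulus is not a multiple of n, the remainder-based reduction is only approximately uniform; with a modulus near 10^12 and a small n the bias is negligible for the purposes discussed here.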
To describe an unfair die, we have to choose the number of faces (say
) and the probabilities for each face. To avoid "poorly
chosen rational numbers" (the key point of Section 4),
the odds for the different faces have been chosen (once and for all)
proportional to numbers obtained as
,
i.e. not equal, but not too different. The simulation of what happens
when rolling that die can be obtained by a rejection scheme, as described
in ALG. 2 where the rejection thresholds
have been defined by
.
Roughly speaking, the last digit of
is used to make a
proposal for
and the other digits are used to approve or
reject that proposal. A more "paranoid" implementation
would use one call to rand() for the proposal, and another call
for the decision (doubling the cost).
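The single-call rejection idea can be sketched in Python as follows. This is not a transcription of ALG. 2: the function name `rejection_die`, the scaling of the thresholds by the largest weight, and the use of a generic `draw` callable are assumptions made for illustration.

```python
import random


def rejection_die(weights, draw, modulus):
    """Return a function that throws a die with odds proportional to `weights`.

    `draw()` must return an integer roughly uniform in [0, modulus).
    One draw is split in two: the remainder modulo the number of faces
    proposes a face, the quotient decides whether to accept it.
    """
    n = len(weights)
    top = modulus // n                      # range of the quotient part
    w_max = max(weights)
    # Assumed threshold rule: the most likely face is always accepted.
    thresholds = [int(top * w / w_max) for w in weights]

    def throw():
        while True:
            x = draw()
            proposal = x % n                # "last digit": proposed face index
            rest = x // n                   # "other digits": accept or reject
            if rest < thresholds[proposal]:
                return proposal + 1         # faces are labeled 1 .. n

    return throw
```

For example, `rejection_die([1, 3], lambda: rng.randrange(m), m)` with `rng = random.Random(0)` and `m = 10**12 - 11` yields face 2 about three times as often as face 1.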
A more efficient implementation will use the Walker scheme as described
in [1], dividing the cost by the rejection ratio, here
about
. Moreover, this implementation uses exactly one call
to the congruential generator per die throw, allowing an easier parallelization
of the computations. To undertake
die
throws by using
computers, you can start with a given
on the first one, together with
on the second one,
on the third and so on.
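For readers unfamiliar with the Walker scheme, here is a hedged Python sketch of the alias method in the table-building variant usually attributed to Vose. It is an illustration of the general technique, not the implementation of [1]; the names `build_alias` and `alias_throw` are hypothetical. Each throw consumes exactly one uniform number and never rejects.

```python
def build_alias(probs):
    """Build the two alias tables for a list of probabilities summing to 1."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [1.0] * n
    alias = list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]          # column s keeps its own face with this odds
        alias[s] = l                 # and otherwise yields face l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Columns left over (up to rounding) are full: they keep prob 1.0.
    return prob, alias


def alias_throw(prob, alias, u):
    """Map one uniform number u in [0, 1) to a face in 1 .. n."""
    n = len(prob)
    x = u * n
    i = min(int(x), n - 1)           # column index, guarded against rounding
    frac = x - i                     # position inside the column
    return (i if frac < prob[i] else alias[i]) + 1
```

Building the tables costs time proportional to the number of faces, after which every throw is a constant-time lookup.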
Let us call sample the result of a fixed number
of die throws, i.e. an ordered list of natural numbers
each
denoting how many times the face labeled
has appeared. Obviously,
,
while the expected values
of these
are given by
.
We have chosen
and the
given in
FIG. 2 to exemplify our paper. According to these
odds, the random variable "a die throw" has expectation,
variance and fourth moment given by:
FIG. 3 exemplifies such a sample and the statistics
that can be extracted from it, namely the mean value of the die scores,
i.e.
, the variance predictor,
i.e.
and the
of the experiment, i.e.
.
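The three statistics can be computed from the face counts of a sample as in the following Python sketch. The function name `sample_statistics` is hypothetical, the variance predictor is assumed to be the unbiased one (division by the number of throws minus one), and the chi-square is assumed to be the usual Pearson statistic against the expected counts.

```python
def sample_statistics(counts, probs):
    """Return (mean score, variance predictor, Pearson chi-square).

    counts[j] is how many times face j+1 appeared in the sample,
    probs[j] is the probability of that face.
    """
    total = sum(counts)
    faces = range(1, len(counts) + 1)
    mean = sum(f * c for f, c in zip(faces, counts)) / total
    # Assumed: unbiased estimator, hence the (total - 1) denominator.
    variance = sum(c * (f - mean) ** 2 for f, c in zip(faces, counts)) / (total - 1)
    # Assumed: Pearson statistic, sum of (observed - expected)^2 / expected.
    chi2 = sum((c - total * p) ** 2 / (total * p) for c, p in zip(counts, probs))
    return mean, variance, chi2
```

For instance, `sample_statistics([3, 1], [0.5, 0.5])` gives a mean score of 1.25 and a chi-square of 1.0.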
These three statistics can be considered as three new random variables,
having their own distribution (over the set of all possible samples).
The dispersion parameters of
are well known to be given
by:
).
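As a numerical companion, the sketch below evaluates the standard textbook dispersion formulas for the sample mean and for the unbiased variance estimator of independent throws. It is assumed, not confirmed by the text above, that these are the formulas the paper refers to and that the fourth moment is the central one; the function names are hypothetical.

```python
def die_moments(probs):
    """Expectation, variance and central fourth moment of one die throw."""
    faces = range(1, len(probs) + 1)
    mean = sum(f * p for f, p in zip(faces, probs))
    var = sum(p * (f - mean) ** 2 for f, p in zip(faces, probs))
    mu4 = sum(p * (f - mean) ** 4 for f, p in zip(faces, probs))
    return mean, var, mu4


def dispersion_of_statistics(probs, n_throws):
    """Variance of the sample mean and of the unbiased variance estimator.

    Standard results for n independent, identically distributed throws:
      Var(mean) = var / n
      Var(s^2)  = (mu4 - var^2 * (n - 3) / (n - 1)) / n
    """
    _, var, mu4 = die_moments(probs)
    var_of_mean = var / n_throws
    var_of_s2 = (mu4 - var ** 2 * (n_throws - 3) / (n_throws - 1)) / n_throws
    return var_of_mean, var_of_s2
```

The second formula shows why the fourth moment of a single throw is needed: it drives the spread of the variance predictor from one sample to the next.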
We have the following table :
In order to examine the effective distribution of these statistics,
it is convenient to continue rolling the die and obtain what will
be referred to as a day, i.e. a collection of a given number (say
) of samples. FIG. 4
displays the histograms of the values obtained for these statistics
during the
samples of a given day, as well
as the usual theoretical models for their distribution.
The compatibility of these models (normal, normal and
)
with the experimental results can be checked by eye in FIG. 4,
and also tested with respect to their mean, variance and
behavior. The results of these tests are summarized in Table 2,
where
is the expectation,
the experimental mean,
the variance and
the experimental predictor of variance. Nothing unexpected happens:
the central limit theorem acts on
, and chi-square
is chi-square distributed!
The positive result of the preceding paragraph can be reinforced by
a careful examination of a succession of days, i.e. a month. Let us formalize
the computation leading to the number
at the bottom right of Table 2.
The number of bars of the histogram will be fixed to
,
and the range for
divided into roughly equiprobable sub-intervals.
Here again, "poorly chosen rational numbers" are
avoided by choosing the odds of these intervals randomly (the formula
4+randu() has been used to obtain the probabilities listed
in Table 3). Thereafter, the inverse
cdf (with
df) is used to obtain the boundaries of each
interval.
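The construction of the intervals can be sketched in Python as follows. The weights follow the 4+randu() recipe quoted above; everything else is an assumption for illustration: the function names are hypothetical, Python's `random` module stands in for randu(), and the inverse cdf is passed in as a callable. In practice one would supply an inverse chi-square cdf from a statistics library (for example `scipy.stats.chi2.ppf` with the appropriate number of degrees of freedom); the closed form below, valid only for 2 degrees of freedom, is included purely to keep the sketch self-contained.

```python
import math
import random


def random_bin_odds(k, rng):
    """k roughly equal probabilities, proportional to 4 + (uniform in [0, 1))."""
    weights = [4.0 + rng.random() for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]


def bin_boundaries(odds, inverse_cdf):
    """Interior boundaries of the intervals: k odds give k - 1 cut points."""
    bounds = []
    cumulative = 0.0
    for p in odds[:-1]:
        cumulative += p
        bounds.append(inverse_cdf(cumulative))
    return bounds


def chi2_2df_inverse_cdf(p):
    """Inverse cdf of the chi-square law with 2 degrees of freedom only."""
    return -2.0 * math.log(1.0 - p)
```

Since every weight lies between 4 and 5, no interval can be more than 25% more probable than another, which is what keeps the intervals roughly equiprobable.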
For each day, the
instantiations of
are distributed in the intervals,
of them falling in
and so on, finishing with
of them falling into
. The
new random variable
is defined as the chi-square
of these