next up previous
Next: Permutation Test Up: Statistical Tests Previous: Kolmogorov-Smirnov Test

Collision Test

The $\chi ^2$ test statistic is meaningful only when each interval has more than 5 samples, but this test is designed such that the number of intervals are much larger than number of samples [8]. Suppose we throw $n$ balls at random into $m$ empty urns with $m \gg n$, then we say a collision occurs if the ball falls into an nonempty urn. Theoretically, the probability that an urn receives exactly $k$ balls is

\begin{displaymath}
p(k) = {n \choose k} (\frac{1}{m})^k (1-\frac{1}{m})^{n-k}
\end{displaymath}

so the expected number of collisions is

\begin{displaymath}
C = \sum_{k>0} (k-1)p(k)
\end{displaymath}

When $m \gg n$, $C \approx \frac{n^2}{m}$. In the experiment, we use $m=2^{20}$ and $n=2^{14}$, so the observed number of collisions should be very close to $\frac{2^{28}}{2^{20}} = 128$.



2001-05-30