
Kolmogorov-Smirnov Test

The main problem with the $\chi ^2$ test is the choice of the number and size of the intervals. Although rules of thumb can help produce good results (for example, the range should be divided such that $\mbox{EXP}_i \ge 5$ for all $i$), there is no panacea for all kinds of applications [10]. Another problem is that the $\chi ^2$ test is designed for discrete distributions, so in the continuous case the $\chi ^2$ test statistic is only an approximation [7].
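To make the rule of thumb concrete, the following minimal sketch assumes the range is split into $k$ intervals of equal probability, so that $\mbox{EXP}_i = N/k$ for every $i$; the rule $\mbox{EXP}_i \ge 5$ then bounds $k$ by $N/5$. The function name and the equal-probability assumption are illustrative and do not come from the text above.

```python
def max_equiprobable_intervals(n_samples, min_expected=5):
    """Largest k such that n_samples / k >= min_expected.

    Assumes all k intervals have the same probability, so each one
    has the same expected count n_samples / k (illustrative assumption).
    """
    if n_samples < min_expected:
        raise ValueError("too few samples for even one interval")
    return n_samples // min_expected


# With 1000 samples, at most 200 equal-probability intervals keep EXP_i >= 5.
print(max_equiprobable_intervals(1000))
```

The bound only says how many intervals are allowed, not how many are best, which is the difficulty the paragraph above points out.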

The Kolmogorov-Smirnov (KS) test is designed to address the above issues. Given a hypothesized continuous distribution function $F$ without jumps, the test compares $F$ to the empirical distribution function, $F'$, of the samples. The KS test statistic $D$ is the largest absolute deviation between $F(x)$ and $F'(x)$ over the range of the random variable:

\begin{displaymath}
D = \max_x\{ \vert F'(x)-F(x)\vert \}
\end{displaymath}

$F'(x)$ is defined as

\begin{displaymath}
F'(x) = \frac{\mbox{number of samples } \le x}{N}
\end{displaymath}
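The definition of $F'(x)$ can be sketched in a few lines of Python. This is only an illustration: the helper name `ecdf` is hypothetical, and sorting once and using binary search is one possible way to count the samples $\le x$.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return a function x -> (number of samples <= x) / N."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def f_prime(x):
        # bisect_right gives the count of entries <= x in the sorted list
        return bisect_right(ordered, x) / n

    return f_prime


f_prime = ecdf([0.1, 0.4, 0.4, 0.9])
print(f_prime(0.05), f_prime(0.4), f_prime(1.0))
```

Note that $F'$ is a step function: it jumps by $1/N$ at each sample (or by a multiple of $1/N$ where samples coincide) and is constant in between.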

where $N$ is the number of samples. For testing against a uniform distribution, we must first sort the samples into ascending order $U_1 \le U_2 \le \cdots \le U_N$ (with $0\le U_i \le 1$ for all $i$), then compute the following statistics:

\begin{displaymath}
D^+ = \max_{1 \le i \le N} \left\{ \left\vert\frac{i}{N}-U_i \right\vert \right\}
\end{displaymath}


\begin{displaymath}
D^- = \max_{1 \le i \le N} \left\{ \left\vert U_i - \frac{i-1}{N} \right\vert \right\}
\end{displaymath}
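The two statistics above can be computed directly from the sorted samples. The sketch below follows the formulas as written, including the absolute values; the function name `ks_uniform` and the input checks are assumptions made for illustration.

```python
def ks_uniform(samples):
    """Return (D_plus, D_minus, D) for samples tested against U(0, 1)."""
    u = sorted(samples)  # U_1 <= U_2 <= ... <= U_N
    n = len(u)
    if n == 0:
        raise ValueError("need at least one sample")
    if u[0] < 0.0 or u[-1] > 1.0:
        raise ValueError("samples must lie in [0, 1]")

    # i runs from 1 to N as in the formulas; u[i - 1] is U_i
    d_plus = max(abs(i / n - u[i - 1]) for i in range(1, n + 1))
    d_minus = max(abs(u[i - 1] - (i - 1) / n) for i in range(1, n + 1))
    return d_plus, d_minus, max(d_plus, d_minus)


d_plus, d_minus, d = ks_uniform([0.80, 0.05, 0.35, 0.30])
print(d_plus, d_minus, d)
```

For these four samples the largest gap occurs at $i = 3$, where $3/4 - 0.35 = 0.40$, so $D^+$ dominates and $D$ is about $0.40$.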

Then $D= \max(D^+, D^-)$. To assess $D$, we use the hypothesis test as mentioned in Section 4.1. $H_0$ will be rejected at significance level $\alpha$ if

\begin{displaymath}
\left( \sqrt{N} + 0.12 + \frac{0.11}{\sqrt{N}} \right) D > c_{1-\alpha}
\end{displaymath}

where values of $c_{1-\alpha}$ are given by the following table:

\begin{displaymath}
\begin{array}{rlllll}
1 - \alpha & 0.850 & 0.900 & 0.950 & 0.975 & 0.990 \\
c_{1-\alpha} & 1.138 & 1.224 & 1.358 & 1.480 & 1.628 \\
\end{array}\end{displaymath}
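The decision rule can be sketched as a table lookup: scale $D$ by $\sqrt{N} + 0.12 + 0.11/\sqrt{N}$ and compare the result with $c_{1-\alpha}$. The function name `ks_reject` is hypothetical, and the pairing of the levels $0.975$ and $0.990$ with $1.480$ and $1.628$ is taken here as the standard set of critical values for this adjusted statistic; only the tabulated levels are supported.

```python
import math

# (1 - alpha, c_{1 - alpha}) pairs; the last two levels are assumed
# to be the standard 0.975 and 0.990 entries.
CRITICAL_VALUES = [
    (0.850, 1.138),
    (0.900, 1.224),
    (0.950, 1.358),
    (0.975, 1.480),
    (0.990, 1.628),
]


def ks_reject(d, n, alpha):
    """Return True if H_0 is rejected at significance level alpha."""
    if n <= 0:
        raise ValueError("n must be positive")
    for level, critical in CRITICAL_VALUES:
        if math.isclose(level, 1.0 - alpha, abs_tol=1e-9):
            adjusted = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
            return adjusted > critical
    raise ValueError("alpha is not one of the tabulated levels")


# D = 0.14 from N = 100 samples: the adjusted statistic is about 1.418.
print(ks_reject(0.14, 100, 0.05))   # compared with 1.358
print(ks_reject(0.14, 100, 0.025))  # compared with 1.480
```

In this example the same $D$ leads to rejection at $\alpha = 0.05$ but not at $\alpha = 0.025$, since $1.358 < 1.418 < 1.480$.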

Since the KS test does not group samples into categories, it responds to the value of every individual sample and is therefore more sensitive to outliers. In this sense, the KS test makes better use of each sample and is more precise than the $\chi ^2$ test.


2001-05-30