17.1.9.2 Algorithms (Normality Test)

1 Shapiro-Wilk normality test
2 Kolmogorov-Smirnov normality test
3 Lilliefors normality test
4 Anderson-Darling Test
5 D'Agostino-K Squared
6 Chen-Shapiro Test

Shapiro-Wilk normality test

Given a set of observations $X=\{x_1,x_2,\ldots,x_n\}$ sorted into either ascending or descending order, the Shapiro-Wilk $W$ statistic is defined as:

$W=\frac{\left (\sum_{i=1}^n a_ix_i\right)^2}{\sum_{i=1}^n (x_i-\bar{x})^2}$

where

$\bar{x}=\frac{1}{n}\sum_{i=1}^n x_i$

is the sample mean and $a_i$, for $i=1,2,\ldots,n$, are a set of weights whose values depend only on the sample size $n$.

The algorithm used by Origin is Applied Statistics Algorithm R94, described by Patrick Royston (1995). The function supports sample sizes of $3 \le n \le 5000$.

The degrees of freedom (DF) are equal to the sample size.
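
As a rough illustration of the $W$ definition above, the sketch below evaluates the ratio for a caller-supplied weight vector. It is not Royston's R94 routine: that algorithm also approximates the weights $a_i$ and the p-value, which is omitted here. The names `shapiro_wilk_w` and `A3_WEIGHTS` are made up for this example; the only weights included are the known values for $n=3$ in ascending order, $(-1/\sqrt{2},\,0,\,1/\sqrt{2})$.

```python
import math

# Weights for n = 3 with the sample in ascending order (illustrative constant).
# For other n the weights must come from elsewhere, e.g. Royston's approximation.
A3_WEIGHTS = (-1 / math.sqrt(2), 0.0, 1 / math.sqrt(2))


def shapiro_wilk_w(x, a):
    """Return W = (sum a_i x_(i))^2 / sum (x_i - mean)^2 for given weights a."""
    if len(x) != len(a):
        raise ValueError("need exactly one weight per observation")
    xs = sorted(x)  # ascending order statistics
    mean = sum(xs) / len(xs)
    numerator = sum(ai * xi for ai, xi in zip(a, xs)) ** 2
    denominator = sum((xi - mean) ** 2 for xi in xs)
    return numerator / denominator


if __name__ == "__main__":
    print(shapiro_wilk_w([1.0, 2.0, 4.0], A3_WEIGHTS))
```

Equally spaced points such as 1, 2, 3 give $W$ equal to 1 up to rounding, and $W$ falls below 1 as the ordered sample departs from the spacing expected under normality.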

Kolmogorov-Smirnov normality test

Origin calls the NAG function nag_1_sample_ks_test (g08cbc) to compute the statistic. Please refer to the related NAG documentation for more details on the algorithm.

Lilliefors normality test

The Lilliefors test is adapted from the Kolmogorov-Smirnov test, and its statistic is computed in the same way. However, the p-value differs: the Lilliefors test estimates the mean and variance from the data, whereas the Kolmogorov-Smirnov test requires them to be specified in advance. The Dallal and Wilkinson (1986) method is used to compute the p-value.
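
The relationship between the two tests can be sketched as follows. This is a minimal illustration of the usual one-sample statistic $D=\sup|F_n(x)-F(x)|$, not the NAG routine and not Origin's code; the function names are hypothetical, and the use of the sample standard deviation with an $n-1$ divisor in the Lilliefors case is an assumption. The p-value computation (including the Dallal and Wilkinson approximation) is deliberately left out.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(x, mu, sigma):
    """Kolmogorov-Smirnov D against a normal with *specified* mu and sigma."""
    xs = sorted(x)
    n = len(xs)
    cdf = NormalDist(mu, sigma).cdf
    # Largest gap just after each step of the empirical CDF ...
    d_plus = max((i + 1) / n - cdf(v) for i, v in enumerate(xs))
    # ... and just before each step.
    d_minus = max(cdf(v) - i / n for i, v in enumerate(xs))
    return max(d_plus, d_minus)


def lilliefors_statistic(x):
    """Same D, but with mu and sigma *estimated* from the sample (assumed n-1 divisor)."""
    return ks_statistic(x, mean(x), stdev(x))


if __name__ == "__main__":
    data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]
    print(ks_statistic(data, 3.5, 1.2))
    print(lilliefors_statistic(data))
```

Because the Lilliefors version plugs in estimates, its $D$ does not change when the data are shifted or rescaled, which is why it needs its own p-value approximation.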

Anderson-Darling Test

Given a set of observations $X=\{x_1,x_2,\ldots,x_n\}$ sorted into ascending order, the Anderson-Darling statistic is defined as

$A^2= - n - S$

where

$S=\sum_{i=1}^n \frac{2i-1}{n}\left[\ln F(x_i)+\ln\left(1-F(x_{n+1-i})\right)\right]$

and $F$ is the cumulative distribution function of the hypothesized normal distribution.
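
A minimal sketch of the $A^2$ computation is given below, assuming $F$ is a normal CDF. The function name `anderson_darling_a2` is hypothetical, and the default of estimating the mean and standard deviation from the sample (with an $n-1$ divisor) is an assumption of this example rather than a statement about Origin's implementation. No p-value or small-sample correction is applied.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_a2(x, mu=None, sigma=None):
    """Return A^2 = -n - S for a normal F; mu/sigma default to sample estimates."""
    xs = sorted(x)  # ascending order, as the definition requires
    n = len(xs)
    if mu is None:
        mu = mean(xs)
    if sigma is None:
        sigma = stdev(xs)  # assumption: n-1 divisor
    cdf = NormalDist(mu, sigma).cdf
    s = 0.0
    for i in range(1, n + 1):
        # xs[i - 1] is x_i and xs[n - i] is x_{n+1-i} in 1-based notation
        s += (2 * i - 1) / n * (
            math.log(cdf(xs[i - 1])) + math.log(1.0 - cdf(xs[n - i]))
        )
    return -n - s


if __name__ == "__main__":
    print(anderson_darling_a2([2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]))
```

Extreme observations can push $F$ to 0 or 1 in floating point, in which case the logarithm fails; a production routine would need to guard against that.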

D'Agostino-K Squared

Skewness statistic
1. Compute the Skewness $\sqrt{b_1}$ from the data
  $\sqrt{b_1}= \frac{\frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^3}{\left( \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 \right)^{3/2}}$
2. Compute
  $Y=\sqrt{b_1}[\frac{(n+1)(n+3)}{6(n-2)}]^{1/2}$
  
  $\beta_2(\sqrt{b_1})=\frac{3(n^2+27n-70)(n+1)(n+3)}{(n-2)(n+5)(n+7)(n+9)}$
  
  $W^2=-1+[2(\beta_2(\sqrt{b_1})-1)]^{1/2}$
  
  $\delta=\frac{1}{\sqrt{\ln W}}$
  
  $\alpha=[\frac{2}{(W^2-1)}]^{1/2}$
3. The skewness statistic $Z(\sqrt{b_1})$ is then given by
  $Z(\sqrt{b_1}) = \delta \ln\left(Y/\alpha+\left[(Y/\alpha)^2+1\right]^{1/2}\right)$
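
The three steps above can be sketched in a few lines of Python. The function name `skewness_z` is hypothetical, and the guard `n >= 8` is a conservative choice made for this example (for very small $n$ the quantity $W^2$ drops to 1 or below and $\ln W$ is no longer positive); it is not taken from Origin.

```python
import math


def skewness_z(x):
    """Return Z(sqrt(b1)) following steps 1-3 of the skewness statistic."""
    n = len(x)
    if n < 8:
        raise ValueError("this sketch assumes n >= 8")
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    sqrt_b1 = m3 / m2 ** 1.5                       # step 1: sample skewness

    y = sqrt_b1 * math.sqrt((n + 1) * (n + 3) / (6 * (n - 2)))   # step 2
    beta2 = (3 * (n * n + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1 + math.sqrt(2 * (beta2 - 1))           # this is W^2
    delta = 1 / math.sqrt(math.log(math.sqrt(w2)))  # ln W with W = sqrt(W^2)
    alpha = math.sqrt(2 / (w2 - 1))

    u = y / alpha                                  # step 3
    return delta * math.log(u + math.sqrt(u * u + 1))


if __name__ == "__main__":
    print(skewness_z([1, 1, 1, 1, 1, 1, 1, 2, 3, 10]))
```

A symmetric sample gives a skewness of zero and therefore $Z(\sqrt{b_1})=0$; a sample with a long right tail gives a positive value.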

Kurtosis statistic
1. Compute the kurtosis $b_2$ from the data
  $b_2 = \frac{\frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^4}{\left( \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 \right)^2}$
2. Compute the mean and variance of $b_2$
  $E(b_2)=\frac{3(n-1)}{n+1}$
  
  $var(b_2)=\frac{24n(n-2)(n-3)}{(n+1)^2(n+3)(n+5)}$
3. Compute the standardized value $x$ of $b_2$ and the third standardized moment of $b_2$
  $x=\frac{b_2-E(b_2)}{\sqrt{var(b_2)}}$
  
  $\sqrt{\beta_1(b_2)}=\frac{6(n^2-5n+2)}{(n+7)(n+9)}\sqrt{\frac{6(n+3)(n+5)}{n(n-2)(n-3)}}$
4. Compute
  $A=6+\frac{8}{\sqrt{\beta_1(b_2)}} [\frac{2}{\sqrt{\beta_1(b_2)}}+\sqrt{1+\frac{4}{\beta_1(b_2)}}]$
5. The kurtosis statistic $Z(b_2)$ is then given by
  $Z(b_2)=\left(\left(1-\frac{2}{9A}\right)-\left[\frac{1-2/A}{1+x\sqrt{2/(A-4)}}\right]^{1/3}\right)/\sqrt{2/(9A)}$
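
The kurtosis steps can be sketched the same way. Two assumptions are made here and should be checked against the implementation you rely on: $b_2$ is taken as the plain ratio of the fourth central moment to the squared second central moment (no subtraction of 3), which is what $E(b_2)=3(n-1)/(n+1)$ implies, and $x$ is taken as the standardized value $(b_2-E(b_2))/\sqrt{var(b_2)}$. The function name `kurtosis_z` and the guard `n >= 8` are choices made for this example only.

```python
import math


def kurtosis_z(x_data):
    """Return Z(b2) following the kurtosis statistic steps (see assumptions above)."""
    n = len(x_data)
    if n < 8:
        raise ValueError("this sketch assumes n >= 8")
    mean = sum(x_data) / n
    m2 = sum((v - mean) ** 2 for v in x_data) / n
    m4 = sum((v - mean) ** 4 for v in x_data) / n
    b2 = m4 / m2 ** 2                               # assumed: no "- 3"

    e_b2 = 3 * (n - 1) / (n + 1)
    var_b2 = (24 * n * (n - 2) * (n - 3)
              / ((n + 1) ** 2 * (n + 3) * (n + 5)))
    x = (b2 - e_b2) / math.sqrt(var_b2)             # assumed definition of x

    sqrt_beta1 = (6 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
                  * math.sqrt(6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6 + 8 / sqrt_beta1 * (2 / sqrt_beta1 + math.sqrt(1 + 4 / sqrt_beta1 ** 2))

    ratio = (1 - 2 / a) / (1 + x * math.sqrt(2 / (a - 4)))
    # Real cube root that also works when the ratio is negative.
    cube_root = math.copysign(abs(ratio) ** (1 / 3), ratio)
    return ((1 - 2 / (9 * a)) - cube_root) / math.sqrt(2 / (9 * a))


if __name__ == "__main__":
    print(kurtosis_z([float(v) for v in range(1, 21)]))
```

Evenly spread data such as the integers 1 to 20 are flatter than a normal sample and give a negative $Z(b_2)$, while a sample dominated by a few extreme values gives a positive one.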