17.1.8.2 Algorithms (Normality Test)

Shapiro-Wilk normality test

Given a set of observations X=\{x_1,x_2,\ldots,x_n\} sorted into either ascending or descending order, the Shapiro-Wilk W statistic is defined as:

W=\frac{\left(\sum_{i=1}^n a_ix_i\right)^2}{\sum_{i=1}^n (x_i-\bar{x})^2}

where

\bar{x}=\frac{1}{n}\sum_{i=1}^n x_i

is the sample mean and a_i, for i=1,2,\ldots,n, are a set of weights whose values depend only on the sample size n.

The algorithm used by Origin is Applied Statistics Algorithm R94, described by Patrick Royston (1995). The function supports sample sizes of 3 \le n \le 5000.

The degrees of freedom (DF) equal the sample size.
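As a minimal sketch of the definition of W, the following Python function evaluates the ratio for weights supplied by the caller. It is not Algorithm R94: the weights for general n come from Royston's approximation, which is not reproduced here. The function name shapiro_wilk_w is hypothetical, and the example assumes the n = 3 case, for which the weights are (-1/\sqrt{2}, 0, 1/\sqrt{2}).

```python
import math


def shapiro_wilk_w(data, weights):
    """Return W = (sum a_i x_i)^2 / sum (x_i - mean)^2 for ascending-sorted data.

    `weights` must be supplied by the caller (hypothetical helper; the
    weights themselves are not computed here).
    """
    x = sorted(data)
    if len(x) != len(weights):
        raise ValueError("need exactly one weight per observation")
    mean = sum(x) / len(x)
    numerator = sum(a * v for a, v in zip(weights, x)) ** 2
    denominator = sum((v - mean) ** 2 for v in x)
    return numerator / denominator


# Assumption: weights for a sample of size n = 3.
a3 = [-1 / math.sqrt(2), 0.0, 1 / math.sqrt(2)]
w = shapiro_wilk_w([1.0, 2.0, 4.0], a3)
```

Values of W close to 1 are consistent with normality; converting W to a p-value requires the remaining steps of the published algorithm.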

Kolmogorov-Smirnov normality test

Origin calls the NAG function nag_1_sample_ks_test (g08cbc) to compute the statistic. Please refer to the related NAG documentation for more details on the algorithm.

Lilliefors normality test

The Lilliefors test is adapted from the Kolmogorov-Smirnov test, and its statistic is computed in the same way. However, the p-value differs because the Lilliefors test estimates the mean and variance from the data, whereas the Kolmogorov-Smirnov test requires them to be specified in advance. The Dallal and Wilkinson (1986) method is used to compute the p-value.
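The statistic shared by the two tests is the largest distance between the empirical distribution function and the normal CDF. The sketch below is an illustration using only the Python standard library; it does not call the NAG routine and does not compute p-values. The function name ks_statistic_normal is hypothetical, and using the sample standard deviation (divisor n-1) when the parameters are estimated is an assumption.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(data, mu=None, sigma=None):
    """Return D = max |F_n(x) - F(x)| against a normal distribution.

    With mu and sigma given, this is the Kolmogorov-Smirnov setting.
    With both left as None, they are estimated from the data, which is
    the Lilliefors setting (assumption: sample standard deviation).
    """
    x = sorted(data)
    n = len(x)
    if mu is None:
        mu = mean(x)
    if sigma is None:
        sigma = stdev(x)
    dist = NormalDist(mu, sigma)
    d = 0.0
    for i, v in enumerate(x, start=1):
        f = dist.cdf(v)
        # The empirical CDF jumps from (i-1)/n to i/n at x_i; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

The same value of D leads to different p-values in the two tests, which is why the parameter handling matters.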

Anderson-Darling Test

Given a set of observations X=\{x_1,x_2,\ldots,x_n\} sorted into ascending order, the Anderson-Darling statistic is defined as

A^2= - n - S

where

S=\sum_{i=1}^n \frac{2i-1}{n}\left[\ln F(x_i)+\ln\left(1-F(x_{n+1-i})\right)\right]

F is the cumulative distribution function of the normal distribution.
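A minimal Python sketch of this statistic follows, using only the standard library. The function name anderson_darling_a2 is hypothetical. The sketch assumes that F is a normal CDF whose mean and standard deviation are estimated from the sample (with divisor n-1), and it applies no small-sample correction.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_a2(data):
    """Return A^2 = -n - S for a normality check.

    Assumption: F is the normal CDF with mean and standard deviation
    estimated from the sample.
    """
    x = sorted(data)
    n = len(x)
    dist = NormalDist(mean(x), stdev(x))
    f = [dist.cdf(v) for v in x]
    s = 0.0
    for i in range(1, n + 1):
        # f[i - 1] is F(x_i); f[n - i] is F(x_{n+1-i}) in 1-based notation.
        s += (2 * i - 1) / n * (math.log(f[i - 1]) + math.log(1 - f[n - i]))
    return -n - s
```

Larger values of A^2 indicate a poorer fit to the normal distribution, with extra weight on the tails.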

D'Agostino-K Squared

  • Skewness statistic
    1. Compute the Skewness \sqrt{b_1} from the data
      \sqrt{b_1}= \frac{\frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^3}{\left( \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 \right)^{3/2}}
    2. Compute
      Y=\sqrt{b_1}[\frac{(n+1)(n+3)}{6(n-2)}]^{1/2}
      \beta_2(\sqrt{b_1})=\frac{3(n^2+27n-70)(n+1)(n+3)}{(n-2)(n+5)(n+7)(n+9)}
      W^2=-1+[2(\beta_2(\sqrt{b_1})-1)]^{1/2}
      \delta=\frac{1}{\sqrt{\ln W}}
      \alpha=[\frac{2}{(W^2-1)}]^{1/2}
    3. The skewness statistic Z(\sqrt{b_1}) is then computed as
      Z(\sqrt{b_1}) = \delta \ln\left(Y/\alpha+\left[(Y/\alpha)^2+1\right]^{1/2}\right)
  • Kurtosis statistic
    1. Compute the kurtosis b_2 from the data
      b_2 = \frac{\frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^4}{\left( \frac{1}{n} \sum_{i=1}^n \left( x_i - \bar{x} \right)^2 \right)^2}
    2. Compute the mean and variance of b_2
      E(b_2)=\frac{3(n-1)}{n+1}
      var(b_2)=\frac{24n(n-2)(n-3)}{(n+1)^2(n+3)(n+5)}
    3. Compute the standardized value x of b_2 and the third standardized moment of b_2
      x=\frac{b_2-E(b_2)}{\sqrt{var(b_2)}}
      \sqrt{\beta_1(b_2)}=\frac{6(n^2-5n+2)}{(n+7)(n+9)}\sqrt{\frac{6(n+3)(n+5)}{n(n-2)(n-3)}}
    4. Compute
      A=6+\frac{8}{\sqrt{\beta_1(b_2)}} [\frac{2}{\sqrt{\beta_1(b_2)}}+\sqrt{1+\frac{4}{\beta_1(b_2)}}]
    5. The kurtosis statistic Z(b_2) is then computed as
      Z(b_2)=((1-\frac{2}{9A})-[\frac{1-2/A}{1+x\sqrt{2/(A-4)}}]^{1/3})/\sqrt{2/(9A)}
  • D'Agostino's K^2 statistic (approximately chi-squared distributed with 2 degrees of freedom under normality)
    K^2 = Z^2(\sqrt{b_1})+Z^2(b_2)
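
The steps above can be sketched in Python as follows, using only the standard library. The function name dagostino_k2 is hypothetical. Two assumptions are made: b_2 is taken as the plain ratio m_4/m_2^2 without subtracting 3, since that is the quantity whose mean is 3(n-1)/(n+1), and x in the formula for Z(b_2) is taken as (b_2-E(b_2))/\sqrt{var(b_2)}.

```python
import math


def _central_moment(x, mean, k):
    """k-th central sample moment with divisor n."""
    return sum((v - mean) ** k for v in x) / len(x)


def dagostino_k2(data):
    """Return (Z(sqrt(b1)), Z(b2), K^2) following the steps listed above."""
    x = list(data)
    n = len(x)
    mean = sum(x) / n
    m2 = _central_moment(x, mean, 2)

    # Skewness statistic
    sqrt_b1 = _central_moment(x, mean, 3) / m2 ** 1.5
    y = sqrt_b1 * math.sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
    beta2 = (3 * (n ** 2 + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1 + math.sqrt(2 * (beta2 - 1))
    delta = 1 / math.sqrt(math.log(math.sqrt(w2)))  # ln W, with W = sqrt(W^2)
    alpha = math.sqrt(2 / (w2 - 1))
    z_skew = delta * math.log(y / alpha + math.sqrt((y / alpha) ** 2 + 1))

    # Kurtosis statistic (assumption: b2 is not reduced by 3)
    b2 = _central_moment(x, mean, 4) / m2 ** 2
    e_b2 = 3 * (n - 1) / (n + 1)
    var_b2 = 24 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    xs = (b2 - e_b2) / math.sqrt(var_b2)  # assumed meaning of x in Z(b2)
    sqrt_beta1 = (6 * (n ** 2 - 5 * n + 2) / ((n + 7) * (n + 9))
                  * math.sqrt(6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6 + 8 / sqrt_beta1 * (2 / sqrt_beta1 + math.sqrt(1 + 4 / sqrt_beta1 ** 2))
    ratio = (1 - 2 / a) / (1 + xs * math.sqrt(2 / (a - 4)))
    # Sign-preserving cube root, so a negative ratio does not become complex.
    cube_root = math.copysign(abs(ratio) ** (1 / 3), ratio)
    z_kurt = ((1 - 2 / (9 * a)) - cube_root) / math.sqrt(2 / (9 * a))

    return z_skew, z_kurt, z_skew ** 2 + z_kurt ** 2
```

For perfectly symmetric data the skewness term vanishes, so K^2 reduces to the squared kurtosis statistic. The approximations behind both Z values are intended for moderately large samples.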

Chen-Shapiro Test

Given a set of observations X=\{x_1,x_2,\ldots,x_n\} sorted into ascending order, the Chen-Shapiro statistic is defined as

QH =\sqrt{n}\left(1-\frac{1}{(n-1)S}\sum_{i=1}^{n-1}\frac{x_{i+1}-x_i}{H_{i+1}-H_i}\right)

where

H_i = \Phi^{-1}((i-3/8)/(n+1/4)), \Phi^{-1} is the inverse of the standard normal cumulative distribution function, and S is the sample standard deviation.
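
The Chen-Shapiro statistic can be sketched in Python with the standard library alone. The function name chen_shapiro_qh is hypothetical, and the sketch assumes that S is the sample standard deviation with divisor n-1. Critical values and p-values are not computed.

```python
import math
from statistics import NormalDist, stdev


def chen_shapiro_qh(data):
    """Return QH = sqrt(n) * (1 - sum(spacing ratios) / ((n - 1) * S)).

    Assumption: S is the sample standard deviation (divisor n - 1).
    """
    x = sorted(data)
    n = len(x)
    s = stdev(x)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    # H_i = Phi^{-1}((i - 3/8) / (n + 1/4)) for i = 1..n
    h = [std_normal.inv_cdf((i - 3 / 8) / (n + 1 / 4)) for i in range(1, n + 1)]
    # Ratio of each observed spacing to the matching normal-score spacing.
    total = sum((x[i + 1] - x[i]) / (h[i + 1] - h[i]) for i in range(n - 1))
    return math.sqrt(n) * (1 - total / ((n - 1) * s))
```

Because the statistic depends only on spacings divided by S, it is unchanged when the data are shifted or rescaled.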