Purpose
This procedure tests the hypothesis that the data come from a normal
distribution. See Why the "normal distribution" is important.
Preparations
To run this procedure, select a range, and then run the Statistics→Basic
Statistics and Tables→Normality Tests command.
Results
Count - analyzed sample size.
Mean - analyzed sample mean. See Elementary Concepts.
Standard Deviation, Median, Skewness, Kurtosis - See Elementary Concepts.
Kolmogorov-Smirnov/Lilliefors Test.
The Kolmogorov-Smirnov one-sample test for normality is based on the maximum
difference between the sample cumulative distribution and the hypothesized
cumulative distribution. If the D statistic is significant, then the hypothesis
that the respective distribution is normal should be rejected. For many software
programs, the probability values that are reported are based on those tabulated
by Massey (1951); those probability values are valid when the mean and standard
deviation of the normal distribution are known a priori and not estimated from
the data. However, usually those parameters are computed from the actual data.
In that case, the test for normality involves a complex conditional hypothesis
("how likely is it to obtain a D statistic of this magnitude or greater,
contingent upon the mean and standard deviation computed from the data"), and
the Lilliefors probabilities should be used instead (Lilliefors, 1967). Note that
in recent years, the Shapiro-Wilk W test has become the preferred test of
normality because of its good power properties as compared to a wide range of
alternative tests.
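The D statistic described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure's actual implementation: the function name `ks_d_statistic` is hypothetical, and the mean and standard deviation are estimated from the sample, which is the situation where the Lilliefors probabilities apply. The sketch computes D only; obtaining a probability value would additionally require the Lilliefors tables or an approximation to them.

```python
from statistics import NormalDist, mean, stdev


def ks_d_statistic(sample):
    """Largest gap between the empirical CDF and a fitted normal CDF.

    Hypothetical helper: the normal distribution is fitted with the
    sample mean and sample standard deviation (estimated parameters).
    """
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x,
        # so the gap is checked on both sides of the jump.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


data = [4.1, 5.0, 5.2, 5.9, 6.3, 7.8, 9.4]
print(ks_d_statistic(data))
```

Because the parameters are re-estimated from the data, D is unchanged when the sample is shifted or rescaled.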
Shapiro-Wilk W Test.
The Shapiro-Wilk W test is used in testing for normality. If the W statistic
is significant, then the hypothesis that the respective distribution is normal
should be rejected. The Shapiro-Wilk W test is the preferred test of normality
because of its good power properties as compared to a wide range of alternative
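tests (Shapiro, Wilk, & Chen, 1968). The W statistic is computed as
$$W = \frac{b^2}{S^2}$$
where
$$S^2 = \sum_{i=1}^{n} (x_i - \mu)^2$$
$$b = \sum_{i=1}^{k} a_{n-i+1} \, (x_{n-i+1} - x_i)$$
where
$\mu$ - sample mean
$x_i$ - the i-th smallest value in the sample (the observations sorted in ascending order)
$k$ - $n/2$ for even $n$, $(n-1)/2$ for odd $n$
$a_{n-i+1}$ - constants that depend on the sample size $n$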
Hence, the closer W is to one, the more closely the sample resembles a normal
distribution. The probability values for W are valid for sample sizes in the
range of 3 to 5000. W may not be as powerful as other tests when ties occur
in your data.
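As a rough illustration of the W formula, the sketch below computes W from a sample and a list of coefficients. The function name `shapiro_wilk_w` and the `coeffs` argument are hypothetical: the constants are assumed to be supplied by the caller (for example from published tables), ordered so that the first coefficient pairs the largest observation with the smallest. Only the n = 3 coefficient, 1/√2 ≈ 0.7071, is used here; the sketch does not compute probability values.

```python
import math


def shapiro_wilk_w(sample, coeffs):
    """W = b^2 / S^2 for a sample and caller-supplied coefficients.

    coeffs[0] multiplies (largest - smallest), coeffs[1] multiplies
    (second largest - second smallest), and so on; len(coeffs) == n // 2.
    """
    xs = sorted(sample)
    n = len(xs)
    k = n // 2
    if len(coeffs) != k:
        raise ValueError("expected n // 2 coefficients")
    mu = sum(xs) / n
    s2 = sum((x - mu) ** 2 for x in xs)
    b = sum(coeffs[i] * (xs[n - 1 - i] - xs[i]) for i in range(k))
    return b * b / s2


a_n3 = [1.0 / math.sqrt(2.0)]  # coefficient for n = 3
print(shapiro_wilk_w([1.0, 2.0, 3.0], a_n3))  # evenly spaced: W close to 1
print(shapiro_wilk_w([0.0, 0.0, 1.0], a_n3))  # two tied values: W close to 0.75
```

The second call also shows how ties pull W away from one even in a very small sample.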
D'Agostino Tests.
D'Agostino (1990) describes a normality test based on the skewness
coefficient. Recall that because the normal distribution is symmetrical, the
skewness coefficient is equal to zero for normal data. Hence, a test can be
developed to determine if the sample skewness is significantly different from
zero. If it is, the hypothesis of normality should be rejected. The statistic,
$z_s$, is, under the null hypothesis of normality, approximately normally
distributed.
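The skewness statistic can be sketched as follows. This is an illustrative implementation of the transformation as it is commonly presented for D'Agostino's skewness test, not necessarily the exact computation this procedure performs; the function name `skewness_z` is hypothetical, and the n ≥ 8 check reflects the sample size usually recommended for the approximation rather than anything stated in this text.

```python
import math


def skewness_z(sample):
    """Approximately standard-normal statistic for the sample skewness."""
    n = len(sample)
    if n < 8:
        raise ValueError("approximation is usually recommended for n >= 8")
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    g1 = m3 / m2 ** 1.5  # sample skewness coefficient
    y = g1 * math.sqrt((n + 1) * (n + 3) / (6.0 * (n - 2)))
    beta2 = (3.0 * (n * n + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1.0 + math.sqrt(2.0 * (beta2 - 1.0))
    delta = 1.0 / math.sqrt(0.5 * math.log(w2))
    alpha = math.sqrt(2.0 / (w2 - 1.0))
    # asinh(t) equals ln(t + sqrt(t^2 + 1)).
    return delta * math.asinh(y / alpha)


right_skewed = [1, 1, 1, 2, 2, 3, 4, 8, 15, 30]
print(skewness_z(right_skewed))  # positive for a long right tail
```

A perfectly symmetric sample gives a value of zero, and mirroring a sample about zero flips the sign of the statistic.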
D'Agostino (1990) also describes a normality test based on the kurtosis
coefficient. Recall that for the normal distribution, the theoretical value of
the kurtosis coefficient is 3. Hence, a test can be developed to determine if
the sample kurtosis coefficient is significantly different from 3. If it is,
the hypothesis of normality should be rejected. The statistic, $z_k$, is, under
the null hypothesis of normality, approximately normally distributed for
sample sizes n > 20.
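A comparable sketch for the kurtosis statistic is given below. It follows the transformation as it is commonly presented for this test and should be read as an illustration under that assumption, not as this procedure's exact code. The function name `kurtosis_z` is hypothetical, and the sample-size check mirrors the n > 20 condition stated above.

```python
import math


def kurtosis_z(sample):
    """Approximately standard-normal statistic for the sample kurtosis."""
    n = len(sample)
    if n <= 20:
        raise ValueError("approximation is stated for n > 20")
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    b2 = m4 / m2 ** 2  # kurtosis coefficient (equals 3 for a normal distribution)
    expected = 3.0 * (n - 1) / (n + 1)
    variance = (24.0 * n * (n - 2) * (n - 3)
                / ((n + 1) ** 2 * (n + 3) * (n + 5)))
    x = (b2 - expected) / math.sqrt(variance)
    sqrt_beta1 = (6.0 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
                  * math.sqrt(6.0 * (n + 3) * (n + 5)
                              / (n * (n - 2) * (n - 3))))
    a = 6.0 + 8.0 / sqrt_beta1 * (
        2.0 / sqrt_beta1 + math.sqrt(1.0 + 4.0 / sqrt_beta1 ** 2))
    denom = 1.0 + x * math.sqrt(2.0 / (a - 4.0))
    # Signed cube root; denom == 0 is not handled in this sketch.
    root = math.copysign(((1.0 - 2.0 / a) / abs(denom)) ** (1.0 / 3.0), denom)
    return ((1.0 - 2.0 / (9.0 * a)) - root) / math.sqrt(2.0 / (9.0 * a))


flat = [float(v) for v in range(-10, 11)]       # flatter than normal
heavy = [-50.0] + [-1.0, 0.0, 1.0] * 7 + [50.0]  # heavy tails
print(kurtosis_z(flat), kurtosis_z(heavy))
```

Flat, uniform-like samples produce a negative value and heavy-tailed samples a positive one.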
D'Agostino (1990) also describes a normality test that combines the tests for
skewness and kurtosis. The statistic, $K^2 = z_s^2 + z_k^2$, is approximately
distributed as a chi-square with two degrees of freedom.
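The combination step can be illustrated with a short sketch that assumes the two normal statistics for skewness and kurtosis have already been computed. The function name `k_squared_test` is hypothetical. For a chi-square distribution with two degrees of freedom, the upper-tail probability has the closed form exp(-K²/2), so no statistical library is needed.

```python
import math


def k_squared_test(z_s, z_k):
    """Combine skewness and kurtosis z statistics into K^2 and a p-value."""
    k2 = z_s ** 2 + z_k ** 2
    # Upper-tail probability of a chi-square with 2 degrees of freedom.
    p_value = math.exp(-k2 / 2.0)
    return k2, p_value


k2, p = k_squared_test(1.0, -2.0)
print(k2, p)  # K^2 = 5.0, p is about 0.082
```

A small probability value indicates that the combined departure in skewness and kurtosis is larger than would be expected for normal data.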