Compare Multiple Independent Samples

The command compares multiple independent samples using the Kruskal-Wallis test (Kruskal-Wallis ANOVA by Ranks) and Mood's median test.

The Kruskal-Wallis test is the nonparametric alternative to one-way between-groups ANOVA. The test assumes that each sample (treatment condition) contains at least five observations (scores). Use the Kruskal-Wallis test when three or more conditions (samples) need to be compared and each condition is performed by a different group of participants, or when the assumptions of a parametric test are not met.

The median test is a special case of the chi-square test for independence. The test has poor statistical power for samples drawn from normal or short-tailed distributions, but it is particularly effective in detecting a shift in location for symmetric and heavy-tailed distributions.

# How To

For unstacked data (each column is a sample):

o Run the Statistics → Nonparametric Statistics → Compare Multiple Independent Samples command.

o Select variables to compare.

For stacked data:

o Run the Statistics → Nonparametric Statistics → Compare Multiple Independent Samples (*with group variable*) command.

o Select a variable with observations (Variable) and a text or numeric variable with group names (Groups).
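
The two data layouts described above carry the same information. As an illustration only, the sketch below converts between them in plain Python; the function names `unstack` and `stack` and the sample values are hypothetical and are not part of the program.

```python
from collections import defaultdict


def unstack(values, groups):
    """Turn stacked data (one value column, one group column) into one list per group."""
    samples = defaultdict(list)
    for value, group in zip(values, groups):
        samples[group].append(value)
    return dict(samples)


def stack(samples):
    """Turn unstacked data (one list per group) into a value column and a group column."""
    values, groups = [], []
    for name, sample in samples.items():
        for value in sample:
            values.append(value)
            groups.append(name)
    return values, groups


# Hypothetical stacked data: the "Variable" column and the "Groups" column.
variable = [5.1, 4.8, 6.0, 5.5]
group_names = ["A", "B", "A", "B"]
columns = unstack(variable, group_names)
```

Here `columns` holds one list of observations per group name, which corresponds to the unstacked layout where each column is a sample.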

# Results

The report includes Kruskal-Wallis ANOVA and median test results.

The Kruskal-Wallis test is used for comparing two or more independent samples of equal or different sample sizes. It extends the Mann–Whitney U test to more than two groups.

The null hypothesis is that the samples come from populations with identical locations. When the distribution of each sample is assumed to have the same shape, the null hypothesis can be stated as: the population medians of all samples are equal. The alternative hypothesis H_{1} is that the population median of at least one sample differs from the population median of at least one other sample.

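The Kruskal-Wallis test statistic **H** is defined as:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1), \qquad R_i = \sum_{j=1}^{n_i} r_{ij}$$

where $N$ is the total number of observations across all samples, $k$ is the number of samples, $n_i$ is the number of observations in the $i$-th sample, and $r_{ij}$ is the rank (among all $N$ observations) of the $j$-th observation in the $i$-th sample. The test statistic H is approximately chi-square distributed with $k - 1$ degrees of freedom.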
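
As a rough illustration of how H can be computed by hand, the sketch below assumes the standard rank-sum form H = 12 / (N(N+1)) · Σ R_i² / n_i − 3(N+1), where R_i is the sum of the ranks in sample i. Tied values receive the average of their ranks, and no tie correction is applied; whether the program applies one is not stated here. The function name `kruskal_wallis_h` and the data are hypothetical.

```python
def kruskal_wallis_h(samples):
    """Return (H, degrees of freedom) for a list of samples (lists of numbers)."""
    # Pool all observations, remembering which sample each one came from.
    pooled = sorted(
        (value, index) for index, sample in enumerate(samples) for value in sample
    )
    n_total = len(pooled)
    rank_sums = [0.0] * len(samples)

    pos = 0
    while pos < n_total:
        # Find the run of tied values that starts at position pos.
        end = pos
        while end < n_total and pooled[end][0] == pooled[pos][0]:
            end += 1
        # Ranks pos+1 .. end are shared; every tied value gets their average.
        midrank = (pos + 1 + end) / 2
        for _, index in pooled[pos:end]:
            rank_sums[index] += midrank
        pos = end

    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h, len(samples) - 1


# Three hypothetical samples of five observations each.
h, df = kruskal_wallis_h([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
```

For this data the rank sums are 15, 40 and 65, so H = 0.05 · 1210 − 48 = 12.5 with 2 degrees of freedom.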

If the p-level is less than or equal to α (0.05 by default), the null hypothesis is rejected and the alternative hypothesis, that the differences between some of the medians are statistically significant, is accepted (*assuming the same shape of the distribution*).
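
The decision rule can be sketched as follows, assuming the p-level is taken from the upper tail of the chi-square distribution with k − 1 degrees of freedom. The helper names `chi2_sf` and `decide` are hypothetical; the tail probability is built from the closed forms for 1 and 2 degrees of freedom plus a standard recurrence, using only the Python standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # closed form for df = 1
    # Each step raises the degrees of freedom by 2:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


def decide(statistic, df, alpha=0.05):
    """Return (p-level, True if the null hypothesis is rejected at level alpha)."""
    p_level = chi2_sf(statistic, df)
    return p_level, p_level <= alpha


# Hypothetical example: H = 12.5 with k = 3 samples, so df = 2.
p_level, reject = decide(12.5, 2)
```

With 2 degrees of freedom the tail probability reduces to exp(−H/2), so H = 12.5 gives a p-level of about 0.0019 and the null hypothesis is rejected at α = 0.05.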

The median test is an extension of the paired sign test and is similar to the sign test in terms of robustness (robust against outliers) and power (generally low). The null hypothesis H_{0} is that the samples come from the same distribution; when the populations are assumed to have the same distribution shape, H_{0} states that the populations have the same median.

The test statistic is a chi-squared statistic for a *2 × k* contingency table (the rows contain the numbers of observations above and below the grand median for each sample):

$$\chi^2 = \frac{N^2}{ab} \sum_{i=1}^{k} \frac{\left(a_i - \frac{n_i a}{N}\right)^2}{n_i}$$

where $k$ is the number of samples, $a$ is the number of observations greater than the grand median across all samples, $b$ is the number of observations less than or equal to the grand median across all samples, $N$ is the total number of observations across all samples, $n_i$ is the number of observations in the $i$-th sample, and $a_i$ is the number of observations greater than the grand median in the $i$-th sample.
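
A minimal sketch of the median test statistic follows. It assumes the usual chi-squared form for a 2 × k table, written as χ² = N² / (ab) · Σ (a_i − n_i·a/N)² / n_i, where a_i counts the observations in sample i that lie above the grand median and n_i is the size of sample i. The function name `median_test_statistic` and the data are hypothetical.

```python
import statistics


def median_test_statistic(samples):
    """Return (chi-squared statistic, degrees of freedom) for Mood's median test."""
    pooled = [value for sample in samples for value in sample]
    n_total = len(pooled)
    grand_median = statistics.median(pooled)

    # a_i: number of observations in each sample greater than the grand median.
    above = [sum(1 for value in sample if value > grand_median) for sample in samples]
    a = sum(above)        # observations above the grand median, all samples
    b = n_total - a       # observations at or below the grand median
    if a == 0 or b == 0:
        raise ValueError("all observations fall on one side of the grand median")

    total = sum(
        (a_i - len(sample) * a / n_total) ** 2 / len(sample)
        for a_i, sample in zip(above, samples)
    )
    chi2 = n_total ** 2 / (a * b) * total
    return chi2, len(samples) - 1


# Three hypothetical samples of five observations each.
chi2, df = median_test_statistic(
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
)
```

For this data the grand median is 8, the counts above it are 0, 2 and 5, and the statistic works out to 570/56 ≈ 10.18 with 2 degrees of freedom.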

The test results are interpreted in the same way as the Kruskal-Wallis test results: if the p-level is less than or equal to α (0.05 by default), the alternative hypothesis, that at least two medians are different, is accepted (*assuming the same shape of the distribution*).

# References

Chalmer, B. (1986). Understanding Statistics. CRC Press.

Corder, G.W. & Foreman, D.I. (2014). Nonparametric Statistics: A Step-by-Step Approach, Wiley.

Kruskal, W. H., & Wallis, W. A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47(260), 583–621.

Siegel, S. (1956). Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.