StatPlus 2007 Professional Help

Purpose

    This tool performs a simple analysis of variance, testing the hypothesis that means from two or more samples are equal (drawn from populations with the same mean). This technique expands on the tests for two means, such as the t-test.

Preparations

    Run Statistics→Analysis of Variance (ANOVA)→One-way ANOVA....

Results

    At the heart of ANOVA is the fact that variances can be divided up, that is, partitioned. Remember that the variance is computed as the sum of squared deviations from the overall mean, divided by N-1 (sample size minus one). Thus, given a certain N, the variance is a function of the sums of (deviation) squares, or SS for short. Partitioning of variance works as follows. Consider the following data set:

 

                         Group 1   Group 2
Observation 1               2         6
Observation 2               3         7
Observation 3               1         5
Mean                        2         6
Sums of Squares (SS)        2         2
Overall Mean                     4
Total Sums of Squares           28

The means for the two groups are quite different (2 and 6, respectively). The sums of squares within each group are equal to 2. Adding them together, we get 4. If we now repeat these computations, ignoring group membership, that is, if we compute the total SS based on the overall mean, we get the number 28. In other words, computing the variance (sums of squares) based on the within-group variability yields a much smaller estimate of variance than computing it based on the total variability (the overall mean). The reason for this in the above example is of course that there is a large difference between means, and it is this difference that accounts for the difference in the SS. In fact, if we were to perform an ANOVA on the above data, we would get the following result:

 

MAIN EFFECT     SS     df     MS      F       p
Effect         24.0     1    24.0    24.0    .008
Error           4.0     4     1.0

As you can see, in the above table the total SS (28) was partitioned into the SS due to within-group variability (2+2=4; see the second row of the spreadsheet) and variability due to differences between means (28-(2+2)=24; see the first row of the spreadsheet).
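The partitioning described above can be reproduced with a few lines of Python. This is only an illustrative sketch, not StatPlus code: the helper names (mean, sum_of_squares) and the variable names are assumptions introduced for the example, and the data are the six observations from the table above.

```python
# Illustrative sketch (not StatPlus code): partitioning the total SS
# for the two-group example data set.
groups = {
    "Group 1": [2, 3, 1],
    "Group 2": [6, 7, 5],
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sum_of_squares(values, center):
    """Sum of squared deviations of the values from a given center."""
    return sum((v - center) ** 2 for v in values)


# Within-group SS: deviations are taken from each group's own mean.
ss_within = {name: sum_of_squares(obs, mean(obs)) for name, obs in groups.items()}
ss_error = sum(ss_within.values())

# Total SS: group membership is ignored, deviations are taken from the overall mean.
all_obs = [v for obs in groups.values() for v in obs]
overall_mean = mean(all_obs)
ss_total = sum_of_squares(all_obs, overall_mean)

# Between-groups SS: deviations of the group means from the overall mean,
# weighted by the number of observations in each group.
ss_effect = sum(len(obs) * (mean(obs) - overall_mean) ** 2 for obs in groups.values())

print("SS within each group:", ss_within)
print("SS Error:", ss_error)
print("SS Total:", ss_total)
print("SS Effect:", ss_effect)
```

Computing SS Effect directly from the group means, rather than as the difference 28-4, gives an independent check that the two components add up to the total SS.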

SS Error and SS Effect.

The within-group variability (SS) is usually referred to as Error variance. This term denotes the fact that we cannot readily explain or account for it in the current design. However, the SS Effect we can explain. Namely, it is due to the differences in means between the groups. Put another way, group membership explains this variability because we know that it is due to the differences in means.

Significance testing.

The basic idea of statistical significance testing is discussed in Elementary Concepts. Elementary Concepts also explains why many statistical tests represent ratios of explained to unexplained variability. ANOVA is a good example of this. Here, we base the test on a comparison of the variance due to the between-groups variability (called Mean Square Effect, or MSeffect) with the variance due to the within-group variability (called Mean Square Error, or MSerror; this term was first used by Edgeworth, 1885). Each mean square is the corresponding SS divided by its degrees of freedom. Under the null hypothesis (that there are no mean differences between groups in the population), we would still expect some minor random fluctuation in the means for the two groups when taking small samples (as in our example). Therefore, under the null hypothesis, the variance estimated from the within-group variability should be about the same as the variance estimated from the between-groups variability. We can compare those two estimates of variance via the F-test, which tests whether the ratio of the two variance estimates (MSeffect divided by MSerror) is significantly greater than 1. In our example above, F = 24.0 with 1 and 4 degrees of freedom and p = .008, so the test is highly significant, and we would in fact conclude that the means for the two groups are significantly different from each other.
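The F-test for the example can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the routine StatPlus uses: the function names (one_way_anova, f_upper_tail, reg_inc_beta, _beta_cf) are hypothetical, and the p-value is obtained from the regularized incomplete beta function evaluated with a standard continued-fraction expansion. The sketch also assumes that MSerror is greater than zero.

```python
# Illustrative sketch (not StatPlus code): one-way ANOVA F-test using
# only the Python standard library.
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = exp(lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _beta_cf(a, b, x) / a
    return 1.0 - bt * _beta_cf(b, a, 1.0 - x) / b


def f_upper_tail(f, df1, df2):
    """Probability that an F(df1, df2) variable exceeds f."""
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))


def one_way_anova(groups):
    """Return SS, df, MS, F and p for a list of groups (assumes MSerror > 0)."""
    all_obs = [v for g in groups for v in g]
    grand_mean = sum(all_obs) / len(all_obs)
    means = [sum(g) / len(g) for g in groups]
    ss_effect = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_effect = len(groups) - 1
    df_error = len(all_obs) - len(groups)
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    f = ms_effect / ms_error
    return {
        "ss_effect": ss_effect, "ss_error": ss_error,
        "df_effect": df_effect, "df_error": df_error,
        "ms_effect": ms_effect, "ms_error": ms_error,
        "F": f, "p": f_upper_tail(f, df_effect, df_error),
    }


result = one_way_anova([[2, 3, 1], [6, 7, 5]])
for key, value in result.items():
    print(key, "=", value)
```

For the example data this reproduces the values in the ANOVA table: MSeffect = 24.0, MSerror = 1.0, F = 24.0, and a p-value of about .008.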

Summary of the basic logic of ANOVA.

To summarize the discussion up to this point, the purpose of analysis of variance is to test differences in means (for groups or variables) for statistical significance. This is accomplished by analyzing the variance, that is, by partitioning the total variance into the component that is due to true random error (i.e., within-group SS) and the components that are due to differences between means. These latter variance components are then tested for statistical significance, and, if significant, we reject the null hypothesis of no differences between means, and accept the alternative hypothesis that the means (in the population) are different from each other.