StatPlus 2007 Professional Help

Purpose

   By entering frequencies into a 2 x 2 table, you can calculate various statistics to evaluate the relationship between two dichotomous variables. Thus, the 2 x 2 option can be used as an alternative to correlation when the two variables of interest are dichotomous.

Preparations

    Select a 2x2 cell range and run the Statistics→Nonparametric Statistics→2x2 Tables command.

Results

The Pearson Chi-square is the most common test for significance of the relationship between categorical variables. This measure is based on the fact that we can compute the expected frequencies in a two-way table (i.e., the frequencies that we would expect if there were no relationship between the variables). For example, suppose we ask 20 males and 20 females to choose between two brands of soda pop (brands A and B). If there is no relationship between preference and gender, then we would expect males and females to choose each brand in about the same proportions. The Chi-square test becomes increasingly significant as the observed numbers deviate further from this expected pattern; that is, the more the pattern of choices differs between males and females.
    The value of the Chi-square and its significance level depend on the overall number of observations and the number of cells in the table. Consistent with the principles discussed in Elementary Concepts, relatively small deviations of the relative frequencies across cells from the expected pattern will prove significant if the number of observations is large.
    The only assumption underlying the use of the Chi-square (other than random selection of the sample) is that the expected frequencies are not very small. The reason is that the Chi-square inherently tests the underlying probabilities in each cell; when the expected cell frequencies fall, for example, below 5, those probabilities cannot be estimated with sufficient precision. For further discussion of this issue, refer to Everitt (1977), Hays (1988), or Kendall and Stuart (1979).
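
As an illustration of the expected-frequency idea described above, here is a minimal Python sketch of the Pearson Chi-square for a 2 x 2 table. The counts are hypothetical (they are not taken from StatPlus), the function names are made up for this example, and the p-value shortcut assumes one degree of freedom, which holds only for a 2 x 2 table. It is not meant to reproduce StatPlus output exactly.

```python
import math

def expected_frequencies(table):
    """Expected counts if rows and columns are unrelated:
    (row total * column total) / grand total for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Return (chi-square, p-value) for a 2 x 2 table of counts."""
    expected = expected_frequencies(table)
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data: rows are males and females (20 each),
# columns are choices of brand A and brand B.
observed = [[15, 5],
            [8, 12]]

expected = expected_frequencies(observed)
chi2, p = pearson_chi_square(observed)
print("Expected:", expected)
print("Chi-square = %.4f, p = %.4f" % (chi2, p))
```

In this made-up table every expected frequency is above 5, so the assumption discussed above is satisfied.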

Yates corrected Chi-square. The approximation of the Chi-square statistic in small 2 x 2 tables can be improved by reducing the absolute value of differences between expected and observed frequencies by 0.5 before squaring (Yates' correction). This correction, which makes the estimation more conservative, is usually applied when the table contains only small observed frequencies, so that some expected frequencies become less than 10 (for further discussion of this correction, see Conover, 1974; Everitt, 1977; Hays, 1988; Kendall & Stuart, 1979; and Mantel, 1974).
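
The correction described above can be sketched in Python as follows. The counts are hypothetical and the function name is invented for this example. The guard that stops the corrected difference from going below zero is an assumption of this sketch, not something stated in this help topic.

```python
import math

def yates_chi_square(table):
    """Return (chi-square, p-value) for a 2 x 2 table with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            e = r * c / n                      # expected frequency
            # Reduce |observed - expected| by 0.5 before squaring.
            # Assumption: never let the corrected difference drop below 0.
            diff = max(abs(table[i][j] - e) - 0.5, 0.0)
            chi2 += diff ** 2 / e
    # 1 degree of freedom for a 2 x 2 table.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data (same layout as a 2 x 2 range selected in the sheet).
observed = [[15, 5],
            [8, 12]]

chi2_yates, p_yates = yates_chi_square(observed)
print("Yates Chi-square = %.4f, p = %.4f" % (chi2_yates, p_yates))
```

For these counts the corrected statistic is smaller than the uncorrected Pearson Chi-square, which is what "more conservative" means in practice.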

Phi-square. The Phi-square is a measure of correlation between the two categorical variables in the table.
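
This help topic does not give the formula. The sketch below assumes the standard textbook definition, in which Phi-square is the uncorrected Pearson Chi-square divided by the total number of observations. For a 2 x 2 table with cells a, b, c, d this reduces to the closed form used in the code. The counts and the function name are hypothetical.

```python
def phi_square(table):
    """Phi-square for a 2 x 2 table [[a, b], [c, d]].

    Assumes the standard definition: Pearson Chi-square / N,
    which for a 2 x 2 table equals
    (a*d - b*c)^2 / ((a+b) * (c+d) * (a+c) * (b+d)).
    """
    (a, b), (c, d) = table
    numerator = (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Hypothetical data.
observed = [[15, 5],
            [8, 12]]

phi2 = phi_square(observed)
print("Phi-square = %.4f" % phi2)
```

Under this definition the value is 0 when the two variables are unrelated in the table and 1 when all observations fall on one diagonal.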

Fisher exact test. Given the marginal frequencies in the table, and assuming that in the population the two factors in the table are not related, how likely is it to obtain cell frequencies as uneven as, or more uneven than, the ones that were observed? For small n, this probability can be computed exactly by counting all possible tables that can be constructed based on the marginal frequencies. This is the underlying rationale for the Fisher exact test. It computes the exact probability under the null hypothesis of obtaining the current distribution of frequencies across cells, or one that is more uneven.
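
The counting argument above can be sketched in Python by enumerating every table that shares the observed marginal frequencies. The counts and the function name are hypothetical. This topic does not say whether StatPlus reports a one-tailed or a two-tailed probability, so the sketch returns both. The two-tailed rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention and is an assumption.

```python
from math import comb

def fisher_exact(table):
    """Return (one_tailed, two_tailed) exact probabilities for [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability of the table whose top-left cell is x,
        # with all marginal totals held fixed (hypergeometric).
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)   # smallest possible top-left cell
    highest = min(row1, col1)      # largest possible top-left cell
    p_observed = prob(a)

    # One-tailed: the observed table or tables with a larger top-left cell.
    one_tailed = sum(prob(x) for x in range(a, highest + 1))
    # Two-tailed (assumed convention): all tables no more likely than
    # the observed one; the small tolerance absorbs rounding in ties.
    two_tailed = sum(prob(x) for x in range(lowest, highest + 1)
                     if prob(x) <= p_observed * (1 + 1e-9))
    return one_tailed, two_tailed

# Hypothetical small table with all marginal totals equal to 4.
observed = [[3, 1],
            [1, 3]]

p_one, p_two = fisher_exact(observed)
print("One-tailed p = %.4f, two-tailed p = %.4f" % (p_one, p_two))
```

With marginal totals of 4 there are only five possible tables, which is why exact enumeration is practical for small n.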