Backward Stepwise Regression

Backward Stepwise Regression is a stepwise regression approach that begins with the full model (all candidate predictors included) and at each step eliminates a variable from the regression model to find a reduced model that best explains the data. It is also known as Backward Elimination regression.

The stepwise approach is useful because it reduces the number of predictors, which mitigates the multicollinearity problem and is one of the ways to address overfitting.

How To

Run: Statistics → Regression → Backward Stepwise Regression...

Select the dependent variable (Response) and independent variables (Predictors).

The Remove if alpha > option defines the Alpha-to-Remove value. At each step it is used to select candidate variables for elimination: variables whose partial F p-value is greater than or equal to the alpha-to-remove. The default value is 0.10.

Select the Show correlations option to include the correlation coefficients matrix in the report.

Select the Show descriptive statistics option to include the mean, variance and standard deviation of each term in the report.

Select the Show results for each step option to show the regression model and summary statistics for each step.


The report shows regression statistics for the final regression model. If the Show results for each step option is selected, the regression model, fit statistics and partial correlations are displayed at each removal step. The correlation coefficients matrix and descriptive statistics for predictors are displayed if the corresponding options are selected.

The command removes predictors from the model in a stepwise manner. It starts from the full model with all variables included. At each step, the predictor with the largest p-value is eliminated, provided that p-value is greater than or equal to the alpha-to-remove. When all remaining variables meet the criterion to stay in the model, the backward elimination process stops.
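The elimination loop described above can be sketched in Python. This is a minimal illustration, not the implementation used by the command: it assumes NumPy and SciPy are available, always fits an intercept, and the function names `ols_p_values` and `backward_eliminate` are hypothetical. For a single coefficient the partial F-test is equivalent to the squared t-test, so the t-test p-values are used here.

```python
import numpy as np
from scipy import stats


def ols_p_values(X, y):
    """Two-sided p-values for each column of X (intercept added internally).

    For one coefficient the partial F statistic equals t squared,
    so these p-values coincide with the partial F p-values.
    """
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])       # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = n - design.shape[1]                       # residual degrees of freedom
    sigma2 = resid @ resid / dof                    # error variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)
    t = beta / np.sqrt(np.diag(cov))
    p = 2 * stats.t.sf(np.abs(t), dof)
    return p[1:]                                    # drop the intercept's p-value


def backward_eliminate(X, y, names, alpha_to_remove=0.10):
    """Remove the predictor with the largest p-value until all fall below alpha."""
    keep = list(range(X.shape[1]))                  # start from the full model
    while keep:
        p = ols_p_values(X[:, keep], y)
        worst = int(np.argmax(p))
        if p[worst] < alpha_to_remove:              # every remaining predictor stays
            break
        del keep[worst]                             # eliminate the weakest predictor
    return [names[i] for i in keep]


# Synthetic data (an assumption for illustration): y depends on x1 and x2 only.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
names = ["x1", "x2", "noise"]
selected = backward_eliminate(X, y, names)
print(selected)
```

The two strongly related predictors are retained. Whether the unrelated column is dropped depends on its p-value in the particular sample, since an irrelevant predictor still falls below the alpha-to-remove by chance about 10% of the time at the default setting.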

R2 (Coefficient of determination, R-squared) - is the square of the multiple correlation coefficient, that is, the sample correlation between the observed Response (dependent variable) and the values fitted from the Predictors (independent variables). In general, R2 is the proportion of response variable variation that is explained by its relationship with one or more predictor variables. Informally, R2 indicates how well the model fits the data. The larger R2 is, the more the total variation of Response is reduced by introducing the predictor variables. The definition of R2 is

$$R^2 = 1 - \frac{SS_{error}}{SS_{total}}$$

where SS_error is the residual sum of squares and SS_total is the total sum of squares of the Response about its mean.

Adjusted R2 (Adjusted R-squared) - is a modification of R2 that adjusts for the number of explanatory terms in a model. While R2 never decreases when extra explanatory variables are added to the model, the adjusted R2 increases only if the added term improves the fit more than would be expected by chance. It is useful for comparing models with different numbers of predictors. Adjusted R2 is computed using the formula

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{N - 1}{N - k - 1}$$

where N is the number of observations and k is the number of predictors.
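As a small worked example, R2 and adjusted R2 can be computed from observed and fitted values using the standard definitions, R2 = 1 - SS_error / SS_total and adjusted R2 = 1 - (1 - R2)(N - 1) / (N - k - 1). The function names and the sample numbers below are hypothetical, and the model is assumed to include an intercept.

```python
def r_squared(y, y_hat):
    """R2 = 1 - SS_error / SS_total for observed y and fitted y_hat."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1.0 - ss_error / ss_total


def adjusted_r_squared(r2, n, k):
    """Adjust R2 for k predictors fitted on n observations (intercept assumed)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical observed and fitted values from a one-predictor model.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(y, y_hat)                        # SS_total = 10, SS_error = 0.10
r2_adj = adjusted_r_squared(r2, n=len(y), k=1)
print(round(r2, 4), round(r2_adj, 4))           # 0.99 0.9867
```

The adjusted value is slightly lower than R2 because one degree of freedom is spent on the predictor.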

S – the estimated standard deviation of the error in the model.

MS (Mean Square) - the estimate of the variation accounted for by this term.

F - the F-test value for the model.

p-level - the significance level of the F-test. A value less than α (0.05) shows that the model estimated by the regression procedure is significant.
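The relationship between S, MS, F and p-level can be illustrated with a short sketch. It assumes SciPy is available and a model with an intercept, so the error degrees of freedom are N - k - 1. The function name `overall_f_test` and the input sums of squares are hypothetical.

```python
from scipy import stats


def overall_f_test(ss_regression, ss_error, n, k):
    """Return (F, p-level, S) for a model with k predictors and an intercept."""
    ms_regression = ss_regression / k           # mean square for regression
    ms_error = ss_error / (n - k - 1)           # mean square for error
    f_value = ms_regression / ms_error
    p_level = stats.f.sf(f_value, k, n - k - 1) # upper-tail F probability
    s = ms_error ** 0.5                         # estimated error standard deviation
    return f_value, p_level, s


# Hypothetical sums of squares: N = 5 observations, k = 1 predictor.
f_value, p_level, s = overall_f_test(ss_regression=9.9, ss_error=0.1, n=5, k=1)
print(f_value, p_level < 0.05, s)
```

Here F is 9.9 divided by 0.1 / 3, that is 297, and the p-level is well below 0.05, so the model would be reported as significant.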

VIF – variance inflation factor, measures the inflation in the variances of the parameter estimates due to collinearities among the predictors. It is used to detect multicollinearity problems.

TOL - the tolerance value for the parameter estimates. It is defined as TOL = 1 / VIF.
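VIF and TOL can be computed by regressing each predictor on all the others: the tolerance of predictor j is 1 - R2_j from that auxiliary regression, and VIF is its reciprocal. The sketch below assumes NumPy is available; the function name `vif_and_tolerance` and the data are hypothetical.

```python
import numpy as np


def vif_and_tolerance(X):
    """Return a list of (VIF, TOL) pairs, one per column of X."""
    n, k = X.shape
    results = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])   # intercept + other predictors
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        ss_error = resid @ resid
        ss_total = ((target - target.mean()) ** 2).sum()
        tol = ss_error / ss_total                        # equals 1 - R2_j
        results.append((1.0 / tol, tol))
    return results


# Uncorrelated predictors: no variance inflation.
X_orth = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
orth = vif_and_tolerance(X_orth)

# Strongly correlated predictors: large VIF, small tolerance.
X_corr = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 5.0]])
corr = vif_and_tolerance(X_corr)
print(orth, corr)
```

For the uncorrelated columns both VIF and TOL equal 1. For the correlated pair the VIF is about 29, which signals a multicollinearity problem.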

Partial Correlations are correlations between each predictor and the outcome variable after removing the effect of the other predictors.
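One common way to obtain a partial correlation is to regress both the predictor of interest and the response on the remaining predictors, then correlate the two sets of residuals. The sketch below follows that approach under the assumption that NumPy is available; the function name `partial_correlation` and the data are hypothetical, and the report may compute the value differently.

```python
import numpy as np


def partial_correlation(X, y, j):
    """Correlation between X[:, j] and y with the other columns of X held fixed."""
    n = X.shape[0]
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])

    def residuals(v):
        # Part of v that is not explained by the intercept and other predictors.
        beta, *_ = np.linalg.lstsq(others, v, rcond=None)
        return v - others @ beta

    rx = residuals(X[:, j])
    ry = residuals(y)
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))


# Hypothetical noise-free data: y is an exact linear function of both predictors.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = 2.0 * X[:, 0] + 3.0 * X[:, 1]
pc_x1 = partial_correlation(X, y, 0)
pc_neg = partial_correlation(X, -2.0 * X[:, 0] + 3.0 * X[:, 1], 0)
print(pc_x1, pc_neg)
```

Because the response here is an exact linear function of the predictors, the partial correlation of the first predictor is 1 when its coefficient is positive and -1 when it is negative.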

