Forward Stepwise Regression

Forward Stepwise Regression is a stepwise regression approach that starts from the null model and adds a variable that improves the model the most, one at a time, until the stopping criterion is met. The criterion for predictor entry into the model is based on the F-statistic and corresponding p-value (p-value must be less than the Alpha-to-Enter). It is also known as Forward Selection regression.

How To

Run: Statistics→Regression → Forward Stepwise Regression...

Select the dependent variable (Response) and independent variables (Predictors).

Enter if alpha < option defines the Alpha-to-Enter value. At each step it is used to select candidate variables for entry, with partial F p-value less or equal to the alpha-to-enter. The default value is 0.05.

o   Select the Show correlations option to include the correlation coefficients matrix to the report.

o   Select the Show descriptive statistics option to display the mean, variance and standard deviation of each term.

o   Select the Show results for each step option to show the regression model and summary statistics for each step.


The criterion to enter a variable at each step is the partial F p-value (that is over the alpha-to-enter). When none of the unselected variables meet the entry criterion, the forward selection command terminates the process. Forward stepwise algorithm is greedy version of the best subsets regression, so it may not result with the best model (model with lowest SSE). It is sensitive to choice of the alpha-to-entry value.


The report shows regression statistics for the final regression model. If the Show result for each step option is selected, the regression model, fit statistics and partial correlations are displayed for all variables entered at a selection step. A correlation coefficients matrix and descriptive statistics for predictors are included to the report if the corresponding options are selected.


R2 (Coefficient of determination, R-squared) - is the square of the sample correlation coefficient between the Predictors (independent variables) and Response (dependent variable).

Adjusted R2 (Adjusted R-squared) - is a modification of R2 that adjusts for the number of explanatory terms in a model. While R2 increases when extra explanatory variables are added to the model, the adjusted R2 increases only if the added term is a relevant one. It could be useful for comparing the models with different numbers of predictors. Adjusted R2 is computed using the formula:
 where k is the number of predictors.

S – the estimated standard deviation of the error in the model.

MS (Mean Square) - the estimate of the variation accounted for by this term,


F - the F-test value for the model.

p-level - the significance level of the F-test. Values less than  (0.05) show that the model estimated by the regression procedure is significant.

VIF – variance inflation factor, measures the inflation in the variances of the parameter estimates due to collinearities among the predictors. It is used to detect multicollinearity problems. The larger the value is, the stronger the linear relationship between the predictor and remaining predictors. VIF equal to 1 indicates the absence of linear relationship with other predictors (there is no multicollinearity). VIF value between 1 and 5 indicates moderate multicollinearity, and values greater than 5 suggest that a high degree of multicollinearity is present. It is a subject of debate whether there is a formal value for determining presence of multicollinearity: in some situations even values greater than 10 can be safely ignored – when high values caused by complicated models with dummy variables or variables that are powers of other variables. But in weaker models even values above 2 or 3 may be a cause for concern: for example, for ecological studies Zuur, et al. (2010) recommended a threshold of VIF=3.

TOL - the tolerance value for the parameter estimates, it is defined as TOL = 1 / VIF.

Partial Correlations are correlations between each predictor and the outcome variable excluding the effect of other variables.


Hocking, R. R. (1976) "The Analysis and Selection of Variables in Linear Regression," Biometrics, 32

Nargundkar R. (2008) Marketing Research: Text and Cases. Third edition. Tata McGraw-Hill Publishing Company Ltd.

Neter, J., Wasserman, W. and Kutner, M. H. (1996). Applied Linear Statistical Models, Irwin, Chicago.

Zuur, A. F., Ieno, E. N. and Elphick, C. S. (2010), A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1: 3–14.