Multiple Regression

Multiple regression is a statistical technique that explores how several independent (predictor) variables influence a single dependent (criterion) variable. It’s like understanding how different ingredients in a recipe affect the final dish’s taste. In multiple regression, we predict the outcome (dependent variable) based on the values of two or more factors (independent variables), using a special equation:

y = b1x1 + b2x2 + … + bnxn + c

In this equation:

  • y represents the outcome we’re trying to predict.
  • bi (where i ranges from 1 to n) are the regression coefficients. Each coefficient shows how much the outcome changes with a one-unit change in its predictor variable, holding the other predictors constant.
  • xi are the predictor variables, the factors you think affect the outcome.
  • c is the constant term, the value of y when all xi are 0.

For example, consider predicting a student’s exam score based on their study habits, nutrition, and sleep. Multiple regression helps us understand how each of these factors contributes to the student’s performance.
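The exam-score example can be sketched with ordinary least squares in NumPy. The study, sleep, and score figures below are invented purely for illustration:

```python
import numpy as np

# Hypothetical data: hours studied, hours slept, and exam scores
# (all values invented for illustration).
study = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
sleep = np.array([6.0, 7.0, 5.0, 8.0, 6.0, 7.0])
score = np.array([55.0, 62.0, 64.0, 78.0, 79.0, 88.0])

# Design matrix with a leading column of ones for the constant term c.
X = np.column_stack([np.ones_like(study), study, sleep])

# Ordinary least squares solves for [c, b1, b2] in y = b1*x1 + b2*x2 + c.
coeffs, *_ = np.linalg.lstsq(X, score, rcond=None)
c, b1, b2 = coeffs

# Predict a score for a new student: 9 hours of study, 7 hours of sleep.
predicted = c + b1 * 9 + b2 * 7
```

Statistical packages such as SPSS carry out exactly this least-squares fit behind the scenes, while also reporting standard errors and significance tests for each coefficient.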

To perform multiple regression in SPSS, you navigate through the menu: Analyze → Regression → Linear. This process allows you to input your variables and analyze their relationships.

Key Assumptions in Multiple Regression:

  1. Model Specification: Ensure the model includes all relevant variables and accurately reflects the relationships being studied.
  2. Linearity: The relationship between the predictors and the outcome should be linear.
  3. Normality: The residuals (the differences between observed and predicted outcomes) should follow a normal distribution.
  4. Homoscedasticity: The variance (spread of values) should be consistent across all levels of the predictors.
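A rough, code-level way to inspect some of these assumptions is to examine the residuals of a fitted model. The sketch below uses simulated data and NumPy; the split-half spread comparison is only a crude stand-in for a formal homoscedasticity test:

```python
import numpy as np

# Simulated predictors and outcome, invented for illustration.
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, 50)
x2 = rng.uniform(0, 10, 50)
y = 3.0 * x1 + 1.5 * x2 + 5.0 + rng.normal(0, 1, 50)

# Fit by ordinary least squares and compute residuals.
X = np.column_stack([np.ones_like(x1), x1, x2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coeffs
residuals = y - fitted

# Homoscedasticity: residual spread should be similar across
# low and high fitted values (a crude split-half check; in practice
# you would plot residuals against fitted values instead).
order = np.argsort(fitted)
low, high = residuals[order[:25]], residuals[order[25:]]
print("spread (low fitted):", low.std())
print("spread (high fitted):", high.std())
```

If the two spreads differ sharply, or a residual plot shows a curve or funnel shape, the linearity or homoscedasticity assumption is in doubt.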

Understanding Key Terms:

R (Correlation Coefficient): This indicates the strength and direction of the relationship between the observed and predicted values of the outcome.

R² (Coefficient of Determination): This value tells us the percentage of the outcome variable’s variance that’s explained by the predictor variables. It’s like saying, “This much of the outcome can be predicted by our model.”

Adjusted R²: This adjusts R² to account for the number of predictors in the model, providing a more realistic estimate of how the model would perform on new data.

Beta Value: This measures the impact of each predictor variable on the outcome, standardized so that the effects of variables measured on different scales can be compared directly.

In essence, multiple regression allows us to dissect the influence of various factors on an outcome, providing insights into how changes in these factors might affect the result. This makes it a powerful tool for prediction and understanding complex relationships.
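Assuming a model fitted by ordinary least squares, the key quantities above can be computed directly. The data below are simulated purely for illustration:

```python
import numpy as np

# Simulated data: two predictors on different scales and an outcome.
rng = np.random.default_rng(1)
x1 = rng.uniform(0, 10, 40)     # e.g. hours studied
x2 = rng.uniform(0, 100, 40)    # e.g. minutes of commute
y = 2.0 * x1 - 0.1 * x2 + 50.0 + rng.normal(0, 2, 40)

n, p = len(y), 2                # n observations, p predictors
X = np.column_stack([np.ones(n), x1, x2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coeffs

# R: correlation between observed and predicted outcome values.
R = np.corrcoef(y, fitted)[0, 1]

# R squared: share of the outcome's variance the model explains.
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Adjusted R squared: penalizes R squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# Beta values: slopes rescaled by the predictor and outcome standard
# deviations, so predictors on different scales become comparable.
beta1 = coeffs[1] * x1.std() / y.std()
beta2 = coeffs[2] * x2.std() / y.std()
```

Note that for a model with an intercept, R² equals the squared correlation between the observed and fitted values, which is why SPSS reports both R and R² from the same fit.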

Resources

Achen, C. H. (1982). Interpreting and using regression. Newbury Park, CA: Sage Publications.

Afifi, A. A., Kotlerman, J. B., Ettner, S. L., & Cowan, M. (2007). Methods for improving regression analysis for skewed continuous or counted responses. Annual Review of Public Health, 28, 95-111.

Aguinis, H. (2004). Regression analysis for categorical moderators. New York: Guilford Press.

Algina, J., & Olejnik, S. (2003). Sample size tables for correlation analysis with applications in partial correlation and multiple regression analysis. Multivariate Behavioral Research, 38(3), 309-323.

Allison, P. D. (1999). Multiple regression. Thousand Oaks, CA: Pine Forge Press.

Anderson, E. B. (2004). Latent regression analysis based on the rating scale model. Psychological Science, 46(2), 209-226.

Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression diagnostics: Identifying influential data and sources of collinearity. New York: John Wiley & Sons.

Berk, R. A. (2003). Regression analysis: A constructive critique. Thousand Oaks, CA: Sage Publications.

Berry, W. D. (1993). Understanding regression assumptions. Newbury Park, CA: Sage Publications.

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Cook, R. D., & Weisberg, S. (1982). Residuals and influence in regression. New York: Chapman and Hall.

Fox, J. (1991). Regression diagnostics. Newbury Park, CA: Sage Publications.

Fox, J. (2000a). Nonparametric simple regression: Smoothing scatterplots. Thousand Oaks, CA: Sage Publications.

Fox, J. (2000b). Multiple and generalized nonparametric regression. Thousand Oaks, CA: Sage Publications.

Hardy, M. A. (1993). Regression with dummy variables. Newbury Park, CA: Sage Publications.

Jaccard, J. (2001). Interaction effects in logistic regression. Thousand Oaks, CA: Sage Publications.

Kahane, L. H. (2001). Regression basics. Thousand Oaks, CA: Sage Publications.

Long, J. S. (1997). Regression models for categorical and limited dependent variables. Thousand Oaks, CA: Sage Publications.

Miles, J., & Shevlin, M. (2001). Applying regression and correlation: A guide for students and researchers. Thousand Oaks, CA: Sage Publications.

Pedhazur, E. J. (1997). Multiple regression in behavioral research (3rd ed.). Fort Worth, TX: Harcourt Brace.

Schroeder, L. D., Sjoquist, D. L., & Stephan, P. E. (1986). Understanding regression analysis: An introductory guide. Newbury Park, CA: Sage Publications.

Serlin, R. C., & Harwell, M. R. (2004). More powerful tests of predictor subsets in regression analysis under nonnormality. Psychological Methods, 9(4), 492-509.
