statsandstuff a blog on statistics and machine learning

Interpreting regression coefficients

Suppose we regress a response $Y$ on covariates $X_j$ for $j = 1, \dots, p$.  In linear regression, we get the model

$$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p.$$

How do we interpret $\beta_j = 0$?  Does it mean that the $j$th covariate is uncorrelated with the response? The answer is no!  It means the $j$th covariate is uncorrelated with the response after we control for the effects of the other covariates.  A neat way to see this is to note the following way to compute the coefficient $\beta_j$.  For notational convenience, we assume $j = 1$.

1. Regress $Y$ against the covariates $X_2, \dots, X_p$, and compute the residuals.  These residuals describe the part of the response $Y$ not explained by regression on the covariates $X_2, \dots, X_p$.
2. Regress $X_1$ against the covariates $X_2, \dots, X_p$, and get the residuals.  These residuals describe the part of the regressor $X_1$ not explained by the covariates $X_2, \dots, X_p$.
3. Form an added-variable plot for $X_1$ after $X_2, \dots, X_p$ by plotting the residuals from step 1 against the residuals from step 2.  The slope of the regression line in the added-variable plot, which describes the relation between $Y$ and $X_1$ after controlling for the other covariates, is equal to the coefficient $\beta_1$.

For a concrete example, suppose we regress a person’s income against their height and age, and find that $\beta_{\text{height}}$ is not significantly different from 0.  We should interpret this as saying that there is no evidence of a linear relationship between income and height once we adjust for age.
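
The residual-on-residual recipe above is easy to check numerically. The following is a minimal sketch, not part of the original post: it assumes NumPy, uses synthetic data with made-up coefficients, and the helper name `ols` is hypothetical. It fits the full regression, then repeats steps 1 to 3 and compares the two estimates of $\beta_1$.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 500

# Synthetic data (an assumption for illustration): x1 is correlated with x2.
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + 0.5 * x3 + rng.normal(size=n)


def ols(design, target):
    """Least-squares coefficients of target on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


ones = np.ones(n)

# Full regression of y on an intercept, x1, x2 and x3.
X_full = np.column_stack([ones, x1, x2, x3])
beta_full = ols(X_full, y)

# Step 1: residuals of y after regressing on the other covariates.
X_other = np.column_stack([ones, x2, x3])
resid_y = y - X_other @ ols(X_other, y)

# Step 2: residuals of x1 after regressing on the other covariates.
resid_x1 = x1 - X_other @ ols(X_other, x1)

# Step 3: slope of the line in the added-variable plot. Both residual
# vectors have mean zero (the intercept was included), so the fitted
# line passes through the origin and the slope has this simple form.
slope = (resid_x1 @ resid_y) / (resid_x1 @ resid_x1)

print("beta_1 from the full regression:", beta_full[1])
print("slope of the added-variable plot:", slope)
```

The two printed numbers should agree up to floating-point error, whatever data is used; the agreement is an algebraic property of least squares rather than a feature of this particular simulation.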