3

I know I can calculate partial or semi-partial correlations to measure the relative contribution of multiple independent variables to the variation in a dependent variable. However, if I only have access to standardized regression coefficients, how can I use these to measure the relative contribution of each independent variable?

Would it be sufficient to square the standardized regression coefficients, and then claim that this accounts for each variable's share of the variation in the dependent variable?

Ideally I'd like to get to the point where I can claim that "IV X explains Z% of the variation in DV Y".

histelheim
  • Do you only have (estimated) coefficients from one model including all of the independent variables? Can you exclude one or more of the IVs and re-calculate coefficients? – vafisher Feb 26 '14 at 16:25

3 Answers

3

@AndrewCassidy's comment describes the usual method of comparing the effects of two variables on a dependent variable -- compare the changes in $R^2$ after dropping each independent variable from the model. If you're an R user, you can do this easily and conveniently using the lm.sumSquares function from the lmSupport package.
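For example, here is a minimal base-R sketch of that drop-one comparison (the mtcars variables are purely illustrative placeholders; lm.sumSquares wraps this kind of computation more conveniently):

full <- lm(mpg ~ wt + hp + qsec, data = mtcars)   # mtcars used only for illustration
r2_full <- summary(full)$r.squared

# change in R^2 when each predictor is dropped in turn
sapply(c("wt", "hp", "qsec"), function(v) {
  reduced <- update(full, as.formula(paste(". ~ . -", v)))
  r2_full - summary(reduced)$r.squared
})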

However, I also wish to caution you against the use of change in $R^2$ as a measure of the relative importance of two or more variables. The first issue is that, of course, $R^2$ tells you nothing about the relative practical or causal importance of two or more variables -- practical or causal importance must be evaluated based on theoretical knowledge of the variables themselves.

Unfortunately, change in $R^2$ is not even all that useful for measuring the relative predictive importance of two or more variables. Changes in $R^2$ are a function of both the variance in the independent variable(s) and the variance in the dependent variable; thus, a small change in $R^2$ for a given independent variable may indicate either that the variable is unimportant for predicting your dependent variable or that you have a restricted range on your independent variable. For more information on this general phenomenon, I would direct you to this excellent answer by @whuber.
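To make the restricted-range point concrete, here is a small simulation (the setup is my own illustration, not part of the original answer): the same true coefficient yields a much smaller $R^2$ when the predictor's range is restricted.

set.seed(1)
x_full  <- rnorm(1000)                   # predictor observed over its full range
x_restr <- x_full[abs(x_full) < 0.5]     # same predictor, artificially restricted range
y_full  <- x_full  + rnorm(length(x_full))
y_restr <- x_restr + rnorm(length(x_restr))
summary(lm(y_full ~ x_full))$r.squared    # roughly 0.5
summary(lm(y_restr ~ x_restr))$r.squared  # far smaller, despite the same true effect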

Patrick S. Forscher
3

I would take the standardized coefficients as measures of the variables' contributions to the model. These coefficients already account for the scales of the variables: the magnitude of a coefficient tells you how much the dependent variable changes when the independent variable changes by one standard deviation. That standard deviation is itself a measure of the variability of the independent variable, which is important, because by "contribution to the model" we mean not only the sensitivity of the output to changes in the input but also the variability of the inputs themselves.
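As a rough sketch of this point (the data and variable names are invented for illustration), standardized coefficients can be obtained by scaling every variable before fitting, and they fold in each predictor's own standard deviation:

set.seed(1)
x1 <- rnorm(200, sd = 1)    # low-variability predictor
x2 <- rnorm(200, sd = 5)    # high-variability predictor
y  <- 2 * x1 + 1 * x2 + rnorm(200)
coef(lm(scale(y) ~ scale(x1) + scale(x2)))              # standardized coefficients
coef(lm(y ~ x1 + x2))[-1] * c(sd(x1), sd(x2)) / sd(y)   # same values from raw coefficients

Even though the raw coefficient on x2 is smaller, its standardized coefficient is larger, because x2 itself varies more.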

As a side note, I'd be careful with using the abbreviation "IV", because in econometrics it is reserved for the instrumental variables concept.

Aksakal
1

Basically you're looking for some measure of feature importance? Please correct me if I'm wrong.

Let's say you have a basic regression:

$Y = B_0 + B_1 X_1 + B_2 X_2 + \dots$

The sign of each coefficient ($B_1$, $B_2$, ...) tells you the direction of the effect of $X_i$ on $Y$. The size of the coefficients is harder to interpret: an increase of $\Delta$ in $X_i$ (assuming it is numerical) translates into a change of $\Delta \cdot B_i$ in $Y$. A better approach I've used before is to drop the $X_i$ of interest from the regression equation and then measure the change in AIC or $R^2$ for the entire model. This quantifies the explanatory power of $X_i$ on $Y$ holding all other variables constant. Combined with the sign of the coefficient, this can give you a lot of insight.

Here is an example using code in R:

set.seed(1)                        # for reproducibility
a <- rnorm(200)
b <- rnorm(200)
c <- a + b                         # c is fully determined by a and b
df <- as.data.frame(cbind(c, a, b))
# change in R^2 from dropping b: full model minus reduced model
summary(lm(c ~ a + b, df))$r.squared - summary(lm(c ~ a, df))$r.squared

The difference should be close to 0.5, or 50%, since a and b each account for about half of the variance in c.

Andrew Cassidy
  • The issue at hand is how to compare the standardized regression coefficients with respect to the amount of variation they account for in the DV. Ideally I'd like to get to the point where I can claim that "IV X explains Z% of the variation in DV Y". – histelheim Feb 26 '14 at 14:55
  • 1
    $R^2$ is a measure of the amount of variation explained by the combination of all the independent variables. Dropping one of the independent variables and measuring the change in $R^2$ would tell you: variable IV X (that I dropped from the regression equation) explains $\Delta R^2$ of the variation. – Andrew Cassidy Feb 26 '14 at 14:59