In Applied Linear Statistical Models (Kutner, Nachtsheim, Neter, Li) one reads the following on the coefficient of partial determination:

A coefficient of partial determination can be interpreted as a coefficient of simple determination. Consider a multiple regression model with two $X$ variables. Suppose we regress $Y$ on $X_2$ and obtain the residuals: $$e_i(Y|X_2) = Y_i - \hat Y_i(X_2)$$ where $\hat Y_i(X_2)$ denotes the fitted values of $Y$ when $X_2$ is in the model. Suppose we further regress $X_1$ on $X_2$ and obtain the residuals: $$e_i(X_1|X_2) = X_{i1} - \hat X_{i1}(X_2)$$ where $\hat X_{i1}(X_2)$ denotes the fitted values of $X_1$ in the regression of $X_1$ on $X_2$. The coefficient of simple determination $R^2$ between these two sets of residuals equals the coefficient of partial determination $R^2_{Y1|2}$. Thus this coefficient measures the relation between $Y$ and $X_1$ when both of these variables have been adjusted for their linear relationships to $X_2$.

Where $$R^2_{Y1|2} = \frac{\text{SSR}(X_1|X_2)}{\text{SSE}(X_2)} = \frac{\text{SSE}(X_2) - \text{SSE}(X_1,X_2)}{\text{SSE}(X_2)}$$

But I don't understand what they mean. I interpret $R^2_{Y1|2}$ as the relative decrease in $\text{SSE}$ when $X_1$ is added to the model (already containing $X_2$).
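
That reading follows directly from rearranging the definition above: $$R^2_{Y1|2} = \frac{\text{SSE}(X_2) - \text{SSE}(X_1,X_2)}{\text{SSE}(X_2)} = 1 - \frac{\text{SSE}(X_1,X_2)}{\text{SSE}(X_2)},$$ i.e. the proportional reduction in $\text{SSE}$ when $X_1$ is added to a model that already contains $X_2$.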

I've tried to implement the statement in R:

set.seed(1)
n <- 100
e <- rnorm(n, 0, 10)
beta0 <- 30
beta1 <- 3
beta2 <- 7
x1 <- rnorm(n, 10, 4)
x2 <- rnorm(n, 2, 5)

y <- beta0 + beta1*x1 + beta2*x2 + e   # simulate from the true model
SSTO <- sum((y - mean(y))^2)           # total sum of squares

fit.Full <- lm(y ~ x1 + x2)
summary(fit.Full)
R.squared.Full <- summary(fit.Full)$r.squared
SSE.x1x2 <- SSTO * (1 - R.squared.Full)   # SSE(X1,X2), via R^2 = 1 - SSE/SSTO

fit.reduced <- lm(y ~ x2)
summary(fit.reduced)
res1 <- y - as.vector(predict.lm(fit.reduced))   # e(Y | X2)
SSE.x2 <- sum(res1^2)                            # SSE(X2)

R.squared.partial <- 1 - SSE.x1x2/SSE.x2
## = 0.6203542

## Regressing X1 on X2
fit.x <- lm(x1 ~ x2)
res2 <- x1 - as.vector(predict.lm(fit.x))   # e(X1 | X2)

plot(res1 ~ res2)

fit.res <- lm(res1 ~ res2)   # simple regression of one set of residuals on the other

summary(fit.res)
## and indeed R^2 = 0.6204

Indeed, those values are equal!
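
As an extra check (a quick sketch on my side; the row and column names assume the default output of base R's anova() on an lm fit), the same quantity can also be read off the sequential ANOVA table when x2 is entered before x1:

fit.seq <- lm(y ~ x2 + x1)                  # x2 first, so the x1 row carries SSR(X1|X2)
tab <- anova(fit.seq)                       # sequential (Type I) sums of squares
SSR.x1.given.x2 <- tab["x1", "Sum Sq"]      # SSR(X1|X2)
SSE.full <- tab["Residuals", "Sum Sq"]      # SSE(X1,X2)
SSR.x1.given.x2 / (SSR.x1.given.x2 + SSE.full)   # = SSR(X1|X2)/SSE(X2)
## matches R.squared.partial above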

But why? And how should I interpret this?

Something like "62.04% of the variation in the residuals of the reduced model is explained by ..."?

dietervdf
  • I believe your question might be fully answered at http://stats.stackexchange.com/questions/17336. – whuber Jan 10 '17 at 20:40
  • Also, from the citation it looks like the coefficient of partial determination is logically similar to the squared coefficient of partial correlation. So you might want to read http://stats.stackexchange.com/q/76815/3277. – ttnphns Jan 10 '17 at 21:13

0 Answers