
I have found myself Googling this question more than once: what is the relationship between the standardized multiple regression coefficient (the standardized partial slope) and the corresponding semi-partial correlation (also called the part correlation)? The answers I find are not helpful, because they are always restricted to multiple regression models with at most two predictors.

Note: The following similar question (Multiple regression or partial correlation coefficient? And relations between the two) addresses partial correlations and this question (Is there a difference between semipartial correlation and regression coefficient in multiple regression?) focuses only on the trivariate model (as indicated above).

So, I will start here with a summary of the two straightforward models.

For bivariate regression $$y=\alpha + \beta x + \epsilon$$ the standardized slope, $\beta^*$, equals the correlation between $x$ and $y$: $$\beta^* = r_{x,y}.$$
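As a quick sanity check (my own simulated example, not part of the original question), the standardized slope from a simple least-squares fit should reproduce $r_{x,y}$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 2.0 + 0.7 * x + rng.normal(size=500)

# Standardize both variables, then regress y on x; the slope is beta*.
xs = (x - x.mean()) / x.std(ddof=1)
ys = (y - y.mean()) / y.std(ddof=1)
beta_std = np.linalg.lstsq(np.column_stack([np.ones_like(xs), xs]), ys, rcond=None)[0][1]

print(beta_std, np.corrcoef(x, y)[0, 1])  # the two numbers agree
```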

For trivariate regression $$y = \alpha + \beta x + \gamma z + \epsilon$$ the standardized partial slope for $x$ can be written as $$\beta^* = \frac{r_{y,x} - r_{y,z}\,r_{x,z}}{1-r_{x,z}^2}.$$ Combined with the fact that the semi-partial correlation $r_{y(x|z)}$ is given by $$r_{y(x|z)} = \frac{r_{y,x} - r_{y,z}\,r_{x,z}}{\sqrt{1-r_{x,z}^2}},$$ this yields the following relationship between the standardized slope and the semi-partial correlation: $$\beta^* = \frac{r_{y(x|z)}}{\sqrt{1-r_{x,z}^2}}.$$
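A minimal numerical sketch of this trivariate identity, using simulated data and variable names of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)
y = 1.0 + 0.8 * x - 0.3 * z + rng.normal(size=n)

def std(v):
    return (v - v.mean()) / v.std(ddof=1)

# Standardized partial slope for x from the two-predictor fit
X = np.column_stack([np.ones(n), std(x), std(z)])
beta_star = np.linalg.lstsq(X, std(y), rcond=None)[0][1]

# Semi-partial correlation from the correlation formula above
r_yx = np.corrcoef(y, x)[0, 1]
r_yz = np.corrcoef(y, z)[0, 1]
r_xz = np.corrcoef(x, z)[0, 1]
semi_partial = (r_yx - r_yz * r_xz) / np.sqrt(1 - r_xz**2)

print(beta_star, semi_partial / np.sqrt(1 - r_xz**2))  # should match
```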

Now consider multiple regression models with 3 or more predictors, $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon,$$ where $k>2$. My question is whether the following formulation is correct for such models: $$\beta_i^* = \frac{r_{y(i|\langle i\rangle)}}{\sqrt{1-R_{i,\langle i\rangle}^2}},$$ where $r_{y(i|\langle i\rangle)}$ is the semi-partial correlation between $y$ and the $i$-th predictor, with all other predictors partialled out of the $i$-th predictor only, and $R_{i,\langle i\rangle}^2$ is the coefficient of determination obtained from predicting the $i$-th predictor from all of the other predictors in the model. (The notation $\langle i\rangle$ denotes the set of all predictors except the $i$-th.)
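Here is a hedged numerical check of the conjectured identity for $k=4$ predictors; the simulated data, the choice of $i$, and the helper computations are my own assumptions for illustration, not a proof:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 20_000, 4
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, k))   # correlated predictors
y = X @ np.array([0.4, -0.2, 0.7, 0.1]) + rng.normal(size=n)

def standardize(v):
    return (v - v.mean(axis=0)) / v.std(axis=0, ddof=1)

# Standardized slopes (predictors and response standardized, so no intercept needed)
beta_star = np.linalg.lstsq(standardize(X), standardize(y), rcond=None)[0]

i = 2                                        # check the identity for the 3rd predictor
others = [j for j in range(k) if j != i]

# Residualize x_i on the other predictors (with intercept) -> R^2_{i,<i>}
A = np.column_stack([np.ones(n), X[:, others]])
resid_i = X[:, i] - A @ np.linalg.lstsq(A, X[:, i], rcond=None)[0]
R2_i = 1 - resid_i.var(ddof=1) / X[:, i].var(ddof=1)

# Semi-partial correlation r_{y(i|<i>)} = corr(y, residualized x_i)
r_semi = np.corrcoef(y, resid_i)[0, 1]

print(beta_star[i], r_semi / np.sqrt(1 - R2_i))   # the two should agree
```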

I am fairly confident this is correct, but I have not located a reference to confirm it. So, my questions are: (1) Is this correct? (2) Can someone provide a proof? Or (3) can someone provide easily accessible references?

Gregg H
  • It's been a while since you've asked this question: have you been able to locate an answer in the meantime? – Anthony May 20 '21 at 11:28
  • I have been playing with a couple of numerical examples, and your formula appears to be correct. – Anthony May 20 '21 at 12:31
  • I have not located any more information, and I haven't had the time to work out the math either...but I have been operating under the assumption that my rationale is correct. – Gregg H May 20 '21 at 15:49

1 Answer


This paper describes an indirect way to convert between standardised $\beta$'s and partial and semi-partial correlations via the $t$-statistic.

To wit, in a regression with $n$ samples and $p$ predictors, the partial correlation for a predictor $X_f$ is given by $$ r_{\text{par}} = \frac{t_f}{\sqrt{t_f^2 + \text{df}}}\ , $$ where $t_f$ is the $t$-statistic of the regression coefficient $\beta_f$ and $\text{df}=n-p-1$. The semi-partial correlation, on the other hand, is given by $$ r_{\text{s-par}} = \frac{t_f\sqrt{1-R^2}}{\sqrt{\text{df}}}\ , $$ where $R^2$ is the coefficient of determination of the full model, and the rest of the notation is as above.
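A small simulated check of these conversions (the data and the use of statsmodels are my own assumptions, just to illustrate the two formulas):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, p = 5_000, 3
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))   # correlated predictors
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(X)).fit()
t_f = fit.tvalues[1]          # t-statistic for the first predictor's coefficient
df = n - p - 1

r_partial = t_f / np.sqrt(t_f**2 + df)
r_semipartial = t_f * np.sqrt(1 - fit.rsquared) / np.sqrt(df)
print(r_partial, r_semipartial)
```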

Anthony