
I don't even know if this question makes sense, but what is the difference between multiple regression and partial correlation (apart from the obvious differences between correlation and regression, which are not what I am aiming at)?

I want to figure out the following:
I have two independent variables ($x_1$, $x_2$) and one dependent variable ($y$). Now, individually the independent variables are not correlated with the dependent variable. But for a given $x_1$, $y$ decreases when $x_2$ decreases. So do I analyze that by means of multiple regression or partial correlation?

Edit, to hopefully improve my question: I am trying to understand the difference between multiple regression and partial correlation. So, when $y$ decreases for a given $x_1$ as $x_2$ decreases, is that due to the combined effect of $x_1$ and $x_2$ on $y$ (multiple regression), or is it due to removing the effect of $x_1$ (partial correlation)?

user34927

2 Answers


The multiple linear regression coefficient and the partial correlation are directly linked and have the same significance (p-value). The partial $r$ is just another way of standardizing the coefficient, alongside the beta coefficient (the standardized regression coefficient)$^1$. So, if the dependent variable is $y$ and the independent variables are $x_1$ and $x_2$, then

$$\text{Beta:} \quad \beta_{x_1} = \frac{r_{yx_1} - r_{yx_2}r_{x_1x_2} }{1-r_{x_1x_2}^2}$$

$$\text{Partial r:} \quad r_{yx_1.x_2} = \frac{r_{yx_1} - r_{yx_2}r_{x_1x_2} }{\sqrt{ (1-r_{yx_2}^2)(1-r_{x_1x_2}^2) }}$$
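Here is a minimal numerical sketch of these two formulas (Python with NumPy; the simulated data and variable names are illustrative, not part of the original answer). It computes beta and partial $r$ from the three pairwise correlations and cross-checks the beta against an ordinary least-squares fit on z-standardized data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)            # two correlated predictors
y  = 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)

r = lambda a, b: np.corrcoef(a, b)[0, 1]
r_yx1, r_yx2, r_x1x2 = r(y, x1), r(y, x2), r(x1, x2)

beta_x1   = (r_yx1 - r_yx2 * r_x1x2) / (1 - r_x1x2**2)
partial_1 = (r_yx1 - r_yx2 * r_x1x2) / np.sqrt((1 - r_yx2**2) * (1 - r_x1x2**2))

# cross-check beta against an OLS fit on the z-standardized variables
z = lambda v: (v - v.mean()) / v.std()
beta_ols = np.linalg.lstsq(np.column_stack([z(x1), z(x2)]), z(y), rcond=None)[0]

print(beta_x1, beta_ols[0])   # the formula reproduces the fitted standardized coefficient
print(partial_1)              # the partial correlation r_{yx1.x2}
```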

You see that the numerators are the same, which tells us that both formulas measure the same unique effect of $x_1$. I will try to explain in what way the two formulas are structurally similar and in what way they are not.

Suppose that you have z-standardized (mean 0, variance 1) all three variables. The numerator is then equal to the covariance between two kinds of residuals: (a) the residuals left in predicting $y$ by $x_2$ [both variables standardized] and (b) the residuals left in predicting $x_1$ by $x_2$ [both variables standardized]. Moreover, the variance of the residuals (a) is $1-r_{yx_2}^2$; the variance of the residuals (b) is $1-r_{x_1x_2}^2$.

The formula for the partial correlation is then clearly seen to be the formula for a plain Pearson $r$, computed in this instance between residuals (a) and residuals (b): Pearson $r$, we know, is the covariance divided by a denominator that is the geometric mean of two different variances.

The standardized coefficient beta is structurally like a Pearson $r$, only that the denominator is the geometric mean of a variance with itself: the variance of residuals (a) is not used; it is replaced by a second copy of the variance of residuals (b). Beta is thus the covariance of the two residuals relative to the variance of one of them (specifically, the one pertaining to the predictor of interest, $x_1$), while the partial correlation, as already noticed, is that same covariance relative to their hybrid (geometric-mean) variance. Both types of coefficient are ways to standardize the effect of $x_1$ in the milieu of the other predictors.
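As a sanity check on this residual interpretation, here is a small sketch (Python/NumPy; the simulated data are mine, not part of the original answer) that builds the two residual vectors explicitly and confirms each identity stated above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.7 * x2 + rng.normal(size=n)
y  = 0.4 * x1 - 0.5 * x2 + rng.normal(size=n)

z = lambda v: (v - v.mean()) / v.std()        # z-standardize: mean 0, variance 1
y, x1, x2 = z(y), z(x1), z(x2)

r_yx1, r_yx2, r_x1x2 = np.mean(y * x1), np.mean(y * x2), np.mean(x1 * x2)

e_a = y  - r_yx2  * x2      # residuals (a): y predicted by x2 (slope = r because data are standardized)
e_b = x1 - r_x1x2 * x2      # residuals (b): x1 predicted by x2

print(np.mean(e_a * e_b), r_yx1 - r_yx2 * r_x1x2)   # covariance of residuals = the shared numerator
print(np.var(e_a), 1 - r_yx2**2)                    # variance of residuals (a)
print(np.var(e_b), 1 - r_x1x2**2)                   # variance of residuals (b)
print(np.mean(e_a * e_b) / np.sqrt(np.var(e_a) * np.var(e_b)))  # = partial r
print(np.mean(e_a * e_b) / np.var(e_b))                         # = beta for x1
```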

Some numerical consequences of the difference. If the R-square of the multiple regression of $y$ on $x_1$ and $x_2$ happens to be 1, then both partial correlations of the predictors with the dependent variable will also be 1 in absolute value (but the betas will generally not be 1). Indeed, as said before, $r_{yx_1.x_2}$ is the correlation between the residuals of y <- x2 and the residuals of x1 <- x2. If what is not $x_2$ within $y$ is exactly what is not $x_2$ within $x_1$, then there is nothing within $y$ that is neither $x_1$ nor $x_2$: a complete fit. Whatever the amount of the portion of $y$ left unexplained by $x_2$ (the $1-r_{yx_2}^2$), if it is captured relatively highly by the independent portion of $x_1$ (the $1-r_{x_1x_2}^2$), $r_{yx_1.x_2}$ will be high. $\beta_{x_1}$, on the other hand, will be high only if the unexplained portion of $y$ being captured is itself a substantial share of $y$.
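To illustrate the R-square = 1 case numerically, here is a sketch with made-up coefficients (not from the original answer), in which $y$ is an exact linear combination of $x_1$ and $x_2$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
y  = 0.2 * x1 + 3.0 * x2              # exact linear combination of x1 and x2: R-square = 1

r = lambda a, b: np.corrcoef(a, b)[0, 1]
r_yx1, r_yx2, r_x1x2 = r(y, x1), r(y, x2), r(x1, x2)

partial_1 = (r_yx1 - r_yx2 * r_x1x2) / np.sqrt((1 - r_yx2**2) * (1 - r_x1x2**2))
beta_x1   = (r_yx1 - r_yx2 * r_x1x2) / (1 - r_x1x2**2)

print(partial_1)   # 1.0: what is left of y after x2 is exactly what is left of x1 after x2
print(beta_x1)     # clearly not 1: x1's unique contribution is only a small share of y
```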


From the above formulas one obtains (extending from 2-predictor regression to a regression with an arbitrary number of predictors $x_1, x_2, x_3, \ldots$) the conversion formula between beta and the corresponding partial $r$:

$$r_{yx_1.X} = \beta_{x_1} \sqrt{ \frac {\text{var} (e_{x_1 \leftarrow X})} {\text{var} (e_{y \leftarrow X})}},$$

where $X$ stands for the collection of all predictors except the current one ($x_1$); $e_{y \leftarrow X}$ are the residuals from regressing $y$ on $X$, and $e_{x_1 \leftarrow X}$ are the residuals from regressing $x_1$ on $X$; the variables in both of these regressions enter standardized.
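A sketch of this conversion with three predictors (Python/NumPy; simulated data and the helper `residualize` are my own, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=(n, 3))
X[:, 1] += 0.5 * X[:, 0]                      # make the predictors correlated
X[:, 2] += 0.3 * X[:, 0] + 0.4 * X[:, 1]
y = X @ np.array([0.4, -0.2, 0.7]) + rng.normal(size=n)

z = lambda v: (v - v.mean(axis=0)) / v.std(axis=0)
Zx, zy = z(X), z(y)

beta = np.linalg.lstsq(Zx, zy, rcond=None)[0]        # standardized coefficients of the full model

j = 0                                                 # predictor of interest, x1
others = [k for k in range(Zx.shape[1]) if k != j]

def residualize(target, preds):
    """Residuals of an OLS regression of `target` on `preds` (variables already standardized/centered)."""
    coef = np.linalg.lstsq(preds, target, rcond=None)[0]
    return target - preds @ coef

e_y  = residualize(zy, Zx[:, others])                 # e_{y <- X}
e_x1 = residualize(Zx[:, j], Zx[:, others])           # e_{x1 <- X}

print(np.corrcoef(e_y, e_x1)[0, 1])                   # partial r, by its definition
print(beta[j] * np.sqrt(e_x1.var() / e_y.var()))      # partial r, by the conversion formula
```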

Note: if we need to compute the partial correlations of $y$ with every predictor $x$, we usually won't use this formula, which requires two additional regressions per predictor. Rather, sweep operations (often used in stepwise and all-subsets regression algorithms) will be performed, or the anti-image correlation matrix will be computed.
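Neither the sweep operator nor the anti-image matrix is shown here, but a closely related shortcut is the inverse of the full correlation matrix, whose rescaled off-diagonal elements are (minus) the partial correlations given all remaining variables. A small sketch (Python/NumPy, simulated data):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
X = rng.normal(size=(n, 3))
X[:, 1] += 0.5 * X[:, 0]
X[:, 2] += 0.4 * X[:, 1]
y = X @ np.array([0.4, -0.2, 0.7]) + rng.normal(size=n)

data = np.column_stack([y, X])                # y in column 0, predictors after it
R = np.corrcoef(data, rowvar=False)           # full correlation matrix
P = np.linalg.inv(R)                          # inverse ("precision") of the correlation matrix

# partial correlation of variable i with variable j, controlling for all the others:
#   r_{ij.rest} = -P[i, j] / sqrt(P[i, i] * P[j, j])
partials = [-P[0, j] / np.sqrt(P[0, 0] * P[j, j]) for j in range(1, P.shape[0])]
print(partials)                               # partial r of y with each predictor
```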


$^1$ $\beta_{x_1} = b_{x_1} \frac {\sigma_{x_1}}{\sigma_y}$ is the relation between the raw $b$ and the standardized $\beta$ coefficients in regression with intercept.
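A quick numerical check of this $b$-to-$\beta$ relation (a sketch with simulated data, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y  = 2.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)

# raw regression with intercept
b = np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y, rcond=None)[0]

# standardized regression (no intercept needed once everything is z-scored)
z = lambda v: (v - v.mean()) / v.std()
beta = np.linalg.lstsq(np.column_stack([z(x1), z(x2)]), z(y), rcond=None)[0]

print(b[1] * x1.std() / y.std(), beta[0])     # beta_{x1} = b_{x1} * sigma_{x1} / sigma_y
```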


Addendum. Geometry of regression $\beta$ and partial $r$.

In the picture below, a linear regression with two correlated predictors, $X_1$ and $X_2$, is shown. The three variables, including the dependent $Y$, are drawn as vectors (arrows). This way of display differs from the usual scatterplot (aka variable-space display) and is called a subject-space display. (You may encounter similar drawings in a number of other threads on this site.)

[Figure: $Y$, $X_1$ and $X_2$ drawn as vectors in subject space; $Y'$ is the projection of $Y$ onto "plane X" spanned by the predictors, and $e$ is the error vector.]

The pictures are drawn after all three variables have been centered, and so (1) every vector's length equals the standard deviation of the respective variable, and (2) the angle (its cosine) between every two vectors equals the correlation between the respective variables.

$Y'$ is the regression prediction (orthogonal projection of $Y$ onto "plane X" spanned by the regressors); $e$ is the error term; $\cos \angle{Y Y'}={|Y'|}/|Y|$ is the multiple correlation coefficient.

The skew (oblique) coordinates of $Y'$ on the predictors $X_1$ and $X_2$ correspond to their multiple regression coefficients: these lengths from the origin are the scaled $b$'s or $\beta$'s. For example, the magnitude of the skew coordinate onto $X_1$ equals $\beta_1\sigma_Y = b_1\sigma_{X_1}$; so, if $Y$ is standardized ($|Y|=1$), that coordinate equals $\beta_1$.
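These geometric claims (the multiple correlation as $|Y'|/|Y|$ and the skew coordinate as $b_1\sigma_{X_1} = \beta_1\sigma_Y$) can be checked numerically; here is a sketch (Python/NumPy, simulated data) treating each centered variable as a vector in $\mathbb{R}^n$ so that length divided by $\sqrt{n}$ is the standard deviation:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y  = 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)

y, x1, x2 = (v - v.mean() for v in (y, x1, x2))    # centered data vectors (subject-space view)

X = np.column_stack([x1, x2])
b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]      # raw regression coefficients
y_hat = b1 * x1 + b2 * x2                          # Y' = projection of Y onto "plane X"

# multiple correlation: cos of the angle between Y and Y' equals |Y'| / |Y|
print(y @ y_hat / (np.linalg.norm(y) * np.linalg.norm(y_hat)),
      np.linalg.norm(y_hat) / np.linalg.norm(y))

# skew coordinate of Y' along X1 is the vector b1 * X1; its length / sqrt(n) = |b1| * sigma_{X1}
beta1 = b1 * x1.std() / y.std()
print(np.linalg.norm(b1 * x1) / np.sqrt(n), abs(b1) * x1.std(), abs(beta1) * y.std())
```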

But how can we get an impression of the corresponding partial correlation $r_{yx_1.x_2}$? To partial $X_2$ out of the other two variables, one has to project them onto the plane orthogonal to $X_2$. Below, on the left, this plane perpendicular to $X_2$ has been drawn. It is shown at the bottom, and not at the level of the origin, simply in order not to clutter the picture. Let's inspect what's going on in that space: put your eye at the bottom of the left picture and look up, with the $X_2$ vector starting right at your eye.

[Figure: left, the same configuration with the plane orthogonal to $X_2$ drawn at the bottom; right, the projections of the vectors onto that plane, with $\alpha$ the angle between the projections of $Y$ and $X_1$.]

All the vectors are now projections. $X_2$ is a point, since the plane was constructed perpendicular to it. From this viewpoint "plane X" appears to us as a horizontal line; therefore, of the four vectors only (the projection of) $Y$ departs from that line.

From this perspective, $r_{yx_1.x_2}$ is $\cos \alpha$, where $\alpha$ is the angle between the projections of $Y$ and of $X_1$ onto the plane orthogonal to $X_2$. So it is very simple to picture.

Note that $r_{yx_1.x_2}=r_{yy'.x_2}$, as both $Y'$ and $X_1$ belong to "plane X".

We can trace the projections in the right picture back to the left one. $Y$ in the right picture is $Y_\perp$ of the left one, which is the residuals of regressing $Y$ on $X_2$. Likewise, $X_1$ in the right picture is $X_{1\perp}$ of the left one, the residuals of regressing $X_1$ on $X_2$. The correlation between these two residual vectors is $r_{yx_1.x_2}$, as we know.
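The same picture in code (a sketch; `project_out` is my own helper, not a standard function): projecting the centered data vectors onto the hyperplane orthogonal to $X_2$ and taking the cosine of the angle between the projections reproduces $r_{yx_1.x_2}$:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
x2 = rng.normal(size=n)
x1 = 0.7 * x2 + rng.normal(size=n)
y  = 0.4 * x1 - 0.5 * x2 + rng.normal(size=n)

y, x1, x2 = (v - v.mean() for v in (y, x1, x2))    # centered data vectors

def project_out(v, u):
    """Component of v orthogonal to u, i.e. the projection of v onto the plane perpendicular to u."""
    return v - (v @ u) / (u @ u) * u

y_perp, x1_perp = project_out(y, x2), project_out(x1, x2)

cos_alpha = y_perp @ x1_perp / (np.linalg.norm(y_perp) * np.linalg.norm(x1_perp))

r = lambda a, b: np.corrcoef(a, b)[0, 1]
partial = (r(y, x1) - r(y, x2) * r(x1, x2)) / np.sqrt((1 - r(y, x2)**2) * (1 - r(x1, x2)**2))

print(cos_alpha, partial)                          # the cosine of the angle is the partial correlation
```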

ttnphns
  • Thank you. But how do I decide which one to go with, e.g. for the purpose described in my question? – user34927 Nov 17 '13 at 21:38
  • Obviously, you are free to choose: the numerators are the same, so they convey _the same_ information. As for your (not fully clarified) question, it seems to be about topics "can regr. coef. be 0 when _r_ isn't 0"; "can regr. coef. be not 0 when _r_ is 0". There's a lot questions about that on the site. Just for example, you might read http://stats.stackexchange.com/q/14234/3277; http://stats.stackexchange.com/q/44279/3277. – ttnphns Nov 17 '13 at 22:00
  • I tried to clarify my question.. – user34927 Nov 17 '13 at 22:44
  • Fixing X1 ("x1 given") = removing (controlling) the effect of X1. There is no such thing as "combined effect" in multiple regression (unless you add the interaction X1*X2). Effects in multiple regression are competitive. Linear regression effects are actually partial correlations. – ttnphns Nov 17 '13 at 23:00
  • So, just to make sure I understood you correctly. If I want to prove that the DV is significantly correlated with one of two IVs if the effect of the other IV is removed, I can use either multiple regression or partial correlation and both lead me to the same conclusion (i.e. DV is low for given IV2 when IV1 is low) and I can show that by giving the partial r OR the Rsquared of the multiple regression? – user34927 Nov 19 '13 at 13:50
  • Once again - please be attentive to reread the answer. Multiple linear regression coefficient _b_ (or its standardized form _beta_) associated with a predictor has the same meaning and the same p-value as the _partial r_ between Y and the predictor. Now, if you got it, why are you speaking of **Rsquared of the regression**? It has nothing to do with partial r of a specific predictor. – ttnphns Nov 19 '13 at 14:12
  • Okay, obviously I do not get it. Thanks anyway. – user34927 Nov 19 '13 at 14:16
  • Wait a bit, @user34927. `to prove that the DV (Y) is significantly correlated with one of two IVs (X1) if the effect of the other IV (X2) is removed` The effect removed from **where**? If you "remove" X2 from both Y and X1 then the corr. between Y and X1 is the _partial_ correlation. If you "remove" X2 from X1 only then the corr. between Y and X1 is called the _part_ (or semi-partial) correlation. Were you really asking about _it_? – ttnphns Nov 19 '13 at 15:02
  • All I know is that individually neither X1 nor X2 are significantly correlated with Y. But I know that Y is determined by _the combination_ of X1 and X2. I was now looking for the correct way to statistically analyze that relationship. But I do not understand if multiple regression or partial correlation is appropriate for that purpose and what the difference between those is in that case. – user34927 Nov 19 '13 at 15:28
  • 1) `significantly` you shouldn't rely on p-value in such problems as yours; the very size of the correlation matters more. 2) You have to define your understanding of "combination" before you can post a question (when you do it, please post a new, another question. Let this old one be as is). – ttnphns Nov 19 '13 at 15:34
  • Individually the Rsquared of X1 and X2 with Y is <0.1 and in a multiple regression of X1, X2 and Y it is 0.4. I might be misinterpreting what this means so it is probably a good idea to let this go. Thanks for your help. – user34927 Nov 19 '13 at 15:49
  • This may happen in suppression http://stats.stackexchange.com/q/73869/3277 – ttnphns Nov 19 '13 at 15:56
  • Why is there a square root in the denominator of the formula for $\beta_{x_1}$? I tried to obtain this formula by applying the standard equation $\beta = (X^\top X)^{-1} X^\top y$ for the case when $X$ has two standardized columns, and I get the same formula but without a square root. Have I made a mistake, or have you made a mistake? :) – amoeba Mar 31 '15 at 14:51
  • @amoeba, thank you very much for noticing that lapse! – ttnphns Mar 31 '15 at 17:28
  • I think this answer would be improved (potentially less confusing to future readers) if it showed the formula for $b$ as well as $\beta$ – Silverfish Sep 23 '17 at 10:15

Just bumped into this thread by chance. A note on the formulas in the original answer: the formula shown there gives the standardized coefficient $\beta_{x_1}$; to obtain the raw (unstandardized) coefficient $b_{x_1}$, one multiplies by the factor $\sqrt{SSY/SSX_1}$, that is $$ b_{x_1} = \frac{r_{yx_1} - r_{y x_2} ~r_{x_1 x_2}} {1-r^2_{x_1 x_2}} \times \sqrt{\frac{SSY}{SSX_1}}, $$ where $SSY=\sum_i (y_i-\bar y)^2$ and $SSX_1 = \sum_i {(x_{1i} - \bar{x}_1)^2}$.
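A quick check of this raw-coefficient formula (a sketch with simulated data, not part of the original post):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y  = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)

r = lambda a, b: np.corrcoef(a, b)[0, 1]
SSY  = np.sum((y  - y.mean()) ** 2)
SSX1 = np.sum((x1 - x1.mean()) ** 2)

b_x1_formula = (r(y, x1) - r(y, x2) * r(x1, x2)) / (1 - r(x1, x2) ** 2) * np.sqrt(SSY / SSX1)
b_x1_ols = np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y, rcond=None)[0][1]

print(b_x1_formula, b_x1_ols)   # identical: the extra factor turns beta into the raw coefficient b
```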

Brani