I'm trying to determine how the population $R^2$ value is defined in the multivariate regression model where we have
$Y_i = \mu_y + B^\prime(X_i - \mu_x) + err$
Where $Y_i \in \mathbb{R}^q$ and $X_i \in \mathbb{R}^p$ and $q, p>1$
For all practical purposes, a regression like that amounts to $q$ separate multiple regressions, one per response variable. You'd end up with $q$ $R^2$ values.
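A minimal sketch of that equation-by-equation view (all data here is simulated for illustration; the least-squares fit on centred variables matches the $\mu_y + B^\prime(X_i - \mu_x)$ parameterisation):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 200, 3, 2

# simulated data: Y = X B + noise (purely illustrative)
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=(p, q)) + rng.normal(size=(n, q))

# centre both sides, matching mu_y + B'(X_i - mu_x)
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

# multivariate least squares is just q stacked multiple regressions
B_hat, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
resid = Yc - Xc @ B_hat

# one R^2 per response column: 1 - SSE_j / SST_j
r2 = 1 - (resid ** 2).sum(axis=0) / (Yc ** 2).sum(axis=0)
print(r2)  # array of q values
```

Fitting all $q$ responses in one `lstsq` call gives exactly the same coefficients as running each regression separately, since the columns of $Y$ share the design matrix.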
Another way of looking at it, however, is to note that, for multiple linear regression, $R^2$ can be written as a comparison between deviances. See this answer.
If you specify your model as a conditional multivariate Gaussian likelihood, then you can compute a single $R^2$ (quoting from the aforementioned answer):
$$\begin{aligned} \text{Null deviance:} \quad & D_{TOT} = 2(\hat{\ell}_{S} - \hat{\ell}_0), \\[6pt] \text{Explained deviance:} \quad & D_{REG} = 2(\hat{\ell}_{p} - \hat{\ell}_0), \\[6pt] \text{Residual deviance:} \quad & D_{RES} = 2(\hat{\ell}_{S} - \hat{\ell}_{p}). \end{aligned}$$ In these expressions the value $\hat{\ell}_S$ is the maximised log-likelihood under a saturated model (one parameter per data point), $\hat{\ell}_0$ is the maximised log-likelihood under a null model (intercept only), and $\hat{\ell}_{p}$ is the maximised log-likelihood under the model (intercept term and $p$ coefficients).
In your case: $y_{q\times 1} \sim \mathrm{MVN}(\beta_{q\times p}\, x_{p \times 1},\; \Sigma_{q\times q})$, where $\beta = B^\prime$ once the variables are centred. Let $\hat y = E[y \mid x] = \beta x$. Let $Y$ be the concatenation of all $y$, and similarly for $X$. You can derive the maximum likelihood estimates and compute the following:
$$R^2 = 1-\frac{D_{RES}}{D_{TOT}} = 1-\frac{(\hat{\ell}_{S} - \hat{\ell}_{p})}{(\hat{\ell}_{S} - \hat{\ell}_0)}$$
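A sketch of that single-$R^2$ computation. One caveat: under a fully unrestricted saturated Gaussian model $\hat{\ell}_S$ diverges, so this sketch fixes $\Sigma$ at the residual-covariance MLE from the fitted model; the $\log\lvert 2\pi\Sigma\rvert$ terms then cancel in each deviance difference, leaving Mahalanobis-type sums of squares (all data simulated, names illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 200, 3, 2

# simulated data (illustrative only)
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=(p, q)) + rng.normal(scale=0.5, size=(n, q))

# centre, then fit the multivariate least-squares / MLE mean
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
B_hat, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
resid = Yc - Xc @ B_hat

# fix Sigma at the MLE residual covariance so l_S stays finite
Sigma = resid.T @ resid / n
Sinv = np.linalg.inv(Sigma)

# D_TOT = 2(l_S - l_0) and D_RES = 2(l_S - l_p) reduce to
# Sigma^{-1}-weighted sums of squares after the log-det terms cancel
D_tot = np.einsum('ij,jk,ik->', Yc, Sinv, Yc)        # null deviance
D_res = np.einsum('ij,jk,ik->', resid, Sinv, resid)  # residual deviance

R2 = 1 - D_res / D_tot
print(R2)  # single scalar R^2 for the whole multivariate fit
```

With this choice of $\Sigma$, $D_{RES}$ collapses to $nq$, so the statistic compares the pooled, correlation-adjusted total variation against the residual one; other choices of $\Sigma$ (e.g. the null-model covariance) would give a slightly different scalar.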