
I have fitted two regression models. The first is a multiple linear regression (OLS) $$Y=\beta_0+\beta_1X_1+\cdots+\beta_nX_n+e$$ and I can get its $R^2$. The second is a spatial autoregressive model (SAR) $$Y=\rho WY + \beta_0 + \beta_1 X_1 + \cdots+\beta_nX_n+e,$$ where $W$ is the contiguity matrix and $\rho$ is an unknown parameter. This model is estimated by maximum likelihood, but I cannot calculate its $R^2$ and instead have to use the Nagelkerke pseudo-$R^2$. I've found this: "There is no direct equivalent to the OLS R-squared, these models are fitted by maximum likelihood." from http://r-sig-geo.2731867.n2.nabble.com/How-to-calculate-squared-R-of-spatial-autoregressive-models-td5762576.html, but I'd like to know why I cannot calculate $R^2$ for this model if the formula is just $$R^2=1-\frac{\sum(y_i-\hat{y}_i)^2}{\sum(y_i-\overline{y})^2}.$$
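For reference, the Nagelkerke pseudo-$R^2$ I am referring to is computed from log-likelihoods rather than from residuals. A minimal sketch of one common form in Python; the function below is only my own illustration, and the spatial-regression software may use a slightly different variant:

```python
import numpy as np

def nagelkerke_r2(loglik_model, loglik_null, n):
    """Nagelkerke pseudo-R^2 from the log-likelihood of the fitted model
    and of the intercept-only (null) model, for a sample of size n."""
    # Cox-Snell pseudo-R^2: 1 - (L_0 / L_M)^(2/n), written with log-likelihoods.
    cox_snell = 1.0 - np.exp((2.0 / n) * (loglik_null - loglik_model))
    # Nagelkerke rescales Cox-Snell by its maximum attainable value.
    max_cox_snell = 1.0 - np.exp((2.0 / n) * loglik_null)
    return cox_snell / max_cox_snell
```

So the pseudo-$R^2$ only looks at the two likelihoods, never at $\sum(y_i-\hat y_i)^2$, which already suggests it is not the same quantity as the OLS $R^2$.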

  • "There is no direct equivalent" does not mean "you cannot calculate it." The former needs to be interpreted as a warning about the applicability and interpretation of $R^2$. – whuber Jun 28 '19 at 18:13
  • Alright, but why? I can calculate it for both models, but is the interpretation different, or is there really no equivalent? Do you know a paper or book where I could read more about this? Thanks for your answer. – Alexis Galois Jun 28 '19 at 18:42
  • If you have a likelihood, then you can compute the deviance. For a Gaussian likelihood, $1-\text{deviance}/\text{null deviance}$ is exactly the OLS $R^2$. Compare both formulas to understand if and why they differ. – Firebug Aug 15 '21 at 12:39
  • Fit the model and find its predictions. What is $ \sum_i\Big[ (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) \Big] $? If that sum is not zero (or some tiny number close enough to zero for computer arithmetic), then $R^2$ [loses its usual "proportion of variance explained" interpretation](https://stats.stackexchange.com/questions/551915/interpreting-nonlinear-regression-r2). – Dave Dec 16 '21 at 21:19
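To make the last comment concrete, here is a minimal numeric sketch in plain NumPy. The perturbed coefficient vector just stands in for any fit that does not satisfy the OLS normal equations; it is not an actual SAR fit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a small regression problem with an intercept column.
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(size=n)

def r2_report(y, y_hat):
    """Return (1 - SSE/SST, SSR/SST, cross term) for given fitted values."""
    resid = y - y_hat
    sse = np.sum(resid ** 2)
    ssr = np.sum((y_hat - y.mean()) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    cross = np.sum(resid * (y_hat - y.mean()))
    return 1 - sse / sst, ssr / sst, cross

# OLS fit: the normal equations hold, the cross term is (numerically) zero,
# and the two versions of R^2 agree.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(r2_report(y, X @ beta_ols))

# A fit that does NOT satisfy the OLS normal equations (a perturbed
# coefficient vector standing in for, say, an ML fit of a different model):
# the cross term is nonzero and the two formulas no longer agree.
beta_other = beta_ols + np.array([0.3, -0.2, 0.1, 0.0])
print(r2_report(y, X @ beta_other))
```

For the OLS fit the cross term is numerically zero and both $R^2$ formulas coincide; for the perturbed fit they do not, so "proportion of variance explained" no longer applies.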

1 Answer


First, the interpretation of $R^2$ rests on the decomposition formula $$S_T=S_R+S_e,$$ where $S_T=\sum\limits_{i=1}^n(y_i-\bar y)^2$, $S_R=\sum\limits_{i=1}^n(\hat y_i-\bar y)^2$ and $S_e=\sum\limits_{i=1}^n(y_i-\hat y_i)^2$. Writing the multiple linear regression in matrix form as $$Y_{n\times 1}=X_{n\times (p+1)}\beta_{(p+1)\times 1}+\varepsilon_{n\times 1},$$ the proof of this decomposition uses the normal equations satisfied by the OLS estimator $\hat\beta_{OLS}$, $$X^T(Y-X\hat\beta_{OLS})=0_{(p+1)\times 1},$$ as spelled out below.

Second, a maximum likelihood estimator $\hat\beta_{MLE}$ of $\beta$ does not satisfy these normal equations in general; for the linear model it coincides with $\hat\beta_{OLS}$ only when $\varepsilon_{n\times 1}\sim N(0_{n\times 1},\sigma^2 I_n)$. In particular, the fitted values of a SAR model estimated by maximum likelihood do not satisfy them, so $1-S_e/S_T$ can still be computed, but the decomposition, and with it the usual interpretation of $R^2$, breaks down.
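To spell out the step that uses the normal equations: with residual vector $\hat e=Y-X\hat\beta_{OLS}$ and fitted values $\hat Y=X\hat\beta_{OLS}$, the cross term in the expansion of $S_T$ is $$\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)=\hat e^T(\hat Y-\bar y\,\mathbf 1_n)=(X^T\hat e)^T\hat\beta_{OLS}-\bar y\,\mathbf 1_n^T\hat e=0,$$ because $X^T\hat e=0_{(p+1)\times 1}$ and the first column of $X$ is the intercept column $\mathbf 1_n$, so $\mathbf 1_n^T\hat e=0$. Hence $$S_T=\sum_{i=1}^n\big[(y_i-\hat y_i)+(\hat y_i-\bar y)\big]^2=S_e+S_R+2\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\bar y)=S_R+S_e.$$ When the fitted values come from a maximum likelihood fit of the SAR model, this cross term need not vanish, which is exactly the check suggested in the comments above.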

alma2004