
I've recently read about computing a pseudo-$R^2$ for ARIMA models. I know it's not the best way to assess a model's predictive power (I'm far more interested in out-of-sample measures), but that is hard to explain to business-major academics. So, to make my life easier, I'm going to include this statistic in my Master's thesis.

Coming back to $R^2$, many have pointed out that (if I understand correctly):

  • $R^2$ is not uniquely defined for non-linear regressions (e.g. ARMA with MA components)
  • the interpretation of $R^2$ can be misleading if the model has no drift.

Now, I have three ARIMA models to examine, and one of them has $\mu=0$ (the null hypothesis cannot be rejected even at the $0.1$% level, so I take it to be zero). Nevertheless, I computed the $R^2$ statistic for it as well.

Questions:

  1. Is my model with "no drift" unsuitable for $R^2$ computation? Why?
  2. Is it just a matter of precision, or is the value complete nonsense and useless in this case?
toyo10
  • I disagree with both of the bullet points. ARMA being a linear model, $R^2$ is uniquely defined (SSExplained/SSTotal). Drift or no drift, $R^2$ tells the share of variance explained by the model. I wonder who are those many who made these bullet points... – Richard Hardy Apr 28 '18 at 18:48
  • @Richard Hardy Thank you for your insight. I'm referring to [this](https://stats.stackexchange.com/a/8751/205019), and [this](https://stats.stackexchange.com/a/101580/205019) answers. What am I missing? – toyo10 Apr 29 '18 at 18:19
  • I find Rob J. Hyndman's thoughts on $R^2$ a little confusing. When we have a dependent variable $y$ and a linear model for it such as ARIMA (you can express $y$ as a linear combination of lagged autoregressive terms and moving-average terms even though it is not a regression model), I find it natural to stick to SSExplained/SSTotal definition of $R^2$, and Alecos Papadopoulos seems to agree with me. But I am confused by his notion of drift since in time series literature, a constant in ARIMA models is not called drift unless the order of integration is 1. – Richard Hardy May 01 '18 at 14:40
  • I also disagree with him that in-sample fit vs. out-of-sample forecasting performance is something specific to time series models. All models are aimed at generalizing beyond the sample. His discussion that follows also applies equally well to, say, the cross-sectional setting. The only problematic point I see is that in a model without the intercept, the $R^2$ no longer measures the ratio of explained variance around the mean to total variance but explained variance around zero to total variance. If the mean is not very close to zero, this becomes a problem. – Richard Hardy May 01 '18 at 14:42
  • I would suggest replacing "drift" with "intercept" in your post if what you mean is the constant in the ARIMA model. As I mentioned above, intercept becomes drift in ARIMA(p,1,q) models but not in other ones. – Richard Hardy May 01 '18 at 14:46
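
The point raised in the comments about models without an intercept can be illustrated with a small sketch. It is not an ARIMA fit: it assumes a plain AR(1) without intercept, estimated by least squares, as a stand-in, and the function names `ar1_no_intercept_r2` and `simulate_ar1` are hypothetical helpers made up for this illustration. It compares the $R^2$ measured around the sample mean with the one measured around zero, first on a zero-mean series and then on the same series shifted to a mean of 5.

```python
import random


def ar1_no_intercept_r2(y):
    """Fit y_t = phi * y_{t-1} + e_t by least squares (no intercept).

    Returns (phi, centered R^2, uncentered R^2), where the centered
    version measures total variation around the sample mean and the
    uncentered version measures it around zero.
    """
    target, lagged = y[1:], y[:-1]
    phi = sum(a * b for a, b in zip(target, lagged)) / sum(b * b for b in lagged)
    ss_res = sum((a - phi * b) ** 2 for a, b in zip(target, lagged))
    mean = sum(target) / len(target)
    ss_tot_centered = sum((a - mean) ** 2 for a in target)
    ss_tot_uncentered = sum(a * a for a in target)
    return phi, 1 - ss_res / ss_tot_centered, 1 - ss_res / ss_tot_uncentered


def simulate_ar1(n, phi, mean, seed=0):
    """Simulate an AR(1) series around `mean` with unit-variance Gaussian noise."""
    rng = random.Random(seed)
    y = [mean]
    for _ in range(n - 1):
        y.append(mean + phi * (y[-1] - mean) + rng.gauss(0, 1))
    return y


# Assumed illustration values: 500 observations, phi = 0.6, shift of 5.
zero_mean = simulate_ar1(500, 0.6, mean=0.0)
shifted = [v + 5.0 for v in zero_mean]

phi_z, r2c_z, r2u_z = ar1_no_intercept_r2(zero_mean)
phi_s, r2c_s, r2u_s = ar1_no_intercept_r2(shifted)

print("zero mean:", phi_z, r2c_z, r2u_z)
print("shifted:  ", phi_s, r2c_s, r2u_s)
```

When the series mean is close to zero the two versions should nearly coincide, so omitting the intercept does little harm to the interpretation. On the shifted series the uncentered value is inflated because it credits the model with "explaining" the level of the series, which is the problem described in the comment above.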

0 Answers