
I'm running auto.arima on a time series and I get a model, but its residuals are not normal (according to jarque.bera.test). Can I still use this model, or is it necessary to have normal residuals?

Also, do I have to use a stationary series with auto.arima? I'm using the non-stationary series in logs, as my class exercise did.

I'm using RStudio 1.1.463 on Windows 10.

emp.ts.log is the log-transformed time series.

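library(forecast)  # auto.arima(), Acf(), Pacf()
library(tseries)   # jarque.bera.test()

# Fit step (call assumed; the original post shows only the fitted object)
arima.auto <- auto.arima(emp.ts.log)
arima.auto                              # print the selected model
hist(arima.auto$residuals)              # residual histogram
Acf(arima.auto$residuals)               # residual autocorrelation
Pacf(arima.auto$residuals)              # residual partial autocorrelation
jarque.bera.test(arima.auto$residuals)  # normality test on the residuals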

I expected a p-value > 0.05 from the Jarque-Bera test, but I get:

data: arima.auto$residuals
X-squared = 4101.2, df = 2, p-value < 2.2e-16

Thank you so much

Claudia C

1 Answer

Why would you think that the residuals should be normal, or even independent of each other, or have a constant error variance over time? auto.arima does not perform tests of parameter significance, so there is no need for distributional concerns unless one is concerned about confidence limits for forecasts, as we all should be.
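To make the point about forecast limits concrete, here is a minimal sketch. It assumes the forecast package and uses the built-in AirPassengers series as a stand-in for your data (it is not your emp.ts.log). The default limits assume normal, uncorrelated residuals; the bootstrap = TRUE option of forecast() resamples the fitted residuals instead, which is one possible way to relax the normality assumption.

```r
library(forecast)

# Stand-in series (an assumption, not the asker's data): logged AirPassengers
y <- log(AirPassengers)
fit <- auto.arima(y)

# Default limits: assume normally distributed, uncorrelated residuals
fc_normal <- forecast(fit, h = 12, level = 95)

# Bootstrapped limits: simulate future paths by resampling the fitted residuals
set.seed(1)
fc_boot <- forecast(fit, h = 12, level = 95, bootstrap = TRUE, npaths = 2000)

# Width of the 95% interval at each horizon under the two approaches
widths <- data.frame(
  h         = 1:12,
  normal    = as.numeric(fc_normal$upper - fc_normal$lower),
  bootstrap = as.numeric(fc_boot$upper - fc_boot$lower)
)
print(widths)
```

If the two columns differ a lot, the normality assumption is doing real work in the default limits. Bootstrapping does not repair residuals that are still autocorrelated or contaminated by outliers.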

auto.arima simply fits a set of presumed models and picks the best of the set that was tried. That is fitting, not modelling in the larger sense.
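You can watch this happen with the trace argument. The sketch below assumes the forecast package and again uses logged AirPassengers as a hypothetical stand-in series; it prints every candidate that was fitted along with its information criterion (AICc by default).

```r
library(forecast)

# Stand-in series (an assumption, not the asker's data)
y <- log(AirPassengers)

# trace = TRUE lists each candidate model tried and its criterion value
fit <- auto.arima(y, trace = TRUE)

# The winner is simply the candidate with the smallest criterion among those tried
print(fit)
print(fit$aicc)
```

Nothing in that listing checks whether the winning model's residuals are free of structure; that diagnostic step is left to you.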

That is not what Box & Jenkins had in mind. This flow chart is closer: https://autobox.com/pdfs/ARIMA%20FLOW%20CHART.pdf (an iterative, self-checking process that culminates in separating the data into signal, the forecast, and noise, the random component).

The problem you are having is probably due to auto.arima not dealing well with deterministic structure in the series, which yields a few large errors and therefore a skewed (non-normal) residual distribution.

"The correlogram should be calculated from residuals using a model that controls for intervention administration, otherwise the intervention effects are taken to be Gaussian noise, underestimating the actual autoregressive effect."

In other words, for auto.arima to be useful you need the following circumstances:

1) a series with no pulses, level shifts, seasonal pulses, or deterministic time structure such as trends;

2) a series where the parameters of the underlying ARIMA model are constant over time;

3) a series where the error variance of the underlying ARIMA model does not change deterministically at different time points.

Failing one or more of these assumptions is probably what led to your result.
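A small simulation illustrates the first condition. This is a sketch under stated assumptions: the forecast and tseries packages, a made-up Gaussian AR(1) series, and three pulses whose positions and sizes are invented for illustration. The innovations are exactly normal, yet a handful of untreated pulses is enough to make the Jarque-Bera test reject.

```r
library(forecast)
library(tseries)

set.seed(42)
n <- 200
clean <- arima.sim(model = list(ar = 0.6), n = n)   # Gaussian AR(1), no anomalies

# Add three one-time pulses (hypothetical positions and magnitudes)
pulse_at <- c(50, 120, 170)
contaminated <- clean
contaminated[pulse_at] <- contaminated[pulse_at] + c(8, -10, 9)

fit_clean <- auto.arima(clean)
fit_cont  <- auto.arima(contaminated)

# Normality test on the residuals of each fit
p_clean <- jarque.bera.test(residuals(fit_clean))$p.value
p_cont  <- jarque.bera.test(residuals(fit_cont))$p.value
print(c(clean = p_clean, contaminated = p_cont))

# One way to flag suspect points before refitting (forecast's own heuristic)
flagged <- tsoutliers(contaminated)$index
print(flagged)
```

tsoutliers() is only one heuristic for locating pulses, and it does not look for level shifts or changes in variance, so treat it as a first look rather than a full intervention analysis.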

By the way, one doesn't willy-nilly use a power transform, as there can be negative side effects, i.e. unexpected consequences. See "When (and why) should you take the log of a distribution (of numbers)?"
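One data-driven check before committing to logs is sketched below. It assumes the forecast package and uses AirPassengers as a stand-in series; BoxCox.lambda() is that package's estimator of a Box-Cox parameter, and it is offered here as one possible diagnostic, not as the procedure the linked question recommends.

```r
library(forecast)

# Stand-in series on its original scale (an assumption, not the asker's data)
x <- AirPassengers

# Estimated Box-Cox parameter: a value near 0 points toward logs,
# a value near 1 points toward no transformation
lambda <- BoxCox.lambda(x)
print(lambda)

# The estimate can be passed to auto.arima instead of logging by hand
fit <- auto.arima(x, lambda = lambda)
print(fit)
```

Outliers and level shifts can distort the estimated parameter, so it makes sense to deal with those first.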

IrishStat