I am fitting a regression model with ARIMA errors in R using the Arima function from the forecast package. As I understand it, the function takes all predictors from the matrix that I assign to the xreg argument, so the regression is fitted using every one of them and the output is produced accordingly.

I appreciate that coefficients with high p-values are likely to have little impact on the overall outcome; however, I would like to understand how I could fit a stepwise regression using the Arima function.

On a side note, how would I go about fitting a regularised regression (LASSO or ridge) with ARIMA errors -- either through the Arima function, or by other means?

  • Stepwise is a terrible method of variable selection. This has been discussed here many times. – Peter Flom Jan 14 '19 at 12:21
  • @PeterFlom thank you for your comment -- I appreciate stepwise has some limitations, but my question is not about its advantages or disadvantages, it's about applying stepwise in the Arima function. – Dmitry Ishutin Jan 14 '19 at 13:35
  • @PeterFlom: What method of variable selection would you recommend instead of the stepwise method? – Isabella Ghement Jan 14 '19 at 15:34
  • If you use auto.arima() instead of Arima() to fit your model, shouldn't that automatically do the model selection for you based on the AIC criterion? – Isabella Ghement Jan 14 '19 at 15:36
  • @IsabellaGhement, `auto.arima` automatically selects the structure of _ARIMA_ as per the custom stepwise algorithm outlined in [this paper](https://www.jstatsoft.org/article/view/v027i03/v27i03.pdf). It does not do _stepwise regression_ with subsequent ARIMA fitting. – Dmitry Ishutin Jan 14 '19 at 16:01
  • Then why not do your stepwise first for a regression model fitted with gls(), find the optimal model, and then worry about the error correlation structure (if any is left in the model residuals)? If you find any correlation structure, re-fit the gls() model to allow for correlation structure. Note that gls() comes from the nlme package and, unlike lm(), allows for autocorrelated model errors. – Isabella Ghement Jan 14 '19 at 16:05
  • I don't think the OP was referring to ARMAX models but to the simpler ARIMA model (i.e. no user-specified exogenous variables). However, if one has an ARMAX model to build, it might be erroneous to do stepdown (assuming a white-noise error process underlying the t/F tests). One has to simultaneously resolve the X structure and the ARIMA component while dealing with latent (unspecified) deterministic structure waiting to be identified, à la Tsay. In short, don't discard variables until the error process is resolved and "proven" to be white noise, or at least free of identifiable structure. – IrishStat Jan 14 '19 at 16:26
  • Oops, apparently he was including regressors. My bad, but my comments still hold: stepdown (deleting one or more predictors) should be carefully avoided until the error process is free of structure. More importantly, detecting lag effects of the predictors should be part of the model augmentation; stepup comes to mind. – IrishStat Jan 14 '19 at 16:56
  • The best is substantive knowledge and a priori hypotheses. For automatic methods, I like LASSO. – Peter Flom Jan 14 '19 at 21:00
  • @IsabellaGhement, thank you for your input. I will try `gls()` as you suggested, but I have highly seasonal data with multiple seasonal cycles, so I know that the residuals will always be seasonally correlated; that is why I am fitting SARIMA. I am still looking for a tidy solution where both stepwise regression and ARIMA fitting could be done on the fly, in the way the `Arima` function from the `forecast` package does it, except that `Arima` only fits a standard regression with all input variables. – Dmitry Ishutin Jan 16 '19 at 10:53
  • @IrishStat, thank you for your comments -- I appreciate stepwise has some limitations, but my question is not about its advantages or disadvantages, it's about applying stepwise in an `Arima()`-like function. – Dmitry Ishutin Jan 16 '19 at 10:57
  • I don't think the arima function does stepdown or stepup to refine lag structure. – IrishStat Jan 16 '19 at 12:39
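
The LASSO suggestion in the comments above, combined with the question about regularised regression with ARIMA errors, could be approximated in two stages. The following is only a rough sketch on simulated data, not a method endorsed by anyone in this thread: it assumes the `glmnet` package is installed, that the predictor matrix `X` has column names, and that it is acceptable to run the LASSO while ignoring autocorrelation (the default cross-validation folds in `cv.glmnet` treat observations as exchangeable, which is questionable for time series). The variable names and the simulated data are illustrative assumptions.

```r
library(forecast)
library(glmnet)

# Simulated example data (assumption: 5 candidate predictors, AR(1) errors).
set.seed(42)
n <- 200
X <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- ts(2 * X[, "x1"] - 1.5 * X[, "x2"] + arima.sim(list(ar = 0.6), n = n))

# Stage 1: LASSO (alpha = 1) to screen predictors, ignoring autocorrelation.
# Use alpha = 0 for ridge, which shrinks but does not drop variables.
cvfit <- cv.glmnet(X, as.numeric(y), alpha = 1)
beta <- as.matrix(coef(cvfit, s = "lambda.1se"))
selected <- setdiff(rownames(beta)[beta[, 1] != 0], "(Intercept)")

# Stage 2: regression with ARIMA errors on the surviving predictors only.
fit <- if (length(selected) > 0) {
  auto.arima(y, xreg = X[, selected, drop = FALSE])
} else {
  auto.arima(y)
}

print(selected)
print(fit)
```

Note that the coefficients reported in stage 2 are unpenalised re-estimates; the penalty is used only for screening, so this is not a jointly estimated LASSO with ARIMA errors.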

1 Answer

Stepdown (deleting non-significant ARIMA structure) married with stepup remedies based upon the "current set of model errors", possibly incorporating both ARIMA structure and waiting-to-be-discovered intervention variables (https://pdfs.semanticscholar.org/09c4/ba8dd3cc88289caf18d71e8985bdd11ad21c.pdf), is quite useful. It is described in "Is it possible to automate time series forecasting?" and in "Interrupted Time Series Analysis - ARIMAX for High Frequency Biological Data?", where @Adamo gives his opinion.

Essentially, ARIMA model identification starts with the assumption of a simple mean model and then recursively adds structure that appears to be evidenced or needed. It proceeds until signal is separated from noise, including stepping up when there is evidence that the model error variance or the parameters are not constant over time. Modelling time series is much like peeling an onion to get to the heart of the matter, i.e. the underlying sufficient signal.
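
For the regressor side of the original question, a stepdown over the columns of `xreg` can be hand-rolled around `forecast::Arima`. The sketch below is a simplified illustration and not the full procedure described above: `step_down_xreg` is a hypothetical helper (it does not exist in `forecast`), the ARIMA order is held fixed rather than re-identified after each deletion, AICc is used as the criterion in place of t/F tests, and no stepup for lags or intervention variables is attempted. It also assumes `xreg` is a matrix with column names.

```r
library(forecast)

# Hypothetical helper (not part of forecast): backward elimination over the
# columns of xreg, keeping the ARIMA order fixed and comparing fits by AICc.
step_down_xreg <- function(y, xreg, order = c(1, 0, 0)) {
  keep <- colnames(xreg)
  best <- Arima(y, order = order, xreg = xreg[, keep, drop = FALSE])
  while (length(keep) > 0) {
    # Refit with each remaining predictor removed in turn.
    candidates <- lapply(keep, function(v) {
      rest <- setdiff(keep, v)
      if (length(rest) == 0) {
        Arima(y, order = order)
      } else {
        Arima(y, order = order, xreg = xreg[, rest, drop = FALSE])
      }
    })
    aicc <- vapply(candidates, function(f) f$aicc, numeric(1))
    if (min(aicc) >= best$aicc) break  # no single deletion improves AICc
    drop_idx <- which.min(aicc)
    best <- candidates[[drop_idx]]
    keep <- keep[-drop_idx]
  }
  list(fit = best, selected = keep)
}

# Simulated usage (assumption: only x1 truly matters, errors are AR(1)).
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- ts(2 * X[, "x1"] + arima.sim(list(ar = 0.5), n = n))
res <- step_down_xreg(y, X)
print(res$selected)
```

Because the error structure is not re-examined between deletions, a predictor can be dropped (or kept) for the wrong reason if the fixed order is a poor description of the errors; checking the residuals of `res$fit` before trusting the selection is advisable.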

IrishStat
  • thank you for your answer, but I don't see how it answers my question. – Dmitry Ishutin Jan 14 '19 at 13:39
  • https://stats.stackexchange.com/questions/380599/is-it-possible-to-automate-time-series-forecasting/380634#380634 should be very valuable in helping you formulate a modified ARIMA function since, to my knowledge, none exists in the world of free software. My answer was to some extent directed at modifying @PeterFlom's reflections. Perhaps we can chat offline and you can better detail what you would like to accomplish. If you want to emulate the flow diagram, you might have to write some code once you understand precisely how to do it. – IrishStat Jan 14 '19 at 14:46
  • Beautiful onion metaphor! It essentially applies to statistical modelling as a whole. – Isabella Ghement Jan 14 '19 at 15:37