A Function to select a forecast method

Question

I often have more than one time series to fit a model to. The forecast and forecastHybrid packages make it easy to fit a model to a single time series, but when the number of series is large, doing it one by one is a lot of work, because a method that does a good job on some series may do a bad job on others. To choose a model automatically and avoid that work, I wrote the following code. It works, but I would like to know:

Would I face any problems? Are there issues that I cannot see? (Sure, it is slow, but apart from that.)

Best

A

choose_model <- function(x, end_train, start_test) {
  library(forecast)
  library(forecastHybrid)
  library(tidyverse)

  #train data

  x_train <- window(x, end = end_train )

  x_test <- window(x, start = start_test)

  h1=length(x_test)
  #model1

  stlf(x_train,method="arima",s.window= 12, h=h1)-> fc_stlf

  #model2
  auto.arima(x_train, stepwise = FALSE, approximation = FALSE)%>%forecast(h=h1) -> fc_arima

  #model3
  set.seed(12345)#for nnetar model
  nnetar(x_train)%>%forecast(h=h1) -> fc_nnetar

  #model4
  snaive(x_train,h=h1)->fc_snaive

  #model5
  hybridModel(x_train,models = "anst",weights = c("equal"),errorMethod = c("RMSE", "MAE", "MASE"),verbose=FALSE)%>%forecast(h=h1) -> fc_hy

  #model6
  hybridModel(x_train,models = "an",weights = c("equal"),errorMethod = c("RMSE", "MAE", "MASE"),verbose=FALSE)%>%forecast(h=h1) -> fc_hy_2

  #model7

  ets(x_train)%>%forecast(h=h1)->fc_ets


  #model8

  holt(x_train, h=h1)->fc_holt

  #model9

  hw(x_train,seasonal = "additive",  h=h1)->fc_hw_ad

  #model10

  hw(x_train,seasonal = "multiplicative", h=h1)->fc_hw_mul

  #model11

  hw(x_train,seasonal = "additive",damped = TRUE,  h=h1)->fc_hw_ad_dam


  #model12 
  hw(x_train,seasonal = "multiplicative",damped = TRUE, h=h1)->fc_hw_mul_dam

  # accuracy: test-set RMSE of each forecast, in the order model1..model12
  forecasts <- list(fc_stlf, fc_arima, fc_nnetar, fc_snaive, fc_hy, fc_hy_2,
                    fc_ets, fc_holt, fc_hw_ad, fc_hw_mul, fc_hw_ad_dam,
                    fc_hw_mul_dam)
  rmse <- sapply(forecasts, function(fc) accuracy(fc$mean, x_test)[, "RMSE"])

  # print the model behind every forecast that attains the lowest RMSE
  for (i in which(rmse == min(rmse))) {
    print(forecasts[[i]]$model)
  }

}
choose_model(my_data, end_train = c(2018, 1), start_test = c(2018, 2))

Edit

Thank you for your answer, @Tim, and for your comment, @forecaster. Here is my new code. I would be happy to get some feedback. Thank you all.

choose_model <- function(x,h,reg,new_reg,end_train,start_test){
  library(forecast)
  library(forecastHybrid)
  library(tidyverse)

  #train data

  x_train <- window(x, end = end_train )

  x_test <- window(x, start = start_test)

  #train and test for regressors

  reg_train <- window(reg, end = end_train )

  reg_test <- window(reg, start = start_test) 

  h1=length(x_test)

  #model1

  stlf(x_train , method="arima",s.window= nrow(x_train),xreg = reg_train, newxreg = reg_test, h=h1)-> fc_stlf_xreg

  #model2
  auto.arima(x_train, stepwise = FALSE, approximation = FALSE,xreg=reg_train)%>%forecast(h=h1,xreg=reg_test) -> fc_arima_xreg

  #model3
  set.seed(12345)#for nnetar model
  nnetar(x_train, MaxNWts=nrow(x), xreg=reg_train)%>%forecast(h=h1, xreg=reg_test) -> fc_nnetar_xreg

  #model4
  stlf(x_train , method= "ets",s.window= 12, h=h1)-> fc_stlf_ets

  #Combination

  mod1 <- lm(x_test ~ 0 + fc_stlf_xreg$mean + fc_arima_xreg$mean + fc_nnetar_xreg$mean + fc_stlf_ets$mean)
  mod2 <- lm(x_test/I(sum(coef(mod1))) ~ 0 + fc_stlf_xreg$mean + fc_arima_xreg$mean + fc_nnetar_xreg$mean + fc_stlf_ets$mean)



  #model1

  stlf(x, method="arima",s.window= 12,xreg=reg, newxreg=new_reg, h=h)-> fc_stlf

  #model2
  auto.arima(x, stepwise = FALSE, approximation = FALSE,xreg=reg)%>%forecast(h=h,xreg=new_reg) -> fc_arima

  #model3
  set.seed(12345)#for nnetar model
  nnetar(x, MaxNWts=nrow(x), xreg=reg)%>%forecast(h=h, xreg=new_reg) -> fc_nnetar

  #model4
  stlf(x , method= "ets",s.window= 12, h=h)-> fc_stlf_e

  #Combination

  Combi <- (mod2$coefficients[[1]]*fc_stlf$mean + mod2$coefficients[[2]]*fc_arima$mean +
              mod2$coefficients[[3]]*fc_nnetar$mean + mod2$coefficients[[4]]*fc_stlf_e$mean)

  return(Combi)
}

Doesn't `auto.arima()` from the `forecast` package already choose the best model automatically for you based on AIC? — Mihael, Jun 25 '18 at 15:32
Ok, so long as you are aware that a flat forecast is not a problem in itself, as there is a reason for it - which you have stated yourself - when there is no trend or seasonality, a flat forecast is perfectly adequate. — Mihael, Jun 25 '18 at 15:56

Tim · Accepted Answer · 2018-06-30T11:28:40.043

Don't choose, aggregate. If you can afford to fit multiple forecasts, then you can also aggregate them. You can take a simple arithmetic average or do something more clever, but in both cases it should work well.
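As a minimal sketch of the simple-average idea, assuming the forecast package and using the built-in AirPassengers series purely as illustrative data (the three methods and the helper names `to_test_ts` and `rmse` are my own choices, not part of the question's code), one could combine point forecasts like this:

```r
library(forecast)

# Illustrative data only: hold out the last 12 months of AirPassengers
y       <- AirPassengers
y_train <- window(y, end = c(1959, 12))
y_test  <- window(y, start = c(1960, 1))
h       <- length(y_test)

# Fit a few standard methods on the training window
fcs <- list(
  arima  = forecast(auto.arima(y_train), h = h),
  ets    = forecast(ets(y_train), h = h),
  snaive = snaive(y_train, h = h)
)

# One column of point forecasts per method, one row per horizon
point <- sapply(fcs, function(fc) as.numeric(fc$mean))

# Hypothetical helper: put a numeric vector back on the test-period time index
to_test_ts <- function(v) {
  ts(v, start = start(y_test), frequency = frequency(y_test))
}

# Aggregate across methods with the mean and with the median
combined_mean   <- to_test_ts(rowMeans(point))
combined_median <- to_test_ts(apply(point, 1, median))

# Test-set RMSE of each individual method and of the two combinations
rmse <- function(f) sqrt(mean((y_test - f)^2))
print(c(sapply(fcs, function(fc) rmse(fc$mean)),
        mean   = rmse(combined_mean),
        median = rmse(combined_median)))
```

Which of the individual methods or combinations comes out ahead depends on the series, so the printed RMSE values are best read as a benchmark rather than a guarantee.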

The M4 forecasting competition has just ended (see Makridakis, Spiliotis and Assimakopoulos, 2018), and the results show that combinations of forecasts were the most common approach among the best-performing solutions. While the winning solution by Slawek Smyl was a hybrid approach that combined exponential smoothing and recurrent neural networks in a single model, the second-best solution, by Montero-Manso, Talagala, Hyndman, and Athanasopoulos, used a weighted combination of standard forecasting methods, where the weights were learned by the XGBoost algorithm. Many of the other top-scoring solutions used some kind of combined forecast. There was also another solution, by a team that included Rob Hyndman, that used the FFORMS algorithm implemented in their seer package, which chose among multiple forecasts using a random forest (you can hear more about both solutions here), but it ranked lower.

The results of M4 agree with many other similar competitions and surveys, which show that combinations of forecasts usually perform better than any single forecasting method alone. The same is true of most of the winning Kaggle solutions.

On the other hand, if you have multiple series that come from the same hierarchy (departments of a store, etc.), then it could be reasonable to use hierarchical forecasting (see the slides, book chapter, paper, and corresponding hts package by Rob Hyndman et al.), since it should work better than making only the individual forecasts.
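A rough sketch of what that might look like with the hts package follows. The data are simulated, and the two-store, two-department hierarchy, the series names, and the choice of `method = "comb"` with `fmethod = "ets"` are assumptions made only for illustration:

```r
library(hts)

set.seed(1)

# Simulated data: 4 bottom-level monthly series (2 stores x 2 departments)
bottom <- apply(matrix(rnorm(4 * 60), ncol = 4), 2, cumsum) + 100
colnames(bottom) <- c("AA", "AB", "BA", "BB")
bottom <- ts(bottom, start = c(2013, 1), frequency = 12)

# Hierarchy: total -> 2 stores -> 2 departments in each store
y <- hts(bottom, nodes = list(2, c(2, 2)))

# Forecast every level with ETS, then reconcile the forecasts so that the
# lower levels add up to the higher ones ("comb" = optimal combination)
fc <- forecast(y, h = 12, method = "comb", fmethod = "ets")

# Reconciled forecasts for all levels: total, 2 stores, 4 departments
all_fc <- allts(fc)
print(head(all_fc, 3))
```

The point of the reconciliation step is that the department forecasts sum to the store forecasts, and those sum to the total, which forecasting each series independently does not guarantee.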

Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2018). The M4 Competition: Results, findings, conclusion and way forward. International Journal of Forecasting.

+1 use simple median or mean to aggregate instead of machine learning, they work very well or at least consider them as benchmark. — forecaster, Jun 30 '18 at 12:50
Thank you for your answer Tim and for your comment @forecaster , my new code below as an answer. Thank you, and I would be happy if you could give me a feedback. — Econ_matrix, Jul 30 '18 at 15:40
