
I am teaching myself forecasting and wrote this sample script:

library(e1071)
library(forecast)
library(xts)

set.seed(3)
# Date index covering the next 100 days
Ex <- xts(1:100, Sys.Date() + 1:100)
# Two simulated series: i.i.d. normal draws with mean 123 and sd 3
data <- data.frame(Ex, matrix(rnorm(100 * 2, mean = 123, sd = 3), nrow = 100))
accu <- list()
# Fit five candidate models to the first simulated series;
# the list is named so that each plot gets a title
models <- list(
  auto.arima = auto.arima(data$X1),
  ets        = ets(data$X1),
  rwf        = rwf(data$X1),
  naive      = naive(data$X1),
  meanf      = meanf(data$X1)
)
# 30-day forecast with 75%, 85%, 95%, 97%, and 99% prediction intervals
layout(matrix(1:6, nrow = 3))
for (i in seq_along(models)) {
  fc <- forecast(models[[i]], h = 30, level = c(75, 85, 95, 97, 99))
  accu[[i]] <- accuracy(fc)
  plot(fc, main = names(models)[i])
}
accu

But whichever model I choose, I get flat and nearly identical forecasts from all of them. I expected the forecast values to change over the forecast horizon.

E.g:

  Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
1        123.0331 119.7031 126.3631 117.9118 128.1544
2        123.0331 119.7031 126.3631 117.9118 128.1544
3        123.0331 119.7031 126.3631 117.9118 128.1544
4        123.0331 119.7031 126.3631 117.9118 128.1544
5        123.0331 119.7031 126.3631 117.9118 128.1544


Where am I making the mistake?

Also, a small coding question: how can I change the x-axis from the number of days to dates (e.g., 2016-04-1)?

Richard Hardy
Eka
  • This construction `data.frame(Ex,matrix(rnorm(100*2,mean=123,sd=3), nrow=100))` is very bad. –  Jan 13 '16 at 07:22
  • @Pascal, could you indicate why? That would make your comment more helpful and valuable. – Richard Hardy Jan 13 '16 at 08:24
  • @RichardHardy Because you lose the time attributes of `Ex`. Compare `class(Ex)`, which gives `"xts" "zoo"`, with `sapply(data, class)`, which gives `"integer" "numeric" "numeric"`. This is clearly not what the OP thinks it is. One way to fix it is `data = cbind(Ex,matrix(rnorm(100*2,mean=123,sd=3), nrow=100))`, which preserves the time attributes: `class(data[,2])` then gives `"xts" "zoo"`. –  Jan 13 '16 at 08:36

1 Answer


Your data is white noise:

rnorm(100*2,mean=123,sd=3)

There is no trend, no seasonality, no autoregressive or moving average behavior, no conditional heteroskedasticity, no nothing.

Under these circumstances, there are no dynamics you can model. A flat line can be shown to be the optimal forecast for typical loss functions: for instance, it minimizes the expected Mean Squared Error (MSE). More specifically, the optimal forecast is the flat line at the overall historical mean, because the mean averages over all observations and so has lower variance than, say, the random walk forecast, which relies on the last observation alone.
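To make the comparison between the historical mean and the random walk concrete, here is a short derivation. It assumes, as in the simulation, that the observations $y_1,\dots,y_n$ and the future value $y_{n+h}$ are i.i.d. with mean $\mu$ and variance $\sigma^2$; these symbols are introduced here for illustration and do not appear in the question.

$$
\begin{aligned}
\text{mean forecast } \hat y_{n+h} = \bar y:\quad & E\big[(y_{n+h}-\bar y)^2\big] = \operatorname{Var}(y_{n+h}) + \operatorname{Var}(\bar y) = \sigma^2\Big(1+\frac{1}{n}\Big),\\
\text{naive forecast } \hat y_{n+h} = y_n:\quad & E\big[(y_{n+h}-y_n)^2\big] = \operatorname{Var}(y_{n+h}) + \operatorname{Var}(y_n) = 2\sigma^2.
\end{aligned}
$$

With $n = 100$ and $\sigma = 3$, the expected squared error is about $9.09$ for the mean forecast against $18$ for the naive forecast.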

All your forecasts are consistent with this. Three methods (naive, mean, random walk) give flat forecasts by construction: the mean method forecasts the historical mean, while the naive and random walk methods (which coincide here, since `rwf()` uses no drift by default) both forecast the last observation. ARIMA could in principle give nonconstant forecasts, but it correctly decides on an ARIMA(0,0,0) model with non-zero mean, that is, the historical mean. Similarly, ETS thinks an (A,N,N) model is best: additive error, no trend, no seasonality, which again gives a flat line.
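One way to check which models were selected is to inspect the fitted objects directly. The sketch below assumes the `forecast` package is installed and regenerates the first simulated series under the name `x` (a name introduced here for illustration). The selected models may depend on the package version, so no output is shown.

```r
library(forecast)

set.seed(3)
# Same draws as the first column of the question's matrix,
# because matrix() fills column by column
x <- rnorm(100, mean = 123, sd = 3)

fit_arima <- auto.arima(x)
fit_ets   <- ets(x)

arimaorder(fit_arima)  # (p, d, q) of the selected ARIMA model
coef(fit_arima)        # estimated coefficients, including any mean term
fit_ets$method         # label of the selected ETS model
```

If the order comes back as all zeros and the only coefficient is a mean, the ARIMA forecast is just the historical mean at every horizon.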


I'd encourage you to play around a bit. Simulate lots of data like you did. Then fit different models to it, and force the models to be nonconstant (e.g., by specifying nonconstant ARIMA and ETS models). Compare MSEs on holdout samples. You should find that, on average, the simple mean outperforms the more complex models.
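A minimal sketch of such an experiment follows. The number of replications, the forced AR(1) and ETS(A,A,N) specifications, and all variable names are assumptions chosen for illustration; any other nonconstant specification would serve equally well.

```r
library(forecast)

set.seed(1)
n_sim   <- 100  # number of simulated series (arbitrary choice)
n_train <- 100  # training length, as in the question
h       <- 30   # holdout length, matching the 30-day forecast

mse <- matrix(NA_real_, nrow = n_sim, ncol = 4,
              dimnames = list(NULL, c("mean", "naive", "ar1", "ets_aan")))

for (s in seq_len(n_sim)) {
  # White noise, split into a training part and a holdout part
  y     <- rnorm(n_train + h, mean = 123, sd = 3)
  train <- y[seq_len(n_train)]
  test  <- y[n_train + seq_len(h)]

  # Point forecasts: two flat benchmarks and two forced nonconstant models
  fc <- list(
    mean    = meanf(train, h = h)$mean,
    naive   = naive(train, h = h)$mean,
    ar1     = forecast(Arima(train, order = c(1, 0, 0), method = "ML"), h = h)$mean,
    ets_aan = forecast(ets(train, model = "AAN", damped = FALSE), h = h)$mean
  )

  # Holdout MSE of each method for this series
  mse[s, ] <- sapply(fc, function(f) mean((test - as.numeric(f))^2))
}

# Average holdout MSE per method across all simulated series
colMeans(mse)
```

Averaging over many simulated series matters: on any single series a more complex model can win by chance, so the comparison only becomes meaningful across replications.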

This is related, though in the context of real, not simulated, data: Is it unusual for MEAN to outperform ARIMA?

Stephan Kolassa