I have a question about the use of the bsts package. In general, my question is whether my approach is feasible, because my holdout MAPE is much worse than that of all the other approaches in my ensemble.
Here is my code.
library("bsts")
library("ggplot2")
library("reshape")
# split into test and train ------------------------------------------------------
date <- as.Date("2017-06-04")
horizon <- 105
model.data$DATUM <- as.Date(model.data$DATUM)
xtrain <- model.data[model.data$DATUM <= date,]
xtest <- model.data[model.data$DATUM > date,]
# building the first model ------------------------------------------------------
ss <- list()
ss <- AddSemilocalLinearTrend(ss, xtrain$ITEMS)
ss <- AddSeasonal(ss, xtrain$ITEMS, nseasons = 52,
                  season.duration = 7)
# V7 is a dummy variable for the one outlier
fit <- bsts(ITEMS ~ V7,
            data = xtrain,
            seed = 100,
            state.specification = ss,
            niter = 1500)
# validation --------------------------------------------------------------------
burn <- SuggestBurn(0.1,fit)
fcast.holdout <- predict(fit,
                         newdata = xtest,
                         horizon = horizon,  # spelled out; `h` only worked through partial argument matching
                         burn = burn)
validation.time <- data.frame("semi.local.linear.bsts" = as.numeric(fcast.holdout$mean),
                              "actual" = model.data[model.data$DATUM > date, "ITEMS"],
                              "datum" = model.data[model.data$DATUM > date, "DATUM"])
a <- melt(validation.time,id.vars = c("datum"))
ggplot(data = a,
       aes(x = datum, y = value, group = variable, color = variable)) +
  geom_point() +
  geom_line()
plot(fcast.holdout)
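For reference, a minimal sketch of how the holdout MAPE mentioned at the top could be computed. It assumes MAPE means the mean of |actual - forecast| / |actual|, expressed in percent. `mape` is a hypothetical helper written for illustration, not a bsts function, and the numbers are toy values rather than the shop data.

```r
# Hypothetical helper (not part of bsts): mean absolute percentage error, in percent.
mape <- function(actual, forecast) {
  # Lengths must match, and zero actuals would make the ratio undefined.
  stopifnot(length(actual) == length(forecast), all(actual != 0))
  mean(abs((actual - forecast) / actual)) * 100
}

# Toy numbers, only to show the call; not taken from the shop data.
actual   <- c(100, 200, 400)
forecast <- c(110, 180, 400)
mape(actual, forecast)
```

With the objects from the code above, the call would be `mape(validation.time$actual, validation.time$semi.local.linear.bsts)`.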
The data can be found here. They are daily sales data for a retail shop. Later I want to include some dummy variables, which are also part of the example data.
My main questions are:
Is the seasonal part correctly defined? I have an annual seasonality in my data and also a weekly pattern. However, the weekly pattern does not show up in the validation plot.
Why are the prediction intervals so wide? Should I change the trend part?
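To make the seasonality question concrete, here is a minimal sketch of a state specification that carries both patterns as separate components. It relies on the bsts documentation's description of `AddSeasonal`: `nseasons = 7` with the default `season.duration = 1` is a day-of-week effect on daily data, while `nseasons = 52, season.duration = 7` is a week-of-year effect that stays constant within each week. The series `y` is synthetic and only there to make the sketch runnable. Whether this specification improves the forecast on the shop data is untested.

```r
library(bsts)

# Synthetic daily series, only to make the sketch runnable (not the shop data).
set.seed(1)
n <- 3 * 364
day <- seq_len(n)
y <- 100 +
  10 * sin(2 * pi * day / 7) +     # weekly pattern
  20 * sin(2 * pi * day / 364) +   # annual pattern
  rnorm(n, sd = 5)

ss <- list()
ss <- AddSemilocalLinearTrend(ss, y)
# Day-of-week effect: 7 seasons, each lasting one day.
ss <- AddSeasonal(ss, y, nseasons = 7)
# Week-of-year effect: 52 seasons, each lasting 7 days.
ss <- AddSeasonal(ss, y, nseasons = 52, season.duration = 7)

# Number of state components in the specification (trend, weekly, annual).
length(ss)
```

The specification in my code above contains only the second of these two seasonal components, which may be why the day-of-week pattern is missing from the validation plot.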