I am working with daily data (variables include temperature, salinity, wind, etc.) from 2002-2013 (msts), and I want to identify the ARIMA equation describing the whole data set, while also considering covariates unique to each variable. I then want to use that ARIMA equation to predict each variable's values 7 days into the future from specific starting points in the data set. Thus, tbats is not appropriate for me.
First I need to define the "order" and "seasonal" values for my Arima equation using auto.arima. Additionally, I believe there is weekly, monthly and yearly seasonality in the data, so I am defining multiple "seasonal periods" for auto.arima. Starting with predicting the temperature variable values, I am using xreg in the auto.arima function to include specified covariates. I also want to include the fourier/fourierf function for the dummy variables of month and year, as well as a method for determining K.
Ultimately, my R code should handle the long-term data (2002-2013), three seasonal periods (weekly, monthly, yearly), and dummy variables (Month and Year), and should identify the K value for predictive purposes.
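To make the Fourier part concrete, here is a minimal base-R sketch of the kind of regressors I have in mind: K sine/cosine pairs for one seasonal period. The helper name fourier_terms, the period of 7 and K = 2 are illustrative assumptions of mine, not functions or values taken from the forecast package.

```r
# Hypothetical helper (not part of the forecast package): builds K sine/cosine
# pairs for a single seasonal period over time points 1..n.
fourier_terms <- function(n, period, K) {
  # keep K below period/2 so no column is all zero or a duplicate of another
  stopifnot(K >= 1, 2 * K < period)
  t <- seq_len(n)
  cols <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * t / period),
          cos(2 * pi * k * t / period))
  })
  X <- do.call(cbind, cols)
  # name columns S1-7, C1-7, S2-7, C2-7, ... (harmonic number, then period)
  colnames(X) <- paste0(rep(c("S", "C"), times = K),
                        rep(seq_len(K), each = 2), "-", period)
  X
}

# Illustrative values only: two weeks of daily data, weekly period, K = 2
X_week <- fourier_terms(n = 14, period = 7, K = 2)
dim(X_week)
```

Matrices built this way for each period (7, 30, 365.25) could presumably be joined with cbind() alongside the covariates and passed as xreg, which is what I am trying to do with fourier below.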
Although I have found a lot of great help online, I cannot get my more complex code to work. I have consulted http://robjhyndman.com/hyndsight/longseasonality/ (and other Hyndman posts on complex seasonality, forecasting, detecting seasonality, daily data, etc.) and the question "Time Series Forecasting with Daily Data: ARIMA with regressor".
I am using these packages for various parts of the code development:
library(MASS)
library(mgcv)
library(lattice)
library(epicalc)
library(caTools)
library(forecast)
library(McSpatial)
My simple R code below works, although it has only one seasonal period and includes no fourier function or dummy variables.
Season<- msts(PaulsData$Temperature, seasonal.periods=30)
auto.arima(Season,stepwise=TRUE,approximation=TRUE,xreg=Covariates)
# Temperature: ARIMA(4,1,0)(1,0,0)[30]
fit<-Arima(June2009EventA$Temperature,order=c(4,1,0),seasonal=c(1,0,0))
plot(forecast(fit, h=7)) # fit is not very dynamic
dev.off()
View(forecast(fit))
The complex code below has the equation components I think I need, but it does not work. The K part is where things keep getting hung up: "Error in fourier(y, K = i) : unused argument (K = i)". This happens any time I use fourier.
y<- msts(PaulsData$Temperature, seasonal.periods=c(7,30,365.25)) #Daily Temperature (w/NAs) 2002-2012
bestfit<-list(aicc=Inf) # Select K value
for(i in 1:25)
{
  fit <- auto.arima(y, xreg=fourier(y, K=i), seasonal=FALSE) # need the K-value, error in fourier function - K unused argument
  if(fit$aicc < bestfit$aicc)
    bestfit <- fit
  else break;
}
dummyMonth<- fourier(msts(PaulsData$Temperature,seasonal.periods=cbind(7,30,365.25), ts.frequency=30),K=bestfit) # need the K-value, error in fourier function - K unused argument
ZdummyMonth<- fourierf(msts(PaulsData$Temperature,seasonal.periods=cbind(7,30,365.25), ts.frequency=30, h=7),K=bestfit)
dummyYear<- fourier(msts(PaulsData$Temperature,seasonal.periods=cbind(7,30,365.25), ts.frequency=365.25),K=bestfit)
ZdummyYear<- fourierf(msts(PaulsData$Temperature,seasonal.periods=cbind(7,30,365.25), ts.frequency=365.25, h=7),K=bestfit)
fit<- auto.arima(y,xreg=cbind(dummyMonth,dummyYear,Covariates),seasonal=FALSE)
plot(forecast(fit, h=7))
Even if I skip the K selection and hard-code K = 5, as in the code below, it still does not work.
y<- msts(PaulsData$Temperature, seasonal.periods=c(7,30,365.25)) #Daily Temperature (w/NAs) 2002-2012
dummyMonth<- fourier(msts(PaulsData$Temperature,seasonal.periods=cbind(7,30,365.25), ts.frequency=30),K=5) # need the K-value, error in fourier function - K unused argument
ZdummyMonth<- fourierf(msts(PaulsData$Temperature,seasonal.periods=cbind(7,30,365.25), ts.frequency=30, h=7),K=5)
dummyYear<- fourier(msts(PaulsData$Temperature,seasonal.periods=cbind(7,30,365.25), ts.frequency=365.25),K=5)
ZdummyYear<- fourierf(msts(PaulsData$Temperature,seasonal.periods=cbind(7,30,365.25), ts.frequency=365.25, h=7),K=5)
fit<- auto.arima(y,xreg=cbind(dummyMonth,dummyYear,Covariates),seasonal=FALSE)
plot(forecast(fit, h=7))
Where:
Temperature = a column of values in deg C with x-rows
Month = matrix of zeros and ones with x-rows
Year = matrix of zeros and ones with x-rows
Covariates = columns of values in appropriate units (e.g. Salinity (ppt), Wind (m/s)) with x-rows
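Since I cannot post the real data here, a stand-in with the same layout can be generated as follows. Everything in it is synthetic and assumed for illustration only: the exact date range, the Date column, and all simulated values are made up and are not my measurements.

```r
# Synthetic stand-in for PaulsData and Covariates (made-up values, same layout)
set.seed(1)
dates <- seq(as.Date("2002-01-01"), as.Date("2012-12-31"), by = "day")
n <- length(dates)
day <- seq_len(n)

PaulsData <- data.frame(
  Date        = dates,
  # yearly cycle plus noise, in deg C
  Temperature = 15 + 8 * sin(2 * pi * day / 365.25) + rnorm(n, sd = 1.5)
)
# sprinkle in some missing values, since the real series also contains NAs
PaulsData$Temperature[sample(n, 50)] <- NA

Covariates <- cbind(
  Salinity = 30 + rnorm(n, sd = 2),            # ppt
  Wind     = abs(rnorm(n, mean = 5, sd = 2))   # m/s
)

str(PaulsData)
```

With these two objects defined, the code blocks above can be run as posted to reproduce the error.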
Thank you in advance for your help!