auto.arima() is essentially a wrapper around arima(), so it behaves like arima() in several respects. When covariates are entered through xreg, the arima() help page states: "...If an xreg term is included, a linear regression (with a constant term if include.mean is true and there is no differencing) is fitted with an ARMA model for the error term...". That is, it fits a linear model on the x's while imposing an ARMA structure on the errors. This differs from the ARMAX framework, so auto.arima() will not fit an ARMAX model.
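To make the distinction explicit (a sketch, using the notation of the equation below): the regression-with-AR(1)-errors model that arima() fits with xreg is

$$Y_t = \beta X_t + \eta_t, \qquad \eta_t = \theta\,\eta_{t-1} + \varepsilon_t,$$

which, after substituting out $\eta_t$, becomes $Y_t = \theta Y_{t-1} + \beta X_t - \theta\beta X_{t-1} + \varepsilon_t$. An ARMAX-type specification instead places the lagged response directly in the mean, $Y_t = \theta Y_{t-1} + \beta X_t + \varepsilon_t$, with no $-\theta\beta X_{t-1}$ term, so the two models are genuinely different.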
To fit an ARMAX(p, q) model, or any sub-class of it, you may want to try vector generalized linear models (VGLMs) applied to time series in R, part of my PhD work. In particular, my family function ARXff() estimates ARX models of the form
$Y_t - \theta Y_{t - 1} = \beta X_t + \varepsilon_t$ ...[1], by MLE using Fisher scoring. The following example assumes normal errors:
library(VGAM)       # provides vglm()
library(VGAMextra)  # provides ARXff()

set.seed(201802)
nn <- 140
x2 <- rnorm(nn)
y <- numeric(nn); y[1] <- 0
theta <- 0.25
beta <- 1.5
for (ii in 2:nn)
  y[ii] <- theta * y[ii - 1] + beta * x2[ii] + rnorm(1)

# Remove warm-up values.
ts.data <- data.frame(y = y[-c(1:100)], x2 = x2[-c(1:100)])

# Modelling function: vglm(); family function: ARXff()
fit1 <- vglm(y ~ x2, ARXff(order = 1, zero = c("coeff", "Var"),
                           type.EIM = "exact"),
             data = ts.data, trace = TRUE)
VGLM linear loop 1 : loglikelihood = -67.168821
VGLM linear loop 2 : loglikelihood = -61.822737
VGLM linear loop 3 : loglikelihood = -60.963229
VGLM linear loop 4 : loglikelihood = -60.943982
VGLM linear loop 5 : loglikelihood = -60.943972
VGLM linear loop 6 : loglikelihood = -60.943972
Checks on stationarity / invertibility successfully performed.
No roots lying inside the unit circle.
Further details within the 'summary' output.
> coef(fit1, matrix = TRUE)
ARdrift1 loge(noiseVar1) ARcoeff11
(Intercept) -0.086297 0.20932 0.26196
x2 1.339758 0.00000 0.00000
For comparison, this is what arima() returns:

with(ts.data, arima(y, order = c(1, 0, 0), xreg = x2))
Call:
arima(x = y, order = c(1, 0, 0), xreg = x2)
Coefficients:
ar1 intercept x2
0.386 -0.174 1.292
s.e. 0.170 0.290 0.187
sigma^2 estimated as 1.3: log likelihood = -62.02, aic = 132.05
Given the normality assumption, fit1 is equivalent to fitting a normal distribution whose mean is conditional on $x_2$ and $Y_{t - 1}$, as in [1] above. To see this, use the family function uninormal() from the VGAM package, as follows:
# WN.lags(), from VGAMextra, appends the lagged response as a covariate.
ts.data <- transform(ts.data, ARcoeff = WN.lags(cbind(y), lags = 1))
> fit2 <- vglm(y ~ x2 + ARcoeff, uninormal(var.arg = TRUE),
               data = ts.data, trace = TRUE)
VGLM linear loop 1 : loglikelihood = -69.732118
VGLM linear loop 2 : loglikelihood = -62.620483
VGLM linear loop 3 : loglikelihood = -61.012317
VGLM linear loop 4 : loglikelihood = -60.944089
VGLM linear loop 5 : loglikelihood = -60.943972
VGLM linear loop 6 : loglikelihood = -60.943972
> coef(fit2, matrix = TRUE)
mean loge(sd)
(Intercept) -0.086297 0.10466
x2 1.339758 0.00000
ARcoeff 0.261963 0.00000
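As a sanity check that needs only base R (the simulation below re-creates the same data as above; lm() gives the conditional least-squares fit, which under normal errors coincides with the conditional MLE for the mean coefficients):

```r
# Re-create the simulated data from the example above.
set.seed(201802)
nn <- 140
x2 <- rnorm(nn)
y <- numeric(nn); y[1] <- 0
theta <- 0.25; beta <- 1.5
for (ii in 2:nn)
  y[ii] <- theta * y[ii - 1] + beta * x2[ii] + rnorm(1)

# Conditional least-squares fit of [1]: regress y on x2 and its own lag,
# keeping only the post-warm-up observations.
keep <- 101:nn
d <- data.frame(y = y[keep], x2 = x2[keep], ylag = y[keep - 1])
fit.ls <- lm(y ~ x2 + ylag, data = d)
coef(fit.ls)
```

The coefficients on x2 and on the lagged response should land close to those reported by fit1 and fit2 above.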
But ARXff() is even broader: for example, you can model $\theta$ through VGLM link functions, such as logit(), if needed. I have also implemented ARXff() to work with the exact expected information matrices for the ARX model (see type.EIM = "exact" above).
However, the covariate effects in ARX models as above may not be as simple to interpret as in ordinary LMs; this is the downside of ARX models. For ease of interpretation, the regression model with ARMA errors is probably the most convenient choice, and this is how arima() seems to work in the presence of covariates. By the way, I have also implemented this approach through my family function ARIMAX.errors.ff().
ARMAXff() and ARIMAXff() have also been implemented accordingly, so you can fit ARMAX models similar to fit1. For regression models with ARMAX errors, however, ARIMAX.errors.ff() is the choice. All of these are incorporated in my package VGAMextra, an extension of VGAM in a few directions, including time series analysis. At the moment, VGAMextra concentrates on modelling and estimation; I will later incorporate features such as automatic forecasting.