I have two time series to work with, let's say $X_1$ and $X_2$.
First I have to estimate the best pure ARMA model for $X_1$, which is no problem. For that I perform the following steps:
- Stationarize (if needed) the time series by differencing
- Determine whether AR/MA terms are needed to correct any autocorrelation in the differenced series, and tentatively identify the maximum number of AR and/or MA terms using ACF and PACF plots
- Then estimate the candidate ARMA models up to those maximum orders, store their BIC values in a matrix, and select the ARMA model with the lowest BIC value.
Now comes the part that confuses me. I have to estimate an ARMAX model for $X_1$ that includes $X_2$ as a lagged explanatory variable.
- I don't think that including the exogenous variable will change the number of AR and MA terms for the best model relative to the model estimated before. For example, if ARMA(2,1) has the lowest BIC among the ARMA models, ARMAX(2,1) will also have the lowest BIC among the ARMAX models. Is that true?
- What is meant by including the 'lagged' explanatory variable? Does that mean the one-period lagged values of $X_2$? Or do I have to find the optimum number of lags for $X_2$ using a statistical technique?