I know I'm trekking down a well-beaten path with this type of question, but I've been trying to piece together several snippets from the internet and keep coming up empty-handed. There is one very similar question here, but it does not have an answer.
My situation is very similar to the linked question. That is:
- I'm working in Python, so rugarch and similar libraries are off the table
- I'd like to combine the outputs of an ARMA+GARCH model to make an estimate + CI
Most of the tutorials I see online in Python strike me as misguided, because they misspecify various things.
Here's some sample code to get an example working:
# imports
import pandas as pd
import yfinance as yf
import numpy as np
import pmdarima
import arch
# download data
ticker = yf.Ticker('^GSPC')
data = ticker.history(period = 'max')
vals = np.log(data.iloc[:, 3]).diff()
vals.iloc[0] = 0
# fit ARIMA and GARCH models
arima = pmdarima.auto_arima(vals)
residuals = arima.arima_res_.resid * 100
garch = arch.arch_model(residuals, p = 1, q = 1, dist = 'ged')
garch_fit = garch.fit()
garch_for = garch_fit.forecast(horizon = 1)
mean = arima.predict(n_periods = 1)[0]
I am familiar with Richard Hardy's advice to fit the ARMA and GARCH models simultaneously, but I am omitting that step for now.
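(For context, the joint version I have in mind would be something along these lines, using the arch package's built-in AR mean; the AR(1) lag choice here is purely illustrative:)
# joint AR + GARCH fit in a single model (sketch; AR(1) is just an illustrative lag choice)
joint = arch.arch_model(vals * 100, mean = 'AR', lags = 1, vol = 'GARCH', p = 1, q = 1, dist = 'ged')
joint_fit = joint.fit(disp = 'off')
joint_for = joint_fit.forecast(horizon = 1)
# joint_for.mean and joint_for.variance then come from the same fitted model
But as I said, I'm setting that aside and sticking with the two-step fit above.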
Now, with these fitted models in hand, I continually run into conflicting information.
The first is how you combine the outputs of both into a single prediction. What I frequently see online (in the Python ecosystem, using the above libraries) is that you take the ARMA prediction (the mean variable in this case) and then add it to the predicted mean from the GARCH model.
So in this case it would look something like this:
# ARMA prediction + GARCH mean prediction for the next time step, divided by 100 to undo the residual scaling
mean + garch_for.mean['h.1'].iloc[-1] / 100
This has to be wrong, right? For one, the GARCH model here is built on the assumption of a constant mean, so this value is always the same, and I don't see what effect adding it would have. I'd think you'd have to add the ARMA term to the forecasted variance instead. In this case it would look like:
# ARMA prediction + GARCH variance prediction for the next time step, divided by 100 to scale
mean + garch_for.variance['h.1'].iloc[-1] / 100
And the second is that it strikes me as odd that you would add this value and not subtract it as well. Variance does not have a particular direction, so wouldn't you need to both add and subtract to get the range of values?
Likewise, if we want a true confidence interval, shouldn't we take the square root of the variance to get a standard deviation? So something like sd = np.sqrt(garch_for.variance['h.1']), and then the confidence interval is mean +/- 1.96 * sd, or something like that?
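Spelled out with the objects above (and remembering that the residuals were multiplied by 100 before the GARCH fit, so the standard deviation needs to be divided by 100 to get back to the scale of vals), what I have in mind is roughly:
# one-step-ahead conditional standard deviation from GARCH, rescaled back to the scale of vals
sd = np.sqrt(garch_for.variance['h.1'].iloc[-1]) / 100
# rough 95% interval around the ARMA point forecast
# (1.96 is the normal quantile; with dist = 'ged' the exact quantile would differ)
lower, upper = mean - 1.96 * sd, mean + 1.96 * sd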
I know there are a lot of questions here on this very topic, but I don't see any of them framed in quite this way.