I have noticed that the SARIMAX model in statsmodels does not produce the fitted values I expect when it is specified as an ARMA(1, 1) model. The example below shows the discrepancy between the value I computed by hand and the value reported in fittedvalues.
Code:
import pandas as pd
import statsmodels.api as sm


def sarimax_model():
    # Four annual observations, 2000 to 2003
    index = pd.period_range(start='2000', periods=4, freq='A')
    original_observations = pd.Series([1.2, 1.5, 1.0, 0.8], index=index)

    # ARMA(1, 1) expressed as SARIMAX with order=(p, d, q) = (1, 0, 1)
    mod = sm.tsa.SARIMAX(original_observations, order=(1, 0, 1))
    res = mod.fit()

    print("Input data:\n", original_observations)
    print("Model parameters:\n", res.params, "\n")
    print("Model residuals:\n", res.resid, "\n")
    print("Fitted values:\n", res.fittedvalues, "\n")

    # Expected value for 2001
    # val_2001 = 0.948959 * 1.200000 + (-0.044637) * 1.200000
    # (.iloc[0] selects the first residual by position, not by label)
    val_2001 = res.arparams * res.data.endog[0] + res.maparams * res.resid.iloc[0]
    print("Expected fitted values for 2001:", "\n", val_2001, "\n")


if __name__ == '__main__':
    sarimax_model()
Output:
Model parameters:
ar.L1     0.948959
ma.L1    -0.044637
sigma2    0.121073
dtype: float64

Model residuals:
2000    1.200000
2001    0.367058
2002   -0.407083
2003   -0.167130
Freq: A-DEC, dtype: float64

Fitted values:
2000    0.000000
2001    1.132942
2002    1.407083
2003    0.967130
Freq: A-DEC, dtype: float64

Expected fitted values for 2001:
[1.08518626]
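I wonder whether I am missing something here or whether the SARIMAX model is simply incorrect: for 2001 it reports a fitted value of 1.132942, while my hand calculation gives 1.085186. The SARIMAX model does produce the answer I expect when it is specified as a pure AR model.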
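One possible cross-check, offered as a sketch and not as a statement of how statsmodels works internally: assume the filter behind SARIMAX starts from the stationary distribution of the process, so that the 2000 residual of 1.2 is a forecast error around a zero mean and not the true innovation. Under that assumption, the best linear forecast of 2001 given only the 2000 observation is the lag-1 autocorrelation of a stationary ARMA(1, 1) process times that observation. The numbers are copied from the output above; the variable names are only illustrative.

```python
# Estimates and first observation copied from the printed output above.
phi = 0.948959     # ar.L1
theta = -0.044637  # ma.L1
y0 = 1.2           # observation for 2000

# Hand calculation from the question: treats the first residual (1.2)
# as if it were the true innovation of the process.
naive = phi * y0 + theta * y0

# Lag-1 autocorrelation of a stationary, zero-mean ARMA(1, 1) process.
rho1 = (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta ** 2)

# ASSUMPTION: stationary initialization, so the one-step forecast of the
# second observation given only the first is rho1 * y0.
stationary = rho1 * y0

print("hand calculation:   ", round(naive, 4))       # roughly 1.0852
print("stationary forecast:", round(stationary, 4))  # roughly 1.1329
```

If the assumption holds, the second number should agree with the 2001 entry in fittedvalues (1.132942) up to rounding of the printed parameters, which would suggest that the gap comes from how the first observation is treated and not from an error in the estimates.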
Glad to join this community.
Solo :)