I am estimating several ARIMA(p,1,q) models for the logarithm of the realised volatility of the S&P 500. I set d = 1 based on the KPSS test, even though the ADF test rejects the presence of a unit root. Since this conflict between the two tests hints at long memory, I am also estimating a Heterogeneous Autoregressive (HAR) model (described in this paper) for the logarithm of the realised volatility. A HAR model can be seen as an AR model with restrictions on certain lags.
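To make the "restricted AR" statement concrete, here is a minimal sketch of a HAR regression estimated by OLS. It assumes the usual daily, weekly (5-day) and monthly (22-day) horizons, which may differ from the ones in the paper, and the function names `har_design`, `fit_har` and `implied_ar_coefficients` are hypothetical helpers, not part of any library.

```python
import numpy as np

def har_design(y, weekly=5, monthly=22):
    """Build HAR regressors for a series y (e.g. log realised volatility).

    Assumed horizons: 1 day, `weekly` days, `monthly` days.
    Returns the design matrix X (with constant) and the aligned target y[monthly:].
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    rows = range(monthly, n)
    daily = np.array([y[t - 1] for t in rows])                 # y_{t-1}
    week = np.array([y[t - weekly:t].mean() for t in rows])    # mean of last 5 values
    month = np.array([y[t - monthly:t].mean() for t in rows])  # mean of last 22 values
    X = np.column_stack([np.ones(n - monthly), daily, week, month])
    return X, y[monthly:]

def fit_har(y):
    """OLS estimate of (const, beta_daily, beta_weekly, beta_monthly) and residuals."""
    X, target = har_design(y)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta, target - X @ beta

def implied_ar_coefficients(beta, weekly=5, monthly=22):
    """Map HAR coefficients to the coefficients of the equivalent AR(22).

    phi[k] multiplies y_{t-1-k}; only three free parameters drive all 22 lags.
    """
    _, b_d, b_w, b_m = beta
    phi = np.full(monthly, b_m / monthly)  # every lag gets the monthly share
    phi[:weekly] += b_w / weekly           # lags 1..5 also get the weekly share
    phi[0] += b_d                          # lag 1 also gets the daily coefficient
    return phi
```

The 22 implied AR coefficients take only three distinct values (lag 1, lags 2 to 5, lags 6 to 22), which is the restriction referred to above.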
I would like to compare the goodness-of-fit of the ARIMA(p,1,q) models with that of the HAR model of the log-volatility. Since the ARIMA models are estimated for the first difference of the log-volatility while the HAR model is estimated for the log-volatility itself, I am assuming I cannot use any information criterion, because the log-likelihood functions are different.
As alternative goodness-of-fit measures, I was thinking of 1) computing and comparing the standardised RSS of both models, 2) undifferencing (how?) the ARIMA residuals and computing their RSS, or 3) differencing the HAR residuals and computing their RSS.
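A rough sketch of what option 2 could look like, under loud assumptions: an AR(p) fitted by OLS to the first difference stands in for a full ARIMA(p,1,q), and the function names are hypothetical. The one-step-ahead forecast in levels is the previous level plus the forecast difference, so the residual in levels can be computed directly and then compared with another model's residuals over a common sample.

```python
import numpy as np

def ari_one_step_residuals(y, p):
    """Fit AR(p) with constant to diff(y) by OLS (a stand-in for ARIMA(p,1,q)).

    Returns (diff_resid, level_resid), both aligned with y[p+1:].
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                          # dy[i] = y[i+1] - y[i]
    m = len(dy)
    lags = [dy[p - k:m - k] for k in range(1, p + 1)]  # dy lagged by k
    X = np.column_stack([np.ones(m - p)] + lags)
    target = dy[p:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    fitted_diff = X @ beta
    diff_resid = target - fitted_diff
    # One-step-ahead forecast of the level: previous level + forecast difference.
    level_forecast = y[p:-1] + fitted_diff
    level_resid = y[p + 1:] - level_forecast
    return diff_resid, level_resid

def rss_on_common_sample(resid_a, resid_b):
    """RSS of two residual series over their shared final observations.

    Assumes both series end at the same date, so trimming from the front aligns them.
    """
    k = min(len(resid_a), len(resid_b))
    a = np.asarray(resid_a, dtype=float)[-k:]
    b = np.asarray(resid_b, dtype=float)[-k:]
    return float(np.sum(a ** 2)), float(np.sum(b ** 2))
```

In this sketch the one-step-ahead residuals in levels coincide with the residuals of the differenced model, because y_t minus (y_{t-1} plus the forecast difference) is just the difference minus its forecast. This only concerns RSS over a common sample; it does not settle whether information criteria are comparable.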
A completely different alternative would be to estimate the HAR model for the first difference of the log-volatility. In that case I could compare the AIC of both models, but I would no longer be using the HAR model from the paper, and I could not give its coefficients the same economic interpretation.
Any hint is appreciated. Thanks.