
I am interested in the estimation of ARMA models. I understand that a popular approach is to write the model down in state space form and then maximize the likelihood using some optimization routine.

Question: Why rewrite the model into its state space representation and maximize the corresponding likelihood -- instead of maximizing the "naive" or "direct" likelihood?

(I could imagine that a different parameterization can make the optimization easier -- is that the case here?)
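To make the comparison concrete, here is a minimal sketch (Python with NumPy/SciPy; the function names are mine, and I assume a Gaussian AR(1)) of the two objectives: the "direct" likelihood evaluated as a multivariate normal density with Toeplitz autocovariance matrix, and the state space likelihood via the Kalman filter's prediction-error decomposition with exact (stationary) initialization. In this simple case the two coincide, so any practical difference would have to come from parameterization or initialization choices:

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.linalg import toeplitz

rng = np.random.default_rng(0)
phi, sigma2, n = 0.6, 1.0, 200

# simulate a stationary AR(1), drawing y[0] from the stationary distribution
y = np.empty(n)
y[0] = rng.normal(0.0, np.sqrt(sigma2 / (1 - phi**2)))
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal(0.0, np.sqrt(sigma2))

def direct_loglik(phi, sigma2, y):
    """'Direct' Gaussian likelihood: y ~ N(0, Gamma), Toeplitz autocovariance."""
    n = len(y)
    gamma = sigma2 / (1 - phi**2) * phi ** np.arange(n)   # gamma(k) for AR(1)
    return multivariate_normal(mean=np.zeros(n), cov=toeplitz(gamma)).logpdf(y)

def kalman_loglik(phi, sigma2, y):
    """Prediction-error decomposition via the Kalman filter.
    State: x_t = phi * x_{t-1} + eps_t; observation: y_t = x_t (no obs noise)."""
    a, P = 0.0, sigma2 / (1 - phi**2)   # exact (stationary) initialization
    ll = 0.0
    for yt in y:
        v, F = yt - a, P                # one-step prediction error and variance
        ll += -0.5 * (np.log(2 * np.pi * F) + v**2 / F)
        a = phi * (a + v)               # update (gain K = 1), then predict
        P = sigma2                      # filtered P is 0, so predicted P = sigma2
    return ll

print(direct_loglik(phi, sigma2, y), kalman_loglik(phi, sigma2, y))  # agree
```

Both functions evaluate the same exact Gaussian likelihood, only organized differently: the direct version via one $n \times n$ density, the Kalman version via $n$ univariate conditional densities.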

Related questions:

I am also aware of some general advantages and disadvantages of the state space representation as mentioned in "What are disadvantages of state-space models and Kalman Filter for time-series modelling?".

Richard Hardy
  • Simple reparametrizations are used in state space models in order to simplify estimation procedures, as described in the books by Durbin & Koopman and A. Harvey. Also, in comparison to ARIMA, the state space parametrization is richer: the ARIMA parametrization is the reduced form of the state space one and doesn't take into account the convergence to the steady state. – Cagdas Ozgenc Nov 01 '16 at 14:58
  • @CagdasOzgenc, oh, so then the change (improvement?) in estimation due to the state space (SS) representation is actually coming from reparameterization? That would explain some of my confusion. Could you give a very simple example of how the parameterization changes from direct likelihood to SS? E.g. in a simple AR(1) model I thought the model parameters are directly put into the SS without any changes (e.g. the slope $\varphi_1$ remains as is, it does not become some, say, $2\theta+1$ or whatever). But maybe I overlooked something. – Richard Hardy Nov 01 '16 at 15:29
  • Have you looked at these lecture notes: http://www.kris-nimark.net/pdf/Handout_S4.pdf ? Even if you write the model in state space form you will need to filter through the data using (e.g.) the Kalman filter. The prediction error can then be used to construct a likelihood function which needs to be maximized. So you will need to maximize a likelihood function one way or the other. – Plissken Nov 01 '16 at 16:01
  • @Plissken ...which is what I am trying to say here: *write the model down in the state space form and then maximize the likelihood of the model* [due to that form] *using some optimization routine*. So I understand that likelihood maximization is inevitable. The question is: are the direct likelihood and the likelihood of the state space representation the same or different? And if they are different, is that due to a reparameterization which makes one of the likelihood functions easier to maximize? – Richard Hardy Nov 01 '16 at 16:17
  • Not a full answer, but: 1) the alternatives don't always optimize the same likelihood; it is usually a likelihood conditional on the first few observations, whereas in state space form exact diffuse initialization is available, and 2) the Kalman filter naturally provides a recursion for the likelihood, but also a recursion for its gradient as a by-product, which is useful for optimisation. – Chris Haug Nov 01 '16 at 16:24
  • @ChrisHaug, it is possible to obtain exact/full MLE by specifying marginal densities for the initial values and including these in the likelihood function although these two are asymptotically equivalent. – Plissken Nov 01 '16 at 16:37
  • @ChrisHaug, as long as the likelihood is parameterized in the same way, the treatment of the initial value seems of minor importance, except perhaps in small samples. But is the likelihood function the same? And if not, it must be due to different parameterization (since the model is still the same, just the parameterization may differ). And that would make me revise my impression that the parameterization is the same in the direct vs. the state space likelihood function. – Richard Hardy Nov 01 '16 at 16:45
  • @RichardHardy, interesting question which I am gonna read up on. According to this article: http://www.lse.ac.uk/statistics/documents/researchreport40.pdf their parametrisation has computational advantages, but these are related to the Kalman filter recursions. Note that since the state space approach and the direct MLE give the same (or very similar) parameter estimates, the likelihood function should be the same. If it weren't, we wouldn't get the same estimates. – Plissken Nov 01 '16 at 17:02
  • @Plissken, thanks. I read the lecture notes. They require some effort, just like many other lecture notes on state space modelling that I have read earlier, and I am not getting all the insight I am wishing for. I was hoping someone here could give me the intuition instead. Now regarding direct likelihood vs. state space, of course the estimates should be the same -- because we are talking about two equivalent representations of a single model (the same model). Differences in estimates could result from poor optimization (getting stuck in local optima, etc.), but hopefully we can avoid that. – Richard Hardy Nov 01 '16 at 17:08
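Following up on the point about conditional vs. exact likelihoods raised in the comments, here is a small sketch (Python; function names are mine, assuming a Gaussian AR(1)): the two objectives differ only by the stationary marginal density of the first observation, an O(1) term against a sum that grows with $n$, which is why they are asymptotically equivalent yet can differ noticeably in small samples, especially with $\varphi$ near 1:

```python
import numpy as np

rng = np.random.default_rng(1)
phi, sigma2, n = 0.9, 1.0, 50

# simulate a stationary AR(1)
y = np.empty(n)
y[0] = rng.normal(0.0, np.sqrt(sigma2 / (1 - phi**2)))
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal(0.0, np.sqrt(sigma2))

def conditional_loglik(phi, sigma2, y):
    """'Naive' likelihood that conditions on the first observation."""
    r = y[1:] - phi * y[:-1]            # one-step residuals given y[t-1]
    return -0.5 * np.sum(np.log(2 * np.pi * sigma2) + r**2 / sigma2)

def exact_loglik(phi, sigma2, y):
    """Exact likelihood: add the stationary marginal density of y[0]."""
    v0 = sigma2 / (1 - phi**2)          # stationary variance
    marginal = -0.5 * (np.log(2 * np.pi * v0) + y[0]**2 / v0)
    return marginal + conditional_loglik(phi, sigma2, y)

# The gap between the two objectives is exactly the marginal term of y[0]
gap = exact_loglik(phi, sigma2, y) - conditional_loglik(phi, sigma2, y)
print(gap)
```

Since the gap does not grow with the sample size, maximizing either objective gives asymptotically equivalent estimates, but the maximizers need not coincide in finite samples.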

0 Answers