An MA model can, for example, take the form:

$$Y_t = \beta\epsilon_{t-1} + \epsilon_t.$$

Now, to estimate $\beta$ we need to find $\epsilon_t$. It can be obtained from the AR($\infty$) representation (which exists when $|\beta|<1$), truncated at lag $k$:

$$\epsilon_t \approx Y_t+ \sum_{i=1}^k(-\beta)^i Y_{t-i}.$$

Ideally $k$ would be infinite, but that is not possible, so some finite $k$ that the data can support has to be selected. Using least squares, the problem we have is:

$$\operatorname{minimize}_{\beta} \sum_t \left(Y_t+\sum_{i=1}^k(-\beta)^i Y_{t-i}\right)^2.$$

So we have an AR model, with non-linear parameters.
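To make the dependence on $k$ concrete, here is a minimal sketch of the truncated least squares problem above. It is only an illustration under stated assumptions: the data are simulated with an assumed $\beta = 0.6$, the lag weights are $(-\beta)^i$ (what repeated substitution of $\epsilon_{t-1} = Y_{t-1} - \beta\epsilon_{t-2}$ gives), and the minimiser is found by a plain grid search. The helper names (`simulate_ma1`, `truncated_sse`, `estimate_beta`) are hypothetical, not taken from any package.

```python
import random


def simulate_ma1(n, beta, seed):
    """Simulate Y_t = beta * eps_{t-1} + eps_t with standard normal eps."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [beta * eps[t] + eps[t + 1] for t in range(n)]


def truncated_sse(y, beta, k):
    """Sum of squared truncated residuals Y_t + sum_{i=1}^k (-beta)^i Y_{t-i}.

    The first k observations only serve as lags, so n - k residuals remain.
    """
    weights = [(-beta) ** i for i in range(k + 1)]  # weights[0] == 1
    total = 0.0
    for t in range(k, len(y)):
        resid = sum(w * y[t - i] for i, w in enumerate(weights))
        total += resid * resid
    return total


def estimate_beta(y, k, grid):
    """Grid-search minimiser of the truncated sum of squares."""
    return min(grid, key=lambda b: truncated_sse(y, b, k))


TRUE_BETA = 0.6  # assumed value, for illustration only
y = simulate_ma1(n=1000, beta=TRUE_BETA, seed=0)
grid = [j / 50 for j in range(-49, 50)]  # -0.98, -0.96, ..., 0.98

estimates = {k: estimate_beta(y, k, grid) for k in (1, 5, 20)}
for k, b in estimates.items():
    print(f"k={k:2d}: beta_hat={b:+.2f}, residuals used={len(y) - k}")
```

With $k=1$ the minimiser is essentially the lag-1 sample autocorrelation, which in large samples tends to $\beta/(1+\beta^2) \approx 0.44$ rather than $0.6$, so the estimate does change with $k$, while each extra lag costs one residual.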

Question: As we know, when estimating AR models, the sample available for estimation shrinks with the number of lags. Thus, for MA model, does the estimation result not highly depend on what $k$ is selected? For large $k$ we have better statistical properties, but less data for the estimation... is this correct? If it is, why do most statistical packages not show what $k$ was selected?

EDIT: Since there seems to be some confusion about what is being asked: look at the least squares minimization problem for the MA coefficient. Assume $k=1$; this would be one way to estimate the coefficient, but now we have an AR(1) model, which is not a very good estimate. Next assume $k = N-1$. Now we are getting closer to the true MA(1) model, but there is only one data point left for estimation, since the lags eat up the data. It seems that some choice has to be made between these two extremes, making the estimate non-unique. Unless I am wrong and we can max out $k$ without paying a penalty (in which case there would be a unique solution).

Richard Hardy
Dole
  • You are only considering first order moving average time series. There are higher order moving average time series models that can be considered. Each model estimates/predicts differently. – Michael R. Chernick Apr 03 '17 at 21:28
  • @MichaelChernick Yes, I used it as an example just for simplicity. But the same result applies generally for finite order MA models (=all of them). – Dole Apr 03 '17 at 21:31
  • What result is the same? – Michael R. Chernick Apr 03 '17 at 21:33
  • @MichaelChernick The result that the estimation of coefficients depends on k and as such is not unique. – Dole Apr 03 '17 at 21:35
  • There are methods to determine the order of the model by using the autocorrelation and partial autocorrelation functions. – Michael R. Chernick Apr 03 '17 at 21:37
  • @MichaelChernick Yes, but that's not related to this question. The question is about the estimation of the MA coefficient(s). – Dole Apr 03 '17 at 21:38
  • I think it is related because the question mentions uniqueness. – Michael R. Chernick Apr 03 '17 at 21:41
  • Ok. I'm quite confused what you're asking. (1) What do you precisely mean by "estimation result." (2) What are two, precisely defined situations where you'd like to know if the "estimation result" is the same? – Matthew Gunn Apr 03 '17 at 21:49
  • Why do you consider estimation via the AR($\infty$) (or AR($k$)) representation? It is not how MA models are typically estimated, or is it? – Richard Hardy Apr 04 '17 at 06:32

1 Answer

> Thus, for MA model, does the estimation result not highly depend on what $k$ is selected? For large $k$ we have better statistical properties, but less data for the estimation... is this correct?

Yes, it is correct that different $k$ could be selected, yielding different estimates of the errors and subsequently different estimates of the MA parameter(s).
(Instead of saying "have better statistical properties, but less data for the estimation" I would say "have lower model error but higher estimation error" or "have lower bias but higher variance".)

> If it is, why do most statistical packages not show what $k$ was selected?

This is because MA($q$) models are not typically estimated this way in practice. For example, R's `arima` function computes the exact likelihood via a state-space representation of the ARIMA process, with the innovations and their variance found by a Kalman filter.
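As a rough illustration of why no $k$ appears in such an approach, here is a sketch of an exact Gaussian likelihood for a zero-mean MA(1) that uses every observation. This is not R's code: it assumes the innovations recursion as a stand-in for the Kalman filter (for this model the two should yield the same one-step prediction errors and variances), simulated data with an assumed $\beta = 0.6$, and a simple grid search restricted to the invertible region $|\beta|<1$. The function names are hypothetical.

```python
import math
import random


def simulate_ma1(n, beta, seed):
    """Simulate Y_t = beta * eps_{t-1} + eps_t with standard normal eps."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [beta * eps[t] + eps[t + 1] for t in range(n)]


def concentrated_neg2_loglik(y, beta):
    """-2 * exact Gaussian log-likelihood of a zero-mean MA(1), up to an
    additive constant, with the innovation variance concentrated out."""
    n = len(y)
    r = 1.0 + beta * beta  # scaled variance of the first prediction error
    pred = 0.0             # one-step prediction of the first observation
    sum_log_r = 0.0
    weighted_ss = 0.0
    for obs in y:
        err = obs - pred                         # one-step prediction error
        sum_log_r += math.log(r)
        weighted_ss += err * err / r
        pred = beta * err / r                    # prediction of the next obs
        r = 1.0 + beta * beta - beta * beta / r  # its scaled error variance
    sigma2_hat = weighted_ss / n
    return n * math.log(sigma2_hat) + sum_log_r


TRUE_BETA = 0.6  # assumed value, for illustration only
y = simulate_ma1(n=1000, beta=TRUE_BETA, seed=0)
grid = [j / 50 for j in range(-49, 50)]  # invertible region only
beta_hat = min(grid, key=lambda b: concentrated_neg2_loglik(y, b))
print(f"beta_hat={beta_hat:+.2f} using all {len(y)} observations")
```

Every observation contributes a prediction error, and the recursion conditions on the entire available past at each $t$, so there is no truncation lag to choose or report.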

Richard Hardy
  • Thank you! Is there a source where the estimation process is described in more detail? – Dole Apr 04 '17 at 06:48
  • 1
    @Dole, I do not remember any good source, so I would google it, but then you can do it yourself. When it comes to MA or ARMA (rather than pure AR) models, estimation seems pretty complicated... I was trying to grasp it better myself (see [this](http://stats.stackexchange.com/questions/243564/estimation-of-arma-state-space-vs-alternatives) question), but did not get very far due to time limitations I faced. You will find some nice links there, though (also in the comments). – Richard Hardy Apr 04 '17 at 06:51
  • @Dole, I edited my comment above. – Richard Hardy Apr 04 '17 at 06:54