
There are many ARMA(p,q) processes, with different parameter values, associated with the same autocorrelation function. If these processes are also Gaussian, this implies they follow exactly the same distribution.

So, for a set of data points, there are usually multiple parameter values associated with the same (maximum) likelihood. Only one of these processes is also invertible, and we choose that particular estimate by restricting the parameter space to that of invertible processes (making the model identifiable).
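As a concrete illustration (a standard MA(1) example, not specific to any dataset): for $y_t = w_t + \theta w_{t-1}$ with white-noise variance $\sigma_w^2$, the lag-1 autocorrelation is
$$
\rho(1) = \frac{\theta}{1+\theta^2},
$$
and since $\dfrac{1/\theta}{1+1/\theta^2} = \dfrac{\theta}{1+\theta^2}$, the parameters $\theta$ and $1/\theta$ produce the same autocorrelation function; only the model with $|\theta|<1$ is invertible.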

If the processes are not Gaussian, it is still true that there are multiple processes with different parameters all having the same autocorrelation function, but I don't think this implies they follow the same distribution, as it did before.

So, for a set of data points, I wouldn't expect to have multiple points in the parameter space with the same maximum likelihood. Is this correct?

Restricting the parameter space was useful with Gaussian processes since it was a way of choosing the invertible representation of the process among all those having the same maximum likelihood. Now, with non-Gaussian processes, the possibly unique maximum likelihood estimate could be an invertible process or not. If it isn't, we can't actually choose an invertible representation with the same maximum likelihood.

In this case, is it a problem to have a process that isn't invertible?

Marco Rudelli

1 Answer


Yes, for non-Gaussian white noise, the distribution of the data may be different for an invertible model and its non-invertible counterpart(s). For example, consider the invertible MA(1) model $$ y_t = w_t + \frac12 w_{t-1} \tag{1} $$ where the white noise process $w_t \sim U(-1,1)$, and the non-invertible model $$ y_t = w_t + 2 w_{t-1} \tag{2} $$ where $w_t \sim U(-1/2,1/2)$. Although these processes have the same autocovariance functions, the joint distributions of, for instance, $(y_1,y_2)$ are different, as demonstrated by the following simulation.

set.seed(1)  # for reproducibility
n <- 1e+4
par(mfrow=c(1,2))
# Invertible model (1): theta = 1/2, w_t ~ U(-1,1)
s <- 1
w0 <- runif(n,-s,s)
w1 <- runif(n,-s,s)
w2 <- runif(n,-s,s)
y1 <- w1 + .5*w0
y2 <- w2 + .5*w1
plot(y1,y2,pch=".")
# Non-invertible counterpart (2): theta = 2, w_t ~ U(-1/2,1/2)
s <- .5
w0 <- runif(n,-s,s)
w1 <- runif(n,-s,s)
w2 <- runif(n,-s,s)
y1 <- w1 + 2*w0
y2 <- w2 + 2*w1
plot(y1,y2,pch=".")
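That the two models really do share the same autocovariances can be checked directly from the standard MA(1) formulas $\gamma(0) = \sigma_w^2(1+\theta^2)$ and $\gamma(1) = \sigma_w^2\theta$ (a quick sanity check, not part of the original answer; the helper name `ma1_acov` is just for illustration):

```r
# Theoretical autocovariances of an MA(1) process y_t = w_t + theta*w_{t-1}
ma1_acov <- function(theta, sigma2) {
  c(gamma0 = sigma2 * (1 + theta^2), gamma1 = sigma2 * theta)
}

# Model (1): theta = 1/2, w_t ~ U(-1,1), so sigma2 = (1-(-1))^2/12 = 1/3
acov1 <- ma1_acov(theta = 1/2, sigma2 = 1/3)
# Model (2): theta = 2, w_t ~ U(-1/2,1/2), so sigma2 = 1^2/12 = 1/12
acov2 <- ma1_acov(theta = 2, sigma2 = 1/12)

acov1  # gamma0 = 5/12, gamma1 = 1/6
acov2  # identical autocovariances
stopifnot(isTRUE(all.equal(acov1, acov2)))
```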

Given enough data, you should thus in principle be able to distinguish between models such as (1) and (2) and relying on (2) e.g. for prediction would of course be perfectly fine. But the exact likelihood seems rather hard to compute. Also, depending on the exact distribution of the white noise, the difference between the resulting joint distributions of the data may be much less obvious than in this toy example. But, yes, this sort of thing is something people have looked into.

Jarle Tufto
  • So is it a problem to have a process that isn't invertible? – Richard Hardy Dec 22 '21 at 09:22
  • @RichardHardy No, such non-Gaussian, non-invertible models seem to be in wide use if you look into the literature. – Jarle Tufto Dec 22 '21 at 09:45
  • Thank you, that was helpful! – Richard Hardy Dec 22 '21 at 10:23
  • @JarleTufto thank you! this example shows really well what I was kind of thinking about when writing the question. So, would you say that invertibility is necessary only for Gaussian processes, with the main goal of assuring identifiability? – Marco Rudelli Dec 24 '21 at 04:50
  • @JarleTufto Would you also kindly take a look at this related question I wrote later? (https://stats.stackexchange.com/q/557948/318559) the book by Box and Jenkins seems to make a big deal about invertibility and I don’t get why... – Marco Rudelli Dec 24 '21 at 04:56