
Let us assume I have a time series made of the following observations:

ts = c(163, 18, 53, 189, 243, 101, 150, 39, 60, 96, 36, 76, 71, 67, 56, 3, 72, 96, 15, 19)

How can I determine whether it satisfies the Markov property, and how can I estimate the order of the Markov chain? Is there a source where I can find a guide?

Are there any good Python or R solutions? The data I am dealing with consists of about 500K observations.

Thanks

  • Not sure what the formal way is to do this, but you could try to predict $X_t$ given (1) $X_{t-1}$ and (2) the entire vector. If the quality of your predictor for (1) is the same as for (2), then you have reason to believe your time series respects the Markov property. – philbo_baggins Oct 10 '20 at 02:47
  • That's indeed a nice idea, but doing so for each of the 500k observations might be computationally challenging. Valerio, are your observations continuous or discrete? Is your vector ts a sample drawn from your 500k observations, or is it the set of possible values that your 500k observations can take? – Camille Gontier Oct 10 '20 at 14:37
  • @philbo_baggins do you mean using the empirical transition matrix observed from the studied time series? Do you suggest simulating the series with a Monte Carlo approach, where the simulated series are generated from the empirical transition matrix? – Valerio Ficcadenti Oct 13 '20 at 16:36
  • @CamilleGontier the observations are discrete; they come from a set of states made of the natural numbers in [1, 247]. I have changed the sample posted because it was not accurate. It could be a subset, but I have invented it. Anyway, out of 500K observations, it is plausible to observe that sampled vector. :( – Valerio Ficcadenti Oct 13 '20 at 16:38

1 Answer


I think this paper should answer your question: https://www.econstor.eu/handle/10419/2673

The starting point of the procedure is indeed to estimate the transition matrix from your observations, for instance by maximum likelihood. Then, you have to check the following two Markov properties:
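To make the estimation step concrete, here is a minimal sketch in plain Python. It assumes a first-order chain with discrete states, in which case the maximum likelihood estimate of each transition probability is the observed count of the transition i → j divided by the number of times state i was left. The function name `estimate_transition_matrix` and the toy sequence are made up for illustration; they are not taken from the paper or from your data.

```python
from collections import Counter, defaultdict

def estimate_transition_matrix(seq):
    """Return {i: {j: P_hat(j | i)}} estimated from consecutive pairs.

    Assumes a first-order chain: P_hat(j | i) = n_ij / n_i, where n_ij
    counts transitions i -> j and n_i counts departures from state i.
    """
    pair_counts = Counter(zip(seq[:-1], seq[1:]))
    row_totals = Counter(seq[:-1])
    probs = defaultdict(dict)
    for (i, j), n_ij in pair_counts.items():
        probs[i][j] = n_ij / row_totals[i]
    return dict(probs)

# Toy sequence, invented for illustration only.
seq = [1, 2, 1, 2, 2, 1, 3, 1, 2, 3]
P = estimate_transition_matrix(seq)
print(P[1])  # estimated distribution of the next state after state 1
```

Storing the estimate as a dictionary of dictionaries keeps only the transitions that were actually observed, which may matter with 247 states and a sparse set of observed pairs.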

  1. You have to verify that your process is stationary, i.e. that the transition probabilities do not change over time. This can be done by splitting your observations into sub-chunks, estimating the transition matrix for each of these subsets, and testing whether the estimates differ significantly.
  2. You have to verify that your process is memoryless, i.e. that your estimates of the transition probabilities do not change when the values of earlier states are also taken into account.
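As a rough illustration of the first check, the sketch below splits the series into equal-sized chunks and computes a Pearson chi-squared statistic comparing each chunk's transition counts with those expected under a single pooled transition matrix. This is a generic textbook-style construction, not necessarily the exact statistic used in the paper; the function name `homogeneity_chi2` and the default of two chunks are my own assumptions.

```python
from collections import Counter

def transition_counts(seq):
    """Count transitions i -> j between consecutive observations."""
    return Counter(zip(seq[:-1], seq[1:]))

def homogeneity_chi2(seq, n_chunks=2):
    """Pearson chi-squared statistic and degrees of freedom for the null
    hypothesis that all chunks share the same transition probabilities.
    Transitions across chunk boundaries and any remainder are ignored."""
    size = len(seq) // n_chunks
    chunks = [seq[k * size:(k + 1) * size] for k in range(n_chunks)]
    counts = [transition_counts(c) for c in chunks]
    pooled = sum(counts, Counter())
    stat, dof = 0.0, 0
    for i in {a for a, _ in pooled}:
        targets = [j for (a, j) in pooled if a == i]
        row_tot = [sum(c[(i, j)] for j in targets) for c in counts]
        used = [k for k, t in enumerate(row_tot) if t > 0]
        pooled_tot = sum(row_tot)
        for k in used:
            for j in targets:
                # Expected count if chunk k followed the pooled estimate.
                expected = row_tot[k] * pooled[(i, j)] / pooled_tot
                stat += (counts[k][(i, j)] - expected) ** 2 / expected
        dof += (len(used) - 1) * (len(targets) - 1)
    return stat, dof

# Toy example: both halves are identical, so the statistic is zero.
stat, dof = homogeneity_chi2([1, 2, 1, 1, 2, 2] * 2)
```

A p-value could then be obtained from the chi-squared survival function, for example `scipy.stats.chi2.sf(stat, dof)` if SciPy is available. With 247 states many expected counts will be small, so the chi-squared approximation should be treated with caution.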

These statistical tests are described in the paper (and are based on a chi-squared test).
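For the memorylessness check, one possible chi-squared construction (again a generic sketch, not necessarily the paper's exact procedure) looks at triples of consecutive observations. For each current state, it builds a contingency table of previous state versus next state and tests whether the two are independent, as they should be under a first-order chain. The name `first_order_chi2` is hypothetical.

```python
from collections import Counter

def first_order_chi2(seq):
    """Pearson chi-squared statistic and degrees of freedom for the null
    hypothesis that, given the current state, the next state does not
    depend on the previous state (first order versus second order)."""
    triples = Counter(zip(seq[:-2], seq[1:-1], seq[2:]))
    stat, dof = 0.0, 0
    for j in {t[1] for t in triples}:
        # Table for current state j: rows = previous state, cols = next state.
        sub = {(i, k): n for (i, m, k), n in triples.items() if m == j}
        prev_tot, next_tot = Counter(), Counter()
        for (i, k), n in sub.items():
            prev_tot[i] += n
            next_tot[k] += n
        total = sum(sub.values())
        for i in prev_tot:
            for k in next_tot:
                expected = prev_tot[i] * next_tot[k] / total
                stat += (sub.get((i, k), 0) - expected) ** 2 / expected
        dof += (len(prev_tot) - 1) * (len(next_tot) - 1)
    return stat, dof

# Toy example: in the pattern 1, 1, 2, 2, ... the next state is fully
# determined by the previous two states, so the statistic is large.
stat_order, dof_order = first_order_chi2([1, 1, 2, 2] * 5)
```

A large statistic relative to its degrees of freedom suggests that one step of memory is not enough. The same idea extends to higher orders by conditioning on longer histories, but the number of cells grows quickly: with 247 states there are 247³ possible triples, far more than 500K observations can populate.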

Hope this helps!

Camille Gontier
  • Unfortunately, I have found that this type of solution is somewhat questioned [here](https://stats.stackexchange.com/a/250519/298613) :''( @Camille Gontier – Valerio Ficcadenti Oct 20 '20 at 17:27