
I am predicting electric load data with different deep learning models and I am trying to quantify the predictability of the data. So far I have come across permutation entropy (PE) as a measure of the complexity of the data, with the normalized PE ranging from 0 for deterministic data (fully predictable) to 1 for random noise (hardly predictable). I found several packages implementing this measure (pyEntropy, Permutation-Entropy and ordpy). They all require the time series, an embedding dimension and an embedding delay as input. The time series part is clear. The embedding delay is the gap between subsequent windows of the time series. (In my case I use 24 as the embedding delay.)
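
To make sure I am computing the right quantity, here is a minimal numpy sketch of the normalized permutation entropy as I understand it (my own implementation, not taken from any of the packages; the parameter names `m` and `tau` are mine):

```python
import math
from collections import Counter

import numpy as np

def normalized_permutation_entropy(x, m=3, tau=1):
    """Normalized permutation entropy of a 1-D series.

    m   : embedding dimension (number of samples per ordinal pattern)
    tau : embedding delay (spacing between those samples)
    Returns a value in [0, 1]: 0 if only one pattern ever occurs,
    1 if all patterns are equally likely (white noise).
    """
    x = np.asarray(x, dtype=float)
    counts = Counter()
    for i in range(len(x) - (m - 1) * tau):
        window = x[i : i + m * tau : tau]          # delay vector of length m
        counts[tuple(np.argsort(window))] += 1     # its ordinal pattern
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    h = -np.sum(p * np.log(p))                     # Shannon entropy of the pattern distribution
    return h / math.log(math.factorial(m))         # normalize by log(m!)

rng = np.random.default_rng(0)
print(normalized_permutation_entropy(rng.normal(size=10_000)))  # close to 1 (noise)
print(normalized_permutation_entropy(np.arange(10_000)))        # 0 (monotone ramp, single pattern)
```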

However, I do not really understand the intuition behind the embedding dimension and why it is needed when calculating the permutation entropy. Many references cite Bandt and Pompe (2002), who recommend that the embedding dimension should lie between 3 and 7. From a very illustrative example on how to use permutation entropy to determine predictability here, I gather that the embedding dimension is some sort of sample size from which permutations are created and counted. But I still do not really understand the concept behind it.
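
For example, with an embedding dimension of 3 and a delay of 1, this is what I think the ordinal patterns look like for a short toy series (the series and the argsort encoding are just my own illustration):

```python
import numpy as np

x = np.array([4, 7, 9, 10, 6, 11, 3])  # short toy series, values chosen arbitrarily
m, tau = 3, 1                           # embedding dimension and delay
for i in range(len(x) - (m - 1) * tau):
    window = x[i : i + m * tau : tau]
    # np.argsort gives the order of the samples, i.e. the ordinal pattern
    print(window, "->", tuple(np.argsort(window)))
# With m = 3 there are 3! = 6 possible patterns; with m = 7 there are 5040.
```

So, as far as I can tell, the embedding dimension sets how many samples each pattern spans and therefore how many distinct patterns (m!) can occur.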

Why is there a recommended limit on the embedding dimension, i.e. on the number of samples the permutations are calculated from? And how can I find the optimal embedding dimension for calculating the permutation entropy of my time series?

I am not very familiar with this topic, so any explanation as to whether permutation entropy is a good choice in this context, or why my chosen procedure is the wrong approach, would also be very much appreciated.

