Confused by extremely high autocorrelation

Question

I have the following Python code:

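from random import randint
import numpy as np

step_size = 1.0  # assumed value; not defined in the original snippet
x = [100.0 for _ in range(1000)]
for i in range(1, len(x)):
    # move up or down by step_size with equal probability
    x[i] = x[i-1] + (2*randint(0, 1)-1)*step_size
print(np.corrcoef(x[:-1], x[1:]))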

I am getting extremely high correlation values, over 99%. More precisely:

[[ 1.          0.99785636]
 [ 0.99785636  1.        ]]

Does anyone know what's going on? How can I get such large autocorrelation?

You're simulating autocorrelated data. What is it about the fact that the result is autocorrelated that confuses you? — gung - Reinstate Monica, Nov 10 '15 at 20:59
I guess I don't get why the data is autocorrelated. I am making a random move on each step. Can you please elaborate on this? I don't know what makes this data autocorrelated. What would be an example of less autocorrelated data? — The Baron, Nov 10 '15 at 21:02
I plot the data, it looks like a straight line. Interesting, but still I lack deeper understanding as to what's going on. — The Baron, Nov 10 '15 at 21:07

score 1 · Accepted Answer · answered Nov 11 '15 at 01:22

Your data is autocorrelated because the value of $x_i$ depends on the value of $x_{i-1}$. Loosely speaking, that is what autocorrelation means: each value is correlated with the values that came before it. If you want non-autocorrelated data, you'd have to change your code to:

x[i] = (2*randint(0, 1)-1)*step_size

i.e., remove the x[i-1] term, so that each value is an independent draw of +step_size or -step_size.
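
To make the contrast concrete, here is a minimal sketch that computes the lag-1 correlation for both versions: the cumulative random walk from the question and the independent steps on their own. The value of step_size, the seed, and the helper name lag1_corr are assumptions for illustration and are not from the original post.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 1000
step_size = 1.0  # assumed value; the question does not give one

# Independent steps of +step_size or -step_size,
# the vectorised form of (2*randint(0, 1)-1)*step_size
steps = (2 * rng.integers(0, 2, size=n - 1) - 1) * step_size

# Random walk: start at 100 and accumulate the steps (the question's x)
walk = 100.0 + np.concatenate(([0.0], np.cumsum(steps)))

def lag1_corr(series):
    """Sample correlation between a series and itself shifted by one step."""
    return np.corrcoef(series[:-1], series[1:])[0, 1]

walk_corr = lag1_corr(walk)
steps_corr = lag1_corr(steps)
print(f"random walk lag-1 correlation:  {walk_corr:.4f}")
print(f"i.i.d. steps lag-1 correlation: {steps_corr:.4f}")
```

The walk should come out with a correlation near 1, while the independent steps should come out near 0, up to sampling noise.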

score 1 · Answer 2 · answered Nov 11 '15 at 01:23

You are simulating a random walk. The value in any given period equals the value in the previous period plus some random noise. As such, the theoretical correlation between successive periods is close to 1. Nothing unusual.
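
A short derivation shows where "close to 1" comes from. The notation is assumed for illustration and is not from the original answer: $\varepsilon_k$ are the independent steps with mean 0 and variance $\sigma^2$ (here $\sigma^2$ equals step_size squared), and $x_0$ is the fixed starting value.

$$
x_t = x_0 + \sum_{k=1}^{t} \varepsilon_k, \qquad \operatorname{Var}(x_t) = t\sigma^2, \qquad \operatorname{Cov}(x_t, x_{t+1}) = \operatorname{Cov}(x_t, x_t + \varepsilon_{t+1}) = t\sigma^2
$$

$$
\operatorname{Corr}(x_t, x_{t+1}) = \frac{t\sigma^2}{\sqrt{t\sigma^2 \cdot (t+1)\sigma^2}} = \sqrt{\frac{t}{t+1}}
$$

For $t = 999$ this gives roughly 0.9995. The value printed by np.corrcoef is a sample estimate computed along a single path, so it need not match this figure exactly, but it is high for the same reason.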

user94720
