1

I have the following Python code:

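from random import randint  # assumed: standard-library randint (both bounds inclusive)
import numpy as np
step_size = 1.0  # assumed value; not defined in the original snippet
x = [100.0 for _ in range(1000)]
for i in range(1, len(x)):
    # move up or down by step_size with equal probability
    x[i] = x[i-1] + (2*randint(0, 1)-1)*step_size
print(np.corrcoef(x[:-1], x[1:]))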

I am getting extremely high correlation values, over 99%. More precisely:

[[ 1.          0.99785636]
 [ 0.99785636  1.        ]]

Does anyone know what's going on? How can I be getting such a large autocorrelation?

The Baron
  • Hint: plot your simulated data. – whuber Nov 10 '15 at 20:55
  • You're simulating autocorrelated data. What is it about the fact that the result is autocorrelated that confuses you? – gung - Reinstate Monica Nov 10 '15 at 20:59
  • I guess I don't get why the data is autocorrelated. I am making a random move on each step. Can you please elaborate on this? I don't know what makes this data autocorrelated. What would be an example of less autocorrelated data? – The Baron Nov 10 '15 at 21:02
  • I plotted the data, and it looks like a straight line. Interesting, but I still lack a deeper understanding of what's going on. – The Baron Nov 10 '15 at 21:07

2 Answers

1

Your data is autocorrelated because the value of $x_i$ depends on the value of $x_{i-1}$. Loosely speaking, that is the definition of autocorrelation. If you want non-autocorrelated data, you'd have to change the assignment inside the loop to

x[i] = (2*randint(0, 1)-1)*step_size

i.e., remove x[i-1].
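
A minimal sketch of the difference, assuming NumPy's random `Generator` in place of `randint` and an illustrative `step_size` of 1.0 (the question does not give one). The helper `lag1_corr` is a hypothetical name introduced here; it computes the same quantity as the `np.corrcoef(x[:-1], x[1:])` call in the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for reproducibility
n = 1000
step_size = 1.0  # assumed value; the question does not define it

# Independent +/- step_size increments (what the modified line produces)
steps = step_size * (2 * rng.integers(0, 2, size=n) - 1)

# Random walk: 100.0 plus the running sum of the same increments
walk = 100.0 + np.cumsum(steps)


def lag1_corr(series):
    """Sample correlation between a series and itself shifted by one step."""
    return np.corrcoef(series[:-1], series[1:])[0, 1]


walk_corr = lag1_corr(walk)
steps_corr = lag1_corr(steps)
print("random walk lag-1 correlation:      ", walk_corr)
print("independent steps lag-1 correlation:", steps_corr)
```

The walk should show a lag-1 correlation near 1, while the independent steps should show one near 0; exact values depend on the seed.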

wmay
1

You are simulating a random walk: each value equals the previous value plus some random noise. As a result, the theoretical correlation between successive values is close to 1, so a sample correlation of about 0.998 is nothing unusual.
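
To make "close to 1" concrete, here is a short derivation sketch. It assumes the walk starts at a fixed $x_0$ and that the increments $\varepsilon_k$ are independent with mean 0 and variance $\sigma^2$ (in the question's code, $\sigma$ would equal step_size). The symbols $\varepsilon_k$ and $\sigma$ are introduced for illustration and do not appear in the question.

$$
x_t = x_0 + \sum_{k=1}^{t} \varepsilon_k, \qquad
\operatorname{Var}(x_t) = t\sigma^2, \qquad
\operatorname{Cov}(x_{t-1}, x_t) = \operatorname{Var}(x_{t-1}) = (t-1)\sigma^2
$$

$$
\operatorname{Corr}(x_{t-1}, x_t) = \frac{(t-1)\sigma^2}{\sqrt{(t-1)\sigma^2 \cdot t\sigma^2}} = \sqrt{\frac{t-1}{t}}
$$

For $t = 1000$ this gives $\sqrt{0.999} \approx 0.9995$. Note that this is the correlation across many independent realizations at a fixed $t$, whereas np.corrcoef(x[:-1], x[1:]) computes a sample correlation along a single path, so the two numbers need not match exactly. Both are close to 1.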

user94720