The formula for the Pearson correlation coefficient is:

$$r_{xy}=\frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sqrt{\sum_{i=1}^{n}(x_i-\bar x)^2}\sqrt{\sum_{i=1}^{n}(y_i-\bar y)^2}}$$

In time series analysis, why is the lag-$k$ autocorrelation not defined as

$$r_{k}=\frac{\sum_{t=k+1}^{n}(y_t-\bar y)(y_{t-k}-\bar y)}{\sqrt{\sum_{t=k+1}^{n}(y_t-\bar y)^2}\sqrt{\sum_{t=k+1}^{n}(y_{t-k}-\bar y)^2}}$$

Instead, the actual formula is:

$$r_{k}=\frac{\sum_{t=k+1}^{n}(y_t-\bar y)(y_{t-k}-\bar y)}{\sum_{t=1}^{n}(y_t-\bar y)^2}$$

The denominator confuses me. Why is it $\sum_{t=1}^{n}(y_t-\bar y)^2$ instead of $\sqrt{\sum_{t=k+1}^{n}(y_t-\bar y)^2}\sqrt{\sum_{t=k+1}^{n}(y_{t-k}-\bar y)^2}$?
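To make the two candidates concrete, here is a minimal Python sketch that evaluates both on a toy series. The function names `acf_standard` and `acf_pearson_style` and the example data are made up for illustration; nothing here comes from a particular library.

```python
from math import sqrt


def acf_standard(y, k):
    """Lag-k autocorrelation with the full-sample sum of squares in the denominator."""
    n = len(y)
    mean = sum(y) / n
    # Python lists are 0-indexed, so t = k..n-1 here corresponds to t = k+1..n in the formulas.
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


def acf_pearson_style(y, k):
    """Same numerator, but with the two partial square-root sums in the denominator."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    den_a = sum((y[t] - mean) ** 2 for t in range(k, n))
    den_b = sum((y[t - k] - mean) ** 2 for t in range(k, n))
    return num / (sqrt(den_a) * sqrt(den_b))


y = [1, 2, 3, 4, 5]  # toy data, chosen only so the arithmetic is easy to follow
print(acf_standard(y, 1))       # 4 / 10 = 0.4
print(acf_pearson_style(y, 1))  # 4 / 6, about 0.667
```

For this short series the two versions differ noticeably, because the partial sums drop one of the five squared deviations each.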

time
  • A detailed explanation is available at http://stats.stackexchange.com/questions/81754/understanding-this-acf-output/81764#81764. – whuber Apr 17 '15 at 16:14

2 Answers

In the analysis of stationary time series, you assume that the expected value, say $\mu$, is the same for every $t$, and likewise that the variance $\sigma^2 = \operatorname{Var}(y_t)$ does not depend on $t$. So there is only one variance, and $y_t$ and $y_{t-k}$ share it. You can look at the first pages here.
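Written out for the population quantities, this is how a single variance ends up in the denominator. The symbols $\rho_k$ and $\gamma_k$ (the lag-$k$ autocorrelation and autocovariance) are notation assumed here for illustration; they do not appear in the question.

$$\rho_k=\frac{\operatorname{Cov}(y_t,y_{t-k})}{\sqrt{\operatorname{Var}(y_t)}\sqrt{\operatorname{Var}(y_{t-k})}}=\frac{\gamma_k}{\sqrt{\sigma^2}\sqrt{\sigma^2}}=\frac{\gamma_k}{\sigma^2}$$

If $\gamma_k$ is estimated by $\frac{1}{n}\sum_{t=k+1}^{n}(y_t-\bar y)(y_{t-k}-\bar y)$ and $\sigma^2$ by $\frac{1}{n}\sum_{t=1}^{n}(y_t-\bar y)^2$, the factors $\frac{1}{n}$ cancel and the ratio is the $r_k$ from the question.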

Richi W

The actual formula is

$$r_{k}=\frac{\sum_{t=k+1}^{n}(y_t-\bar y^{(1)})(y_{t-k}-\bar y^{(2)})}{\sqrt{\sum_{t=k+1}^{n}(y_t-\bar y^{(1)})^2}\sqrt{\sum_{t=k+1}^{n}(y_{t-k}-\bar y^{(2)})^2}}$$

where

$$\bar y^{(1)}=\frac{\sum_{t=k+1}^{n}y_t}{n-k}\quad\text{and}\quad\bar y^{(2)}=\frac{\sum_{t=k+1}^{n}y_{t-k}}{n-k}$$

So, if $n$ is large enough, we can approximate both $\bar y^{(1)}$ and $\bar y^{(2)}$ by $\bar y=\frac{\sum_{t=1}^ny_t}{n}$ (law of large numbers: the average of a large sample moves towards the population mean).

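Hence,

$$r_{k}\simeq \frac{\sum_{t=k+1}^{n}(y_t-\bar y)(y_{t-k}-\bar y)}{\sqrt{\sum_{t=k+1}^{n}(y_t-\bar y)^2}\sqrt{\sum_{t=k+1}^{n}(y_{t-k}-\bar y)^2}}$$

Finally, a stationary series has the same variance at every $t$, so both square-root factors estimate the same quantity, and when $k$ is small relative to $n$ each partial sum is close to the full sum over $t=1,\dots,n$:

$$r_{k}\simeq \frac{\sum_{t=k+1}^{n}(y_t-\bar y)(y_{t-k}-\bar y)}{\sum_{t=1}^{n}(y_t-\bar y)^2}$$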
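The approximation can be checked numerically. The sketch below is only an illustration under assumptions not in the answer: it simulates an AR(1) series with coefficient 0.6 as a stand-in for a stationary process (the first value is drawn from a standard normal, so the start is only approximately stationary), and the helper names `simulate_ar1`, `pearson_lagged` and `acf` are hypothetical.

```python
import random
from math import sqrt


def simulate_ar1(n, phi=0.6, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal noise (assumed example process)."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


def pearson_lagged(y, k):
    """Exact Pearson correlation between (y_{k+1..n}) and (y_{1..n-k}), each with its own mean."""
    a = y[k:]            # y_t     for t = k+1..n
    b = y[:len(y) - k]   # y_{t-k} for t = k+1..n
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    den = sqrt(sum((u - mean_a) ** 2 for u in a)) * sqrt(sum((v - mean_b) ** 2 for v in b))
    return num / den


def acf(y, k):
    """Usual sample autocorrelation: one overall mean, full sum of squares in the denominator."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


# Compare the two estimators at lag 1 for a short and a long simulated series.
for n in (20, 5000):
    y = simulate_ar1(n)
    print(n, pearson_lagged(y, 1), acf(y, 1))
```

For the long series the two values should nearly coincide, while for short series they can differ visibly, which is the situation raised in the comments about small $n$.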

Hemant Rupani
  • But when $n$ is not large enough, say $n<30$, will the denominator have the same variance? – time Apr 17 '15 at 09:48
  • If the time series is stationary, then yes... – Hemant Rupani Apr 17 '15 at 09:51
  • Otherwise you might multiply $r$ by $\frac{n}{n-k}$ – Hemant Rupani Apr 17 '15 at 09:52
  • Under stationarity, the variance does not depend on $t$, but it depends on the lag. It seems to me that $\sum_{t=1}^{n}$ and $\sum_{t=k+1}^{n}$ are concerned with lag rather than time. I couldn't catch the point about stationary time series. Could you please explain why the variance will be the same if the time series is stationary? – time Apr 17 '15 at 09:59
  • For a stationary series, the variance is constant... It is the covariance that depends on the lag – Hemant Rupani Apr 17 '15 at 10:02
  • If my lag is $k=2$, then in $\sum_{t=1}^{n-k}y_{t-k}$ the first term is $y_{-1}$. But I have no data for $y_{-1}$; my data start from $y_1$. If I put $y_{-1}=0$, I am still counting $y_{-1}$, and in the denominator I am adding 1 for this $y_{-1}$. But why should I count $y_{-1}$ when it does not exist? – time Apr 18 '15 at 09:28
  • Are you asking about $\bar y^{(2)}$? I edited it; it was a typo – Hemant Rupani Apr 18 '15 at 09:46
  • If $\bar y^{(2)}$ takes data point $y_1$, then $\bar y^{(1)}$ takes data point $y_{k+1}$. So I think in $\bar y^{(1)}$ the sum over $t$ goes from $k+1$ to $n$. – time Apr 18 '15 at 09:53
  • ohhh Yes! Now cleared. – Hemant Rupani Apr 18 '15 at 09:58
  • But what about $\bar y^{(2)}$? Doesn't it also need $\sum_{k+1}^{n}$? – time Apr 18 '15 at 10:17
  • Apologies!!! :) – Hemant Rupani Apr 18 '15 at 10:21