
I have read some articles about the partial autocorrelation of time series and I have to admit that I do not really understand the difference from ordinary autocorrelation. It is often stated that the partial autocorrelation between $y_t$ and $y_{t-k}$ is the correlation between $y_t$ and $y_{t-k}$ with the influence of the variables between $y_t$ and $y_{t-k}$ removed. I do not understand this. If we calculate the correlation between $y_t$ and $y_{t-k}$, then the variables in between are not considered at all when you use the correlation coefficient, because the correlation coefficient involves only two variables, as far as I know.

This really confuses me. I hope you can help me with this. I would appreciate any comment and would be thankful for your help.

Update: Can anyone explain how one would calculate the autocorrelation and partial autocorrelation of a time series? I understood how to do this for a sample of separate variables, but not for a time series (because you need three variables, according to the example here: https://en.wikipedia.org/wiki/Partial_correlation). Do you know any example where this is done?

PeterBe
  • Check this question/answer: https://stats.stackexchange.com/questions/472426/help-understanding-autocorrelation-and-partial-autorcorrelation-pacf/472429#472429 – Ale Aug 17 '20 at 14:32
  • Thanks Ale for the comment and the link. Unfortunately it does not help. I still have a fundamental comprehension problem with partial autocorrelation. Further I do not understand the specific answer given in that post. – PeterBe Aug 17 '20 at 16:30

2 Answers


For a moment, forget about time stamps. Consider three variables: $X, Y, Z$.

Let's say $Z$ has a direct influence on the variable $X$. You can think of $Z$ as some economic parameter in the US that influences some other economic parameter $X$ of China.

Now it may be that a parameter $Y$ (some parameter in England) is also directly influenced by $Z$. But there is an independent relationship between $X$ and $Y$ as well. By independence here I mean that this relationship is independent of $Z$.

So you see when $Z$ changes, $X$ changes because of the direct relationship between $X$ and $Z$, and also because $Z$ changes $Y$ which in turn changes $X$. So $X$ changes because of two reasons.

Now read this with $Z=y_{t-h}, \ \ Y=y_{t-h+\tau}$ and $X=y_t$ (where $h>\tau$).

Autocorrelation between $X$ and $Z$ will take into account all changes in $X$ whether coming from $Z$ directly or through $Y$.

Partial autocorrelation removes the indirect impact of $Z$ on $X$ coming through $Y$.

How is this done? That is explained in the other answer to your question.
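To make the "removing the indirect impact" step concrete, here is a small numpy sketch. This is my own illustration with made-up coefficients, not part of the original answer: $Z$ drives $X$ both directly and through $Y$, and partialling out $Y$ (regressing both $X$ and $Z$ on $Y$ and correlating the residuals) isolates the direct part.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate the three-variable setup from the answer (coefficients are arbitrary):
# Z drives both X and Y, and Y has an additional direct effect on X.
n = 5000
Z = rng.normal(size=n)
Y = 0.8 * Z + rng.normal(size=n)             # Y influenced by Z
X = 0.5 * Z + 0.6 * Y + rng.normal(size=n)   # X influenced by Z directly and via Y

# The plain correlation between X and Z mixes both pathways.
plain = np.corrcoef(X, Z)[0, 1]

def residual(a, b):
    """Residuals of regressing a on b: the part of a not explained by b."""
    slope = np.cov(a, b)[0, 1] / np.var(b, ddof=1)
    return a - slope * b

# Partial correlation of X and Z given Y: correlate the residuals.
partial = np.corrcoef(residual(X, Y), residual(Z, Y))[0, 1]
```

The partial correlation comes out noticeably smaller than the plain correlation here, because the plain correlation also picks up the indirect $Z \to Y \to X$ pathway.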

Dayne
  • Thanks Dayne for your answer. Now I understand the basic idea behind partial autocorrelation due to your great explanation. However, I still do not know how to calculate the PACF. For ordinary autocorrelation I would just use the Pearson correlation coefficient. Is there a similar formula or (partial) correlation coefficient for the PACF? As stated before, I do not like the other answer given above and it is not useful for me (too many unexplained variables and statements) – PeterBe Oct 12 '20 at 07:57
  • Although my question has still not been fully answered, I awarded the bounty to you because of your great explanation, and at least I understand the concept behind it now. – PeterBe Oct 12 '20 at 07:58
  • Great that you liked the answer. About the calculation part: the math of it is obviously complicated and requires a lot of qualifiers, particularly related to stationarity. Let me try to give a geometric interpretation in the next comment. – Dayne Oct 12 '20 at 11:42
  • Consider three vectors X, Y, Z in an N-dimensional space. Each observation of X forms one dimension. The length of each vector represents its standard deviation. The dot product is the covariance, and the cosine of the angle between two vectors is the correlation. Now, if you remember, the dot product projects one vector onto another, so it can be used to break a vector into two parts: one parallel to the vector with which the dot product is taken, and one orthogonal/perpendicular to it. So if you subtract from X the projection of X on Y, it becomes perpendicular to Y. – Dayne Oct 12 '20 at 11:53
  • Do the same with Z: project Z on Y and subtract the projection from Z to get the part of Z perpendicular to Y. The angle between these new X and Z is the partial correlation. – Dayne Oct 12 '20 at 11:54
  • Thanks Dayne for your explanation. I have to admit that unfortunately I do not understand it. Do you know any website that explains how to calculate the PACF in a good way? – PeterBe Oct 12 '20 at 12:45
  • Does anyone know a website that comprehensively explains how to calculate the PACF? I'd really appreciate it, as I am struggling with this one. – PeterBe Oct 13 '20 at 16:43
  • @PeterBe: may I suggest first focussing on partial correlation (instead of partial autocorrelation)? The reason is that when it comes to time series, any good text will first go into the details of stationarity, which can be rather confusing if the objective is only to learn the concept. Partial autocorrelation is the same concept applied to time series under the relevant assumptions. You can first try the [wiki page](https://en.wikipedia.org/wiki/Partial_correlation). It has several methods and also the geometric interpretation that I gave above. – Dayne Oct 13 '20 at 17:00
  • Thanks Dayne for your comment and effort. I really appreciate it. Yes, you are right, stationarity is extremely confusing for me (I have asked a question https://stats.stackexchange.com/questions/491785/can-stationary-time-series-contain-regulary-cycles-and-periods-with-different-fl , and more are surely to come). Did I understand correctly that stationarity is the same concept as partial autocorrelation? To me they look different. – PeterBe Oct 14 '20 at 07:22
  • I meant that conceptually partial correlation is the same as partial autocorrelation. Stationarity is a different monster altogether. Studying partial correlation doesn't require stationarity, so it will be easier to understand. – Dayne Oct 14 '20 at 07:26
  • Thanks Dayne for your comment. I read about partial correlation and how to compute it using linear regression and calculating the residuals. However, I do not see how I can transfer this to the PACF of a time series. For partial correlation you need three variables X, Y and Z. In a univariate time series we only have one variable X, and the second variable is just past values of X. Two questions arise for me: 1) What should be chosen as the third controlling variable Z in a univariate time series? 2) Which value should the lag d have for X to calculate the autocorrelation between X(t) and X(t-d)? – PeterBe Oct 14 '20 at 07:46
  • 1. We do not have one random variable in a time series. We consider each $X_t$ as a separate random variable because they can each have different distribution. Now the problem is if so, for a given *realized* time series we have only one observation so as such correlation between $X_t$ and $X_{t-d}$ cannot be computed. This is exactly where stationarity comes into play. Under stationarity, the autocorrelation between $X_t$ and $X_{t-d}$ will depend only on $d$. – Dayne Oct 14 '20 at 08:10
  • So now the set of all observation pairs with gap $d$ in their time stamps becomes our sample for calculating the autocorrelation. This is why the autocorrelation is a function of this gap, i.e., $d$. The third controlling variable is all the $X_t$s in between $X_t$ and $X_{t-d}$. Hope this clarifies a few things. – Dayne Oct 14 '20 at 08:11
  • Thanks Dayne for your answer again. I thought that for stationarity the variance and autocorrelation are equal for the whole time series. So did I misunderstand something? Further, how can we calculate the autocorrelation of those points? Let's say d is 10; then we have 10 data points. As stated in the Wiki article you linked to, you have to calculate two linear regression models. What are the 3 variables? I do not understand your answer; for me, we only have X_t as the one variable. – PeterBe Oct 14 '20 at 08:19
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/115085/discussion-between-dayne-and-peterbe). – Dayne Oct 14 '20 at 08:43
  • Thanks Dayne for your comments in the chat. What do you mean by "We do regression of Y=A+B+C+D....+L"? In the wiki it is stated that we have to do an X-->Z and a Y-->Z regression (with X, Y, Z being the variables and Z being the controlled variable). So why do we not just do an A-->Y and a B-->Y regression and then calculate the correlation between the errors of the two regressions? – PeterBe Oct 15 '20 at 08:10
  • Hi Dayne, I carefully read all your comments in the chat (and spent a huge amount of time doing that) and I have further questions to some specific statements from you that you can see in the chat. I'd be really happy if you could answer or comment on them. – PeterBe Oct 16 '20 at 07:14
  • You wrote (Wed 10:49): "All these become my controlling variables in the regression the last regressor is X1 to X90 the coefficient of which gives us partial autocorrelation" --> Why does the last regressor give us the partial autocorrelation, and which lag in the PACF does this calculate? Maybe lag 10? You also wrote (Wed 10:59): "Now if we do another regression in which dependent variable is X11 to X100 and many independent variables such as X1-X90; X2 to X91; X3 to X92;....X10-X99" --> How many independent variables do we need, and what does this number depend on? – PeterBe Oct 19 '20 at 07:02
  • I'd highly appreciate any further comments, as I am struggling with this one. I also rewarded you with the bounty, so I think it is fair to continue helping me. – PeterBe Oct 20 '20 at 08:52
  • Would you (or someone else) mind telling me how to calculate the PACF of a time series and showing me an example? – PeterBe Oct 20 '20 at 16:59
  • Maybe we can continue the discussion in chat? – PeterBe Oct 21 '20 at 10:26
  • Hi Dayne. It is me again. I had a closer look at your explanation of partial correlation with the variables X,Y and Z given in your answer and I have to admit that I do not understand it. You said that X and Y are independent. Further you state that if Z changes X changes because of the direct effect of Z (which I agree upon) and because Y changes. But you said that X and Y are independent. So why should X change when Y changes? – PeterBe Dec 11 '20 at 10:32
  • 2
    @PeterBe: i think my language was a bit confusing. I meant that there is a relationship between X and Y which is independent of the relationship between Y and Z and X and Z. Hope it makes sense now. – Dayne Dec 11 '20 at 11:57
  • Ah okay. Now it makes sense. Thanks for your fast response. I really appreciate it. – PeterBe Dec 11 '20 at 13:17
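The projection picture from the comments above can be written out directly in numpy. This is my own sketch on arbitrary simulated data, not part of the thread: subtract from X and Z their projections onto Y, then take the cosine of the angle between the leftover (orthogonal) parts, and cross-check it against the standard formula for the partial correlation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Three "vectors" of N observations, as in the geometric picture
# (the dependence structure is arbitrary, chosen just for illustration).
N = 4000
Y = rng.normal(size=N)
Z = Y + rng.normal(size=N)
X = Y + Z + rng.normal(size=N)

def perp(a, b):
    """Part of a orthogonal to b: subtract the projection of a onto b."""
    a = a - a.mean()
    b = b - b.mean()
    return a - (a @ b) / (b @ b) * b

# Cosine of the angle between the components of X and Z perpendicular to Y
# = partial correlation of X and Z given Y.
x_perp, z_perp = perp(X, Y), perp(Z, Y)
partial_geom = (x_perp @ z_perp) / (np.linalg.norm(x_perp) * np.linalg.norm(z_perp))

# Cross-check against the textbook formula for first-order partial correlation.
rxz = np.corrcoef(X, Z)[0, 1]
rxy = np.corrcoef(X, Y)[0, 1]
ryz = np.corrcoef(Y, Z)[0, 1]
partial_formula = (rxz - rxy * ryz) / np.sqrt((1 - rxy**2) * (1 - ryz**2))
```

The two numbers agree up to floating-point error, which is the point of the geometric interpretation: the projection construction and the algebraic formula are the same computation.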

The difference between (sample) ACF and PACF is easy to see from the linear regression perspective.

To get the sample ACF $\hat{\gamma}_h$ at lag $h$, you fit the linear regression model $$ y_t = \alpha + \beta y_{t-h} + u_t $$ and the resulting $\hat{\beta}$ is $\hat{\gamma}_h$. Because of (weak) stationarity, the estimate $\hat{\beta}$ is the sample correlation between $y_t$ and $y_{t-h}$. (There are some trivial differences between how sample moments are computed between time series and linear regression contexts, but they are negligible when sample size is large.)
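A quick numerical check of this equivalence (my own sketch on a simulated AR(1) series, not something from the answer): the OLS slope from regressing $y_t$ on $y_{t-h}$ essentially coincides with the sample correlation at lag $h$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate a stationary AR(1) series: y_t = 0.6 * y_{t-1} + eps_t.
n = 20000
eps = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + eps[t]

h = 2
yt, ylag = y[h:], y[:-h]

# OLS regression of y_t on a constant and y_{t-h}: the slope is the lag-h ACF.
design = np.column_stack([np.ones(n - h), ylag])
slope = np.linalg.lstsq(design, yt, rcond=None)[0][1]

# Plain sample correlation between y_t and y_{t-h}.
acf_h = np.corrcoef(yt, ylag)[0, 1]
```

For an AR(1) with coefficient 0.6, the population ACF at lag 2 is $0.6^2 = 0.36$, and both estimates land close to it; they differ from each other only by the trivial sample-moment discrepancies mentioned above.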

To get the sample PACF $\hat{\rho}_h$ at lag $h$, you fit the linear regression model $$ y_t = \alpha + \, ? y_{t-1} + \cdots + \, ? y_{t-h + 1} + \beta y_{t-h} + u_t $$ and the resulting $\hat{\beta}$ is $\hat{\rho}_h$. So $\hat{\rho}_h$ is the "correlation between $y_t$ and $y_{t-h}$ after controlling for the intermediate elements."
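The "controlling for the intermediate elements" regression can be sketched the same way (again my own illustration on a simulated AR(1), where the true PACF at lag 2 is zero while the true ACF at lag 2 is $0.6^2 = 0.36$):

```python
import numpy as np

rng = np.random.default_rng(3)

# AR(1) series: only lag 1 matters directly.
n = 20000
eps = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + eps[t]

h = 2
# Regress y_t on ALL lags 1..h; the coefficient on y_{t-h} is the lag-h PACF.
design = np.column_stack(
    [np.ones(n - h)] + [y[h - k : n - k] for k in range(1, h + 1)]
)
beta = np.linalg.lstsq(design, y[h:], rcond=None)[0]

pacf_h = beta[-1]                          # coefficient on y_{t-2}: near 0
acf_h = np.corrcoef(y[h:], y[:-h])[0, 1]   # plain lag-2 ACF: near 0.36
```

Once $y_{t-1}$ is in the regression, $y_{t-2}$ has nothing left to explain, so its coefficient (the PACF at lag 2) collapses to zero even though the plain lag-2 correlation is clearly nonzero.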

The same discussion applies verbatim to the difference between the population ACF and PACF; just replace sample regressions by population regressions. For a stationary AR(p) process, you'll find the PACF to be zero for lags $h > p$. This is not surprising: the process is specified by a linear regression $$ y_t = \phi_0 + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \epsilon_t. $$
If you add a regressor (say $y_{t-p-1}$) on the right-hand side that is uncorrelated with the error term $\epsilon_t$, the resulting coefficient (the PACF at lag $p+1$ in this case) would be zero.
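A simulation sketch of this cutoff (my own illustration, assuming an AR(2) with $\phi_1 = 0.5$ and $\phi_2 = 0.3$, values chosen only to satisfy stationarity):

```python
import numpy as np

rng = np.random.default_rng(4)

# AR(2): y_t = 0.5 * y_{t-1} + 0.3 * y_{t-2} + eps_t (stationary).
n = 30000
eps = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + eps[t]

def pacf_reg(y, h):
    """Lag-h PACF as the last coefficient in a regression on lags 1..h."""
    n = len(y)
    design = np.column_stack(
        [np.ones(n - h)] + [y[h - k : n - k] for k in range(1, h + 1)]
    )
    return np.linalg.lstsq(design, y[h:], rcond=None)[0][-1]

pacf2 = pacf_reg(y, 2)   # close to phi_2 = 0.3
pacf3 = pacf_reg(y, 3)   # close to 0: the PACF cuts off after lag p = 2
```

At lag $p$ the PACF recovers $\phi_p$ itself, and beyond lag $p$ the extra regressor is redundant, so its coefficient is (sampling noise around) zero.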

Michael
  • 2
    Thanks Michael for your answer. Unfortunately there are many thing that I do not understand: 1) What do you mean by 'the resulting Beta is y^_h in the ACF formular? 2) What does '?' mean in the PACF formular 3) PACF formular: how and why do we do the "controlling for the intermediate elements"? 4) Why shall one add an regressor on the right-hand side of the last equation. 5) What is the difference between population and sample regression? I'd appreciate further comments as I am still quite confused and do not undestand the difference between autocorrelation and partial autorcorrelation – PeterBe Aug 18 '20 at 08:29
  • Would you mind answering my follow up questions because I honestly did not understand your answer. I'd really appreciate it and it would help me a lot. – PeterBe Aug 19 '20 at 08:04
  • ??????????????????????? – PeterBe Aug 20 '20 at 08:09
  • 3
    Thanks for the answer Michael. I think the difference is not explained in a good way in your answer from a conceptual point of view for someone who has no experience with these terms. Because of this it is not useful for me. Still I appreciate your effort. – PeterBe Sep 28 '20 at 09:34
  • If someone other than Michael can answer the questions from PeterBe, I'd appreciate it as well. – FLonLon Feb 02 '22 at 10:13