
It may be a weird question, but as a novice to the subject I am wondering: why do we use regression to detrend a time series, if one of regression's assumptions is that the data should be i.i.d., while the data to which the regression is applied are not i.i.d.?

FarrukhJ
  • It is not generally true that we make the assumption that the "data" is i.i.d. – Christoph Hanck Feb 10 '17 at 05:36
  • What do you mean precisely by *detrend*? – Matthew Gunn Feb 10 '17 at 08:01
  • I don't have the time to write a proper answer/document this, but in general serial correlation does not *bias* the results of a linear regression (it alters the appropriate computation of the standard errors, confidence intervals, etc.). This makes the classic two-stage approach (detrend, then analyze for correlation) sensible. (e.g. some googling of "serial correlation linear regression unbiased" leads to http://fmwww.bc.edu/ec-c/f2010/228/EC228.f2010.nn12.pdf ) – Ben Bolker Feb 10 '17 at 14:21
  • Perhaps more importantly, the OLS estimator of the coefficient on a linear trend converges a whole order of magnitude faster (at a rate $n^{-3/2}$) to its true value than for stationary regressors ($n^{-1/2}$), which means you can consistently estimate the trend even if you neglect the stationary variables. This is in contrast to estimating the effects of stationary variables one by one, where you lose consistency if you omit variables. – Richard Hardy Feb 10 '17 at 15:13
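
To see the last comment's point in action, here is a minimal simulation sketch (the true slope of 0.5, the sample sizes, and the replication count are arbitrary choices, not from the comment's author): the error of the OLS slope on a deterministic time trend shrinks roughly like $n^{-3/2}$, while the error of the slope on a stationary regressor shrinks only like $n^{-1/2}$.

```python
import numpy as np

rng = np.random.default_rng(3)


def mean_abs_errors(n, reps, rng):
    """Average |slope error| for a time-trend regressor vs. a stationary regressor."""
    t = np.arange(1, n + 1, dtype=float)
    trend_err, stat_err = [], []
    for _ in range(reps):
        eps = rng.standard_normal(n)
        x = rng.standard_normal(n)          # stationary regressor
        y_trend = 0.5 * t + eps             # true trend slope 0.5
        y_stat = 0.5 * x + eps              # true coefficient 0.5
        # OLS slope with intercept: cov(regressor, y) / var(regressor)
        trend_err.append(abs(np.cov(t, y_trend)[0, 1] / np.var(t, ddof=1) - 0.5))
        stat_err.append(abs(np.cov(x, y_stat)[0, 1] / np.var(x, ddof=1) - 0.5))
    return np.mean(trend_err), np.mean(stat_err)


for n in [100, 400, 1600]:
    print(n, mean_abs_errors(n, 500, rng))
# Quadrupling n shrinks the trend-slope error by roughly a factor of 4^{3/2} = 8,
# but the stationary-regressor error only by roughly 4^{1/2} = 2.
```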

3 Answers


You're astute in sensing that there may be conflict between classical assumptions of ordinary least squares linear regression and the serial dependence commonly found in the time series setting.

Consider Assumption 1.2 (Strict Exogeneity) of Fumio Hayashi's *Econometrics*:

$$ \mathrm{E}[\epsilon_i \mid X] = 0 $$

This in turn implies $\mathrm{E}[\epsilon_i \mathbf{x}_j] = \mathbf{0}$, i.e. that any error term $\epsilon_i$ is orthogonal to any regressor $\mathbf{x}_j$. As Hayashi points out, this assumption is violated in even the simplest autoregressive model.[1] Consider the AR(1) process:

$$y_{t} = \beta y_{t-1} + \epsilon_t$$

We can see that $y_t$ will be a regressor for $y_{t+1}$, but $\epsilon_t$ isn't orthogonal to $y_t$ (i.e. $\mathrm{E}[\epsilon_ty_t]\neq0$).
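
To see this explicitly, assume (as usual for this model) that $\epsilon_t$ has variance $\sigma^2 > 0$ and is uncorrelated with the earlier value $y_{t-1}$. Then

$$ \mathrm{E}[\epsilon_t y_t] = \mathrm{E}[\epsilon_t(\beta y_{t-1} + \epsilon_t)] = \beta\,\mathrm{E}[\epsilon_t y_{t-1}] + \mathrm{E}[\epsilon_t^2] = \sigma^2 \neq 0 $$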

Since the strict exogeneity assumption is violated, none of the arguments that rely on that assumption can be applied to this simple AR(1) model!

So we have an intractable problem?

No, we don't! Estimating AR(1) models with ordinary least squares is entirely valid, standard practice. Why is it still OK?

Large-sample, asymptotic arguments don't need strict exogeneity. A sufficient assumption (that can be used instead of strict exogeneity) is that the regressors are predetermined, i.e. that each regressor is orthogonal to the contemporaneous error term. See Hayashi Chapter 2 for a full argument.[2]
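
As a quick illustration (a minimal simulation sketch, not taken from Hayashi; the coefficient $\beta = 0.7$, the sample sizes, and the number of replications are arbitrary choices), OLS applied to simulated AR(1) data recovers the true coefficient increasingly well as the sample grows, even though the finite-sample arguments that rely on strict exogeneity do not apply:

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 0.7  # hypothetical true AR(1) coefficient, |beta| < 1


def simulate_ar1(n, beta, rng):
    """Simulate y_t = beta * y_{t-1} + eps_t with standard normal errors."""
    y = np.zeros(n)
    eps = rng.standard_normal(n)
    for t in range(1, n):
        y[t] = beta * y[t - 1] + eps[t]
    return y


for n in [50, 500, 5000]:
    estimates = []
    for _ in range(1000):
        y = simulate_ar1(n, beta, rng)
        x, target = y[:-1], y[1:]                            # regress y_t on y_{t-1}
        estimates.append(np.dot(x, target) / np.dot(x, x))   # OLS slope, no intercept
    print(n, round(float(np.mean(estimates)), 3))            # mean estimate approaches 0.7
```

In this simulation the small-sample estimates come out slightly below 0.7 (exact unbiasedness is what the violated assumption would have delivered), but the estimator is consistent, which is what the asymptotic argument guarantees.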

References

[1] Fumio Hayashi, Econometrics (2000), p. 35

[2] ibid., p. 134

Matthew Gunn

Basic least-squares type regression methods don't assume that the y-values are i.i.d. They assume that the residuals (i.e. y-value minus true trend) are i.i.d.

Other methods of regression exist which make different assumptions, but that'd probably be over-complicating this answer.
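
As a minimal sketch of that distinction (simulated data with a hypothetical linear trend plus iid noise; the trend coefficients and noise scale are made up, and NumPy's polyfit stands in for whatever least-squares routine you prefer):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(200)
y = 2.0 + 0.5 * t + rng.normal(scale=3.0, size=t.size)  # trend + iid noise

# The y-values are not identically distributed: their mean drifts with t.
# The residuals around the fitted trend, however, are iid here by construction.
slope, intercept = np.polyfit(t, y, 1)
residuals = y - (intercept + slope * t)

# Lag-1 autocorrelation of the residuals should be near zero for iid errors.
lag1 = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]
print(round(slope, 3), round(intercept, 3), round(lag1, 3))
```

On real series the residuals are often still autocorrelated, as the comment below points out, and that is exactly when this basic assumption breaks down.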

Geoffrey Brent
  • An assumption which is also clearly false: just think of a time series with both a linear trend and seasonality. The remaining residuals from linear regression are clearly correlated, thus not iid. – DeltaIV Feb 10 '17 at 07:13

It's a good question! The issue is not even mentioned in my time series books (I probably need better books :) ). First of all, note that you're not forced to use linear regression to detrend a time series: if the series has a stochastic trend (unit root), you can simply take the first difference. But you do have to use linear regression if the series has a deterministic trend. In this case it's true that the residuals are not iid, as you say. Just think of a series which has a linear trend, seasonal components, cyclic components, etc. all together: after linear regression the residuals are anything but independent.

The point is that you're not then using linear regression to make predictions or to form prediction intervals. It's just one step in your inference procedure: you still need to apply other methods to arrive at uncorrelated residuals. So, while linear regression per se is not a valid inference procedure (it is not the correct statistical model) for most time series, a procedure which includes linear regression as one of its steps may well be valid, if the model it assumes corresponds to the data generating process of the time series.
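
To make the two routes concrete, here is a minimal sketch (simulated series with arbitrary parameters; not a full inference procedure) of differencing away a stochastic trend versus regressing out a deterministic one:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
t = np.arange(n)

# Stochastic trend: a random walk y_t = y_{t-1} + eps_t (unit root, beta_1 = 1).
walk = np.cumsum(rng.standard_normal(n))
detrended_walk = np.diff(walk)  # first differencing removes the stochastic trend

# Deterministic trend: y_t = 1 + 0.3 * t + eps_t.
determ = 1.0 + 0.3 * t + rng.standard_normal(n)
slope, intercept = np.polyfit(t, determ, 1)
detrended_determ = determ - (intercept + slope * t)  # regression residuals

# Both detrended series now fluctuate around zero with roughly unit variance.
print(round(float(detrended_walk.var()), 2), round(float(detrended_determ.var()), 2))
```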

DeltaIV
  • Don't differentiate if you have a deterministic trend – differentiation is only appropriate for stochastic trends (unit roots). If you differentiate a series without a unit root, you will introduce integrated moving average type of errors in the model, and that is nasty. – Richard Hardy Feb 10 '17 at 10:20
  • I think you mean difference, not differentiate. – Hong Ooi Feb 10 '17 at 11:03
  • @RichardHardy interesting. What do you mean by "stochastic trend"? Do you mean cycles? Would $y_t=\beta_0+\beta_1 y_{t-1}+\epsilon_t$ have a stochastic or deterministic trend according to your definition? – DeltaIV Feb 10 '17 at 12:30
  • @HongOoi, yes, my bad, I meant differencing, not differentiation. DeltaIV, a time series is said to have a stochastic trend if the time series is an integrated (=unit-root) process. This is a standard term in unit-root and cointegration literature. I wonder if it has different meanings in other strands of literature. In any case, over-differencing (=differencing a time series that does not have a unit root) is a notorious phenomenon, and it should be avoided. – Richard Hardy Feb 10 '17 at 12:38
  • @RichardHardy ok, thanks. I will try to document myself about the definition of integrated process and unit roots. As a start, can you tell me if the series I proposed is integrated or not? Are the roots you refer to the roots of the polynomial $y=\beta_0+\beta_1 x_1$? – DeltaIV Feb 10 '17 at 12:52
  • $y_t=\beta_0+\beta_1y_{t-1}+\varepsilon_t$ will have a stochastic trend – and will be an integrated time series – if $\beta_1=1$. If $-1<\beta_1<1$, it does not have a stochastic trend and should not be differenced. – Richard Hardy Feb 10 '17 at 12:55
  • @RichardHardy very interesting - I may have to modify my answer a bit. So, if $\beta_1=1$, both differencing and linear regression would be applicable. If $|\beta_1|<1$ then differencing should be avoided, but I guess linear regression could still be used to estimate $\beta_0$ and $\beta_1$, right? See Matthew Gunn's answer. – DeltaIV Feb 10 '17 at 18:29
  • When $\beta_1=1$ another assumption of OLS is violated as $\text{Var}(y)=\infty$, but that is another story. There are threads about estimating AR(1) models with and without unit roots, I remember having started one of them. – Richard Hardy Feb 10 '17 at 21:05