I have time series data consisting of 5-period rates sampled at 1-period intervals. Essentially $$r_i = x_{i+5} - x_i ; ~~ i=1,...,N-5$$ so consecutive observations share four of their five periods. This creates an overlapping-data problem for regression. As far as I understand, OLS estimates are unbiased but inefficient in this case, and we have methods like GLS or Newey-West to correct for this.
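To make the overlap concrete, here is a minimal sketch, assuming (purely for illustration) that $x$ is a random walk with i.i.d. increments. Under that assumption each $r_i$ is a sum of 5 increments, so the autocorrelation of $r$ at lag $k$ is $(5-k)/5$ for $k<5$ and zero afterwards. The helper names `overlapping_changes` and `autocorr` are hypothetical, not from any library.

```python
import numpy as np

def overlapping_changes(x, h=5):
    """Return r_i = x_{i+h} - x_i for every i where both terms exist."""
    x = np.asarray(x, dtype=float)
    return x[h:] - x[:-h]

def autocorr(r, lag):
    """Sample autocorrelation of r at a positive lag."""
    r = np.asarray(r, dtype=float) - np.mean(r)
    return float(np.dot(r[:-lag], r[lag:]) / np.dot(r, r))

rng = np.random.default_rng(0)                 # arbitrary fixed seed
x = np.cumsum(rng.standard_normal(200_000))    # ASSUMPTION: x is a random walk
r = overlapping_changes(x, h=5)

for lag in range(1, 8):
    theory = max(0.0, (5 - lag) / 5)           # (h - lag) / h under i.i.d. increments
    print(lag, round(autocorr(r, lag), 3), theory)
```

The serial correlation here comes entirely from the overlap, not from any dependence in the underlying increments. That is why the usual i.i.d.-error formulas are unreliable for a regression with $r_i$ on the left-hand side.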

Are there similar properties of the correlation coefficient as well?
Are there good ways to estimate the coefficient other than simply bootstrapping?

I am facing this primarily in a learning scenario, and would rather not go for Newey-West-like estimators.
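One way to explore the correlation question without a bootstrap is a small Monte Carlo experiment. The sketch below is only an illustration under strong assumptions that are not in the question: both $r$ and the second variable $v$ are 5-period overlapping changes of random walks whose one-period increments have a known correlation `rho`. It compares the sample correlation computed on all overlapping windows with the one computed on every 5th window (no overlap). The function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_pair(n, rho, h, rng):
    """ASSUMPTION: two random walks whose one-period increments have correlation rho."""
    e = rng.standard_normal(n)
    u = rho * e + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    x, y = np.cumsum(e), np.cumsum(u)
    return x[h:] - x[:-h], y[h:] - y[:-h]      # overlapping h-period changes

def corr(a, b):
    """Pearson sample correlation of two equal-length arrays."""
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(1)                 # arbitrary fixed seed
rho, h, n, reps = 0.5, 5, 500, 2000            # illustrative values only
overlap, thinned = [], []
for _ in range(reps):
    r, v = simulate_pair(n, rho, h, rng)
    overlap.append(corr(r, v))                 # uses every window
    thinned.append(corr(r[::h], v[::h]))       # every h-th window: no overlap

# Standard error one would compute if the overlapping windows were independent
naive_sd = (1 - rho**2) / np.sqrt(n - h)

print("mean  overlapping:", np.mean(overlap), " non-overlapping:", np.mean(thinned))
print("sd    overlapping:", np.std(overlap), " non-overlapping:", np.std(thinned))
print("naive sd treating overlapping windows as independent:", naive_sd)
```

In this particular setup both estimates would be expected to centre near `rho`, so the overlap does not obviously bias the point estimate. What should differ is the spread: the overlapping estimate varies more across replications than the independent-sample formula suggests. Whether the same holds for real data depends on how $v_i$ is constructed, which the simulation does not address.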

Richard Hardy
ssj3892414
  • Welcome to CV! I corrected the formatting of your question. In your questions and answers you can use TeX formatting enclosed in dollar signs (`$`) on both sides of the equation. This makes a formula more readable. – Tim Dec 22 '14 at 14:26
  • Do you mean $x_{i+5}$? The current formula will only leave a constant of 5. – Penguin_Knight Dec 22 '14 at 14:27
  • Strictly speaking, using a Newey-West type covariance matrix does not cure the inefficiency of the OLS estimators. Using an appropriate covariance matrix means we acknowledge that they are inefficient, that's it, but nothing is done to remove the inefficiency. (I hope I am not mistaken.) GLS might be a better solution because it alters the point estimates to gain efficiency. – Richard Hardy Dec 22 '14 at 20:37
  • Could you elaborate on your questions? What do you mean by *the correlation coefficient*? Is it $R^2$? What kind of bootstrapping do you have in mind? – Richard Hardy Dec 22 '14 at 20:44
  • @Tim Thanks. I'll keep this formatting thing in mind. – ssj3892414 Dec 22 '14 at 21:47
  • @Richard By correlation coefficient I mean the correlation. I seem to have read somewhere that while GLS works in principle, it's not really good in these cases. As far as bootstrapping is concerned, I was thinking of sampling and then verifying via cross-validation. – ssj3892414 Dec 22 '14 at 21:53
  • Correlation between what and what? Perhaps between $r_{i}$ and $r_{i-1}$? – Richard Hardy Dec 23 '14 at 07:12
  • @Richard $r_{i}$ is my dependent variable, which I want to model. There are other variables $v_{i}$ which will be the independent variables in a regression. By correlation I mean the correlation of $r$ and $v$. – ssj3892414 Dec 25 '14 at 04:52
  • I think correlation is just what it is; the correlation formula fully defines it. I don't think correlation can be efficient or inefficient, for example. Whether it is a relevant measure for a given problem is a different question. – Richard Hardy Dec 25 '14 at 12:44
  • @Richard Efficiency is probably a weird concept for correlation. But I get the feeling that the correlation is biased, in the sense that on the same data without overlapping windows it will come out different. – ssj3892414 Dec 25 '14 at 15:53
  • I get your point. That suggests you might benefit from reformulating your question in terms of more relevant measures than correlation. Try to ask yourself what exactly is of interest for you, then formulate a corresponding model and specify a question in terms of that model. Sorry if that's too abstract, but that's how I would approach it. – Richard Hardy Dec 25 '14 at 16:37
  • possible duplicate of [Time series regression with overlapping data](http://stats.stackexchange.com/questions/8373/time-series-regression-with-overlapping-data) – RockScience Sep 04 '15 at 04:31
  • possible duplicate of [What is this method for seasonal adjustment calculation?](http://stats.stackexchange.com/questions/17662/what-is-this-method-for-seasonal-adjustment-calculation) – StasK Sep 05 '15 at 01:57

0 Answers