
I have a time series linear regression with a really high R-Squared. I'm a little bit apprehensive to start using it though because of the residual plot. I know that the residuals are supposed to be random and have a mean of 0.

Here is a plot of my residuals:

[Residual plot: residuals plotted against time]

The mean is approximately 0, but as you can see, the residuals do not look like the nice random cloud that always appears in linear regression examples. The residuals are particularly low around 20650 and particularly high a little above 20700, with something of a parabolic shape between 20500 and 20700.

I'm wondering whether I should hold off on using this model until the residuals look like a perfect cloud of random points, or whether that is overly idealistic. Are there any statistical procedures to test whether the residual assumptions are met? (A couple of possibilities are sketched below.)

Jarom
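
As an illustration of the kind of procedure the question asks about, here is a minimal sketch of two standard tests for autocorrelation in residuals, using statsmodels. The residual series here is simulated white noise as a stand-in; in practice you would pass the residuals from your fitted model.

```python
# Minimal sketch: formal checks for residual randomness, assuming the
# model's residuals are available as a 1-D array `resid`.
import numpy as np
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(0)
resid = rng.normal(size=200)  # stand-in for the actual model residuals

# Durbin-Watson: values near 2 suggest no first-order autocorrelation;
# values well below 2 suggest positive autocorrelation.
print("Durbin-Watson:", durbin_watson(resid))

# Ljung-Box: small p-values reject the null that the first `lags`
# autocorrelations are jointly zero.
print(acorr_ljungbox(resid, lags=[10]))
```

A systematic pattern like the one described in the question would typically show up as a Durbin-Watson statistic well below 2 and a small Ljung-Box p-value.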
  • What quantity are you plotting on the horizontal axis? – whuber Aug 09 '17 at 19:14
  • Thank you, but that's irrelevant: the question is *what is it*. Are you plotting one of the regressor variables? The fitted values? The response? Another variable not included in the regression? – whuber Aug 09 '17 at 19:52
  • @whuber sorry, I misunderstood your question. I realize that I left a key piece out of my question which I will edit in. The horizontal axis is time. This is a time series model using linear regression. – Jarom Aug 09 '17 at 19:55
  • What are you regressing against? Also time? What are the ACF and PACF of the residuals like? What are the ACF and PACF of the raw values? I would wonder, for example, whether this response might be stationary when differenced, which might make one wonder if the correlation you're getting means much of anything. (A sketch of this ACF/PACF check follows the comments.) – Glen_b Aug 10 '17 at 02:42
  • You may find some of this discussion of use: https://stats.stackexchange.com/questions/133155/how-to-use-pearson-correlation-correctly-with-time-series/ – Glen_b Aug 10 '17 at 12:05
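
The ACF/PACF inspection suggested in the comments could look like the following sketch, again with simulated residuals standing in for the real series:

```python
# Minimal sketch: ACF and PACF plots of the residuals, assuming `resid`
# holds the model residuals (simulated here for illustration).
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(1)
resid = rng.normal(size=200)  # placeholder for the actual residual series

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(resid, ax=axes[0], lags=30)   # spikes outside the band flag autocorrelation
plot_pacf(resid, ax=axes[1], lags=30)
plt.tight_layout()
plt.show()
```

If the residuals were truly random, nearly all of the spikes would fall inside the confidence band; the slow parabolic drift described in the question would instead produce large, slowly decaying ACF spikes.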

0 Answers