I'm reading Linear Models with R, 2nd Ed., by Julian Faraway, and he says something puzzling in Chapter 4 on page 51:
There are two kinds of predictions made from regression models. One is a predicted mean response and the other is a prediction of a future observation. To make the distinction clear, suppose we have built a regression model that predicts the rental price of houses in a given area based on predictors such as the number of bedrooms and closeness to a major highway. There are two kinds of predictions that can be made for a given $x_0$:
- Suppose a specific house comes on the market with characteristics $x_0.$ Its rental price will be $x_0^T\beta+\varepsilon.$ Since $E\varepsilon=0,$ the predicted price is $x_0^T\hat\beta,$ but in assessing the variance of this prediction, we must include the variance of $\varepsilon.$
- Suppose we ask the question - "What would a house with characteristics $x_0$ rent for on average?" This selling price is $x_0^T\beta$ and is again predicted by $x_0^T\hat\beta$ but now only the variance in $\hat\beta$ needs to be taken into account.
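For context (these are the standard OLS variance formulas, not quoted from page 51): writing $y_0 = x_0^T\beta + \varepsilon$ for the future observation, and using the fact that the new $\varepsilon$ is independent of $\hat\beta,$ the two variances in the bullets above are

$$\operatorname{Var}\!\left(x_0^T\hat\beta\right) = \sigma^2\, x_0^T (X^T X)^{-1} x_0, \qquad \operatorname{Var}\!\left(y_0 - x_0^T\hat\beta\right) = \sigma^2\left(1 + x_0^T (X^T X)^{-1} x_0\right).$$

The second expression is the first plus an extra $\sigma^2$ contributed by the new observation's own error term.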
Note: I think it is only fair to ignore the distinction between selling and renting; that slip is clearly not the author's main point here.
Another note: As this is not a homework problem, I am going to leave off the self-study tag.
Final note: this question says nothing about "forecasting", so the questions on Stats.SE concerning forecasting are not relevant.
My Question: Why can we ignore the variance of $\varepsilon$ in the prediction-of-a-mean problem (the second bullet), but not in the prediction-of-a-future-observation problem (the first bullet)? I'm also not entirely sure I understand the difference between these two kinds of prediction; if you could clarify that difference as well, I would be most grateful.
[EDIT] Apparently, predicting a future value corresponds to a prediction interval, while predicting a mean value corresponds to a confidence interval.
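To make the distinction concrete, here is a small simulation of my own (not from Faraway; the coefficients, design points, and $x_0$ are arbitrary choices). It repeatedly refits a simple linear regression and compares the spread of the estimated mean response $x_0^T\hat\beta$ with the spread of the error made when predicting an actual new observation $y_0 = x_0^T\beta + \varepsilon$:

```python
import random
import statistics

random.seed(1)
b0, b1, sigma = 2.0, 3.0, 1.0      # true coefficients and error sd (assumed values)
xs = [i / 10 for i in range(50)]   # fixed design points
x0 = 2.5                           # point at which we predict

mean_preds, new_obs_errors = [], []
for _ in range(2000):
    # simulate a fresh training sample and fit OLS by hand
    ys = [b0 + b1 * x + random.gauss(0, sigma) for x in xs]
    xbar, ybar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1_hat = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    b0_hat = ybar - b1_hat * xbar

    pred = b0_hat + b1_hat * x0    # estimated mean response at x0
    mean_preds.append(pred)

    # an actual future observation at x0 carries its own error term
    y0 = b0 + b1 * x0 + random.gauss(0, sigma)
    new_obs_errors.append(y0 - pred)

sd_mean = statistics.stdev(mean_preds)      # variability of x0'beta_hat alone
sd_new = statistics.stdev(new_obs_errors)   # variability of beta_hat AND epsilon
print(f"sd of estimated mean response:   {sd_mean:.3f}")
print(f"sd of future-observation error:  {sd_new:.3f}")
```

The first number reflects only the uncertainty in $\hat\beta$ (a confidence interval's width), while the second is roughly $\sqrt{\text{sd}_{\text{mean}}^2 + \sigma^2}$ (a prediction interval's width), because the new observation's $\varepsilon$ never averages away no matter how much training data we have.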