
In a relatively simple setting, I am fitting a simple linear regression model and want to assess how good this model is for prediction. Basically, I am trying something like K-fold cross-validation or jackknifing (that is, fit the model on part of the data and predict the rest to see what happens). However, I seek a single measure that captures the overall "predictive power". I guess there are some theoretical solutions to this, so any reference/link would be appreciated.
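For concreteness, here is a minimal sketch of the kind of K-fold procedure I have in mind (assuming scikit-learn; the data and all names here are illustrative toy examples, not my actual data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Toy data for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = 2.0 * x[:, 0] + rng.normal(scale=0.5, size=100)

# Out-of-fold predictions: each case is predicted by a model
# fit on the other folds (jackknifing would use n_splits = n).
y_pred = np.empty_like(y)
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(x):
    model = LinearRegression().fit(x[train_idx], y[train_idx])
    y_pred[test_idx] = model.predict(x[test_idx])

# One common single-number summary: cross-validated RMSE.
rmse = np.sqrt(np.mean((y - y_pred) ** 2))
print(f"5-fold CV RMSE: {rmse:.3f}")
```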

In the meantime, what I tried is cross-validation: for every case in my sample I now have the observed value, the predicted value (based on some subset of the data), and the residual. In the full linear model we know that $\sum(y^2) = \sum(\hat{y}^2)+\sum(e^2)$, but when the predicted values come from a model fit on a subset of the data this identity no longer holds (the cross term $2\sum\hat{y}e$ is no longer zero), so I define:

$$M = \sum(\hat{y}^2)+\sum(e^2)$$

based on the cross-validated predictions,

and use

$$F = \frac{M-\sum(y^2)}{\sum(y^2)}$$

as a measure of how much the prediction deviates from the full linear model, where 0 indicates the best possible agreement.
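For concreteness, here is how I compute $M$ and $F$ from the out-of-fold predictions `y_pred` of the snippet above (a minimal sketch; the function name is mine):

```python
import numpy as np

def proposed_F(y, y_pred):
    """F = (M - sum(y^2)) / sum(y^2), with M = sum(y_pred^2) + sum(e^2)."""
    e = y - y_pred                                  # cross-validated residuals
    M = np.sum(y_pred ** 2) + np.sum(e ** 2)
    # Algebraically, M - sum(y^2) = -2 * sum(y_pred * e), so F measures
    # (relative to sum(y^2)) the cross term that vanishes in the full OLS fit.
    return (M - np.sum(y ** 2)) / np.sum(y ** 2)
```

Note that for the full-sample OLS fit $\sum\hat{y}e = 0$, so $F = 0$ there by construction.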

I'd appreciate any feedback on this method and/or references to similar analyses.

  • amit. CrossValidated uses LaTeX for better formatting. I've edited your post. Make sure that it reads the way you'd like it to. – Eric Peterson Jun 04 '13 at 15:18
  • Perhaps your question is answered at http://stats.stackexchange.com/questions/9131? If not, what else would you be looking for? – whuber Jun 04 '13 at 15:22

0 Answers