
In the house price prediction example of Practical Statistics for Data Scientists (by Bruce, Bruce, and Gedeck; p. 168), the median residual for each zip code is taken and binned. The authors note that "using the residuals to help guide the regression fitting is a fundamental step in the modeling process".

Is it right that you can take the residuals from an initial model, and use them (or transform/combine them with other variables) to improve the model further? I was trying to look for further examples online and on this site but I can't seem to find the right search results for this.
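
For concreteness, here is a minimal sketch of the kind of zip-code grouping the book describes, in Python with NumPy. Everything in it is an assumption for illustration: the data are synthetic, `fit_ols` and `rmse` are hypothetical helpers, and the quintile binning of the median residuals is just one plausible way to form five groups, so the book's own procedure may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): 40 zip codes, each with its own price offset.
n_zips, n_sales = 40, 2000
zip_effect = rng.normal(0, 50_000, size=n_zips)
zip_id = rng.integers(0, n_zips, size=n_sales)
sqft = rng.uniform(500, 4000, size=n_sales)
price = (100_000 + 200 * sqft + zip_effect[zip_id]
         + rng.normal(0, 20_000, size=n_sales))

def fit_ols(X, y):
    """Least-squares fit; returns the coefficients and the residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def rmse(resid):
    """Root mean squared error of a residual vector."""
    return float(np.sqrt(np.mean(resid ** 2)))

# Step 1: initial model that ignores location.
X_base = np.column_stack([np.ones(n_sales), sqft])
_, resid_base = fit_ols(X_base, price)

# Step 2: median residual for each zip code that appears in the data.
zips_present = np.unique(zip_id)
median_resid = np.array(
    [np.median(resid_base[zip_id == z]) for z in zips_present]
)

# Step 3: bin the zip codes into five groups by quintile of median residual.
edges = np.quantile(median_resid, [0.2, 0.4, 0.6, 0.8])
group_of_zip = dict(zip(zips_present, np.digitize(median_resid, edges)))
zip_group = np.array([group_of_zip[z] for z in zip_id])

# Step 4: refit with the group as a categorical feature (group 0 = baseline).
dummies = (zip_group[:, None] == np.arange(1, 5)).astype(float)
X_grouped = np.column_stack([X_base, dummies])
_, resid_grouped = fit_ols(X_grouped, price)

print("RMSE without zip groups:", rmse(resid_base))
print("RMSE with zip groups:   ", rmse(resid_grouped))
```

Note that the groups are built from the same sales that the refitted model is scored on, so the in-sample improvement is optimistic; checking it on held-out data would be a fairer test.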

Chiara
  • Be careful. In their example, there are 82 zip codes and several zip codes have only one sale. They just use residuals to form five zip code groups. Residuals are always used to improve a model by regression diagnostics, but that's another matter. – Sergio Sep 14 '20 at 16:39
  • @Sergio So are residuals only used in diagnostics/model assessment (e.g. residual plots and heteroskedasticity), or is it ever advisable to include them as a feature for the model? – Chiara Sep 15 '20 at 12:24
  • *All* of multiple regression can be viewed as systematically including residuals, one variable at a time, as the regressors in a succession of improved models. See https://stats.stackexchange.com/a/46508/919. That gives you about $10^8$ examples! – whuber Sep 16 '20 at 17:00
  • Boosting a tree-based model seems to take this approach: the residuals are used to build the next tree – igorkf Sep 16 '20 at 17:09
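
The last two comments can be illustrated with a short sketch in Python with NumPy. It is not taken from the linked answer or from any library: the data are synthetic and `residuals` and `fit_stump` are hypothetical helpers. The first part checks that regressing the residuals of `y` on the residuals of `x2` (both taken after removing `x1`) recovers the `x2` coefficient of the full multiple regression. The second part is a bare-bones boosting loop in which every new one-split "tree" is fit to the residuals of the current prediction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

def residuals(X, target):
    """Residuals of a least-squares regression of target on X."""
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

# Part 1: multiple regression built from residuals.
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)            # correlated with x1
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(size=n)

ones = np.ones(n)
X1 = np.column_stack([ones, x1])
beta_full, *_ = np.linalg.lstsq(np.column_stack([ones, x1, x2]), y, rcond=None)

y_res = residuals(X1, y)      # part of y not explained by x1
x2_res = residuals(X1, x2)    # part of x2 not explained by x1
beta_x2_from_residuals = (x2_res @ y_res) / (x2_res @ x2_res)

print("x2 coefficient, full model:   ", beta_full[2])
print("x2 coefficient, via residuals:", beta_x2_from_residuals)

# Part 2: boosting, where each stump is fit to the current residuals.
def fit_stump(x, r):
    """Best single split of x for predicting r by two constants."""
    best = None
    for t in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left = x <= t
        left_mean, right_mean = r[left].mean(), r[~left].mean()
        sse = np.sum((r - np.where(left, left_mean, right_mean)) ** 2)
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]

xb = rng.uniform(-2, 2, size=n)
yb = np.sin(3 * xb) + rng.normal(0, 0.2, size=n)

pred = np.full(n, yb.mean())                  # start from the mean
mse_before = float(np.mean((yb - pred) ** 2))
learning_rate = 0.3
for _ in range(50):
    r = yb - pred                             # residuals of current model
    t, left_mean, right_mean = fit_stump(xb, r)
    pred += learning_rate * np.where(xb <= t, left_mean, right_mean)
mse_after = float(np.mean((yb - pred) ** 2))

print("training MSE before boosting:", mse_before)
print("training MSE after boosting: ", mse_after)
```

In both parts the residuals serve as the response for the next fitting step rather than as an extra feature alongside the original response, which is the distinction the question is asking about.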

1 Answer


Using the residuals as a dependent variable is one approach considered under propensity score matching. Binning the residuals creates a categorical variable, with as many categories as you choose, that can be used in the analysis. The underlying assumption is that the residuals reflect structure in the data that is not captured by any measured independent variable.

This paper explains the general principles of propensity score matching using R:

Randolph, J. J., Falbe, K., Manuel, A. K., & Balloun, J. L. (2014). A Step-by-Step Guide to Propensity Score Matching in R. Practical Assessment, Research & Evaluation, 19(18).

This paper demonstrates the use of propensity scores in an actual analysis:

Huang, I. C., Frangakis, C., Dominici, F., Diette, G. B., & Wu, A. W. (2005). Application of a propensity score approach for risk adjustment in profiling multiple physician groups on asthma care. Health Serv Res, 40(1), 253-278.

LDBerriz