Model uncertainty (model averaging) and R-Squared ($R^2$)

Question

Is it possible to calculate r-squared for an "average model"?

Let's say I have 4 different response variables, each of which I want to model with a set (or subset) of 4 independent variables. I'd then like to compare the variance explained (R-squared) by the best model for each response variable. Unfortunately, for each of the 4 response variables there is no clear best model, so I need to model average to account for this model selection uncertainty. Is there still a way to compare the variance explained by the averaged model for each of the response variables? Or do I simply compare the R-squared values of the global model for each response variable?

My question is similar to this old unanswered question: Generalized $R^2$ for average model

Thanks for any help and please let me know if I can improve this question.

score 2 · Answer 1 · answered Dec 12 '14 at 23:40


The following book addresses model selection and averaging, including frequentist and Bayesian approaches, the use of AIC and BIC (an alternative to $R^2$ in your case?) and more: http://www.cambridge.org/us/academic/subjects/statistics-probability/statistical-theory-and-methods/model-selection-and-model-averaging.

I may be wrong, but I assume that your question is related, possibly indirectly, to ensemble methods (http://en.wikipedia.org/wiki/Ensemble_learning) and associated topics. I was curious about your question and did some brief research. I'm not sure whether a Bayesian approach can be applied to your models, but I hope the following resources will be helpful.

Combining models, Bayesian:

A tutorial on Bayesian model averaging (BMA): http://www.stat.washington.edu/www/research/online/hoeting1999.pdf
Ensemble BMA and forecasting: http://andrewgelman.com/wp-content/uploads/2014/03/EBMA_conditions6.pdf
Bayesian model combination instead of BMA: http://axon.cs.byu.edu/papers/Kristine.ijcnn2011.pdf

Combining/comparing models, non-Bayesian:

The 2008 BellKor Solution to the Netflix Prize: http://www2.research.att.com/~volinsky/netflix/Bellkor2008.pdf
Paper on average predictive comparisons (APC) of models with non-linearity or interactions: http://www.stat.columbia.edu/~gelman/research/published/ape17.pdf
Blog post on APC method R implementation: http://andrewgelman.com/2014/06/17/average-predictive-comparisons-r-david-chudzicki-writes-package
Corresponding R package: http://www.davidchudzicki.com/predcomps


Aleksandr Blekh



Thanks for your reply! I am familiar with how to model-average parameter estimates using AIC methods (sensu Burnham and Anderson 2002, "Model Selection and Multimodel Inference"). It essentially computes a weighted average of each coefficient across a set of models, weighted by each model's support. However, I'm not certain whether it is sensible to model-average the r-squared value of each model in the same way. – Dave M Dec 13 '14 at 19:15
@DaveM: You're welcome! If I understood correctly, in the Netflix Prize paper I referenced, the authors averaged RMSE values across various models when combining them. That is why I thought that a similar approach might be applicable to R-squared as well. I understand the difference between RMSE (measures **absolute** *GoF*) and R-squared (measures **relative** *GoF*), but they are nevertheless related (R^2 = 1 - SSE / SST = 1 - DF * RMSE^2 / SST, where DF is the residual degrees of freedom and RMSE^2 = SSE / DF), hence my suggestion. (to be continued) – Aleksandr Blekh Dec 14 '14 at 02:43
@DaveM: (cont'd) However, in that case the authors combine models, whereas you want to produce a single averaged measure for a set of models. I'm not sure if it even makes sense, because it is unclear what your desired GoF index would measure and, thus, it seems **too artificial**. But that is just my opinion - I'm sure that some experts will chime in sooner or later and clarify this. – Aleksandr Blekh Dec 14 '14 at 02:44
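
For readers unfamiliar with the AIC-based averaging mentioned in the first comment, here is a minimal sketch of Akaike weights and a model-averaged coefficient. The AIC values and slope estimates are hypothetical, and treating a predictor's coefficient as 0 in models that omit it is only one of several conventions, assumed here for illustration.

```python
import math


def akaike_weights(aic_values):
    """Akaike weights: w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2),
    where delta_i is the AIC difference from the best (lowest-AIC) model."""
    best = min(aic_values)
    rel_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel_likelihoods)
    return [r / total for r in rel_likelihoods]


def average_coefficient(estimates, weights):
    """Weighted average of one coefficient across the model set.

    Assumption: a model that omits the predictor contributes an estimate of 0.
    """
    return sum(w * b for w, b in zip(weights, estimates))


# Hypothetical AIC values and slope estimates for three candidate models.
aic = [100.0, 101.2, 104.5]
slope = [0.80, 0.65, 0.0]  # the third model does not include this predictor

weights = akaike_weights(aic)
avg_slope = average_coefficient(slope, weights)
print([round(w, 3) for w in weights], round(avg_slope, 3))
```

The weights sum to 1 and decrease as AIC increases, so the averaged slope is pulled toward the estimates from the better-supported models.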

I believe that in my case, where I want to compare goodness-of-fit for 4 different averaged models (i.e., averaged models for 4 different response variables), RMSE would be inappropriate, given its scale-dependence. I think where I am stuck now is: do I calculate a weighted average R2 for each model set, or do I calculate a single R2 based on the predicted vs. observed values of the averaged model? Perhaps time for a new, separate question. – Dave M Dec 15 '14 at 16:21
@DaveM: I see. If you'd like to use a criterion that is on the same scale as the original data, consider using the **standard error of regression**. See [another answer of mine](http://stats.stackexchange.com/a/129147/31372) for more details. – Aleksandr Blekh Dec 15 '14 at 20:23
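
The two candidates raised in the comments above (a weighted average of per-model R2 values versus a single R2 computed from the averaged model's predictions) can be compared with a small sketch. The data, predictions, and weights below are made up for illustration, and the function names are not from any package.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot for one set of predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def weighted_mean_r2(observed, predictions_by_model, weights):
    """Option 1: average the per-model R^2 values using the model weights."""
    return sum(
        w * r_squared(observed, preds)
        for w, preds in zip(weights, predictions_by_model)
    )


def r2_of_averaged_predictions(observed, predictions_by_model, weights):
    """Option 2: average the predictions first, then compute a single R^2."""
    averaged = [
        sum(w * preds[i] for w, preds in zip(weights, predictions_by_model))
        for i in range(len(observed))
    ]
    return r_squared(observed, averaged)


# Hypothetical observations, predictions from two models, and model weights.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predictions_by_model = [
    [1.2, 1.9, 3.3, 3.8, 5.1],
    [0.7, 2.4, 2.6, 4.3, 4.8],
]
weights = [0.6, 0.4]

r2_weighted = weighted_mean_r2(observed, predictions_by_model, weights)
r2_averaged = r2_of_averaged_predictions(observed, predictions_by_model, weights)
print(r2_weighted, r2_averaged)
```

When the weights are non-negative and sum to 1, and every model is scored against the same observations, the second quantity is never smaller than the first, because squared error is convex in the prediction. So the two options answer different questions and are not interchangeable.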

Why not use propagation of error? – Carl Aug 11 '16 at 03:30
@Carl: It's an interesting idea (+1), though I'm not sure how to use the propagation of error concept in model averaging context. Please feel free to expand your idea as an answer - I'm curious to learn more about that. – Aleksandr Blekh Aug 11 '16 at 03:48

Carl · Answer 2 · 2016-08-19T22:03:10.867

I am not entirely sure how, but I think you could work it through. Basically, propagation of error uses the total derivative of your fit equation, including the covariance terms. When you run your fit routine, with whatever method is appropriate to your problem, just make sure to use one that gives you the standard deviations of all of the parameters. Then you can solve for the variance of any combination of parameters that you want. I did that adaptively in the Appendix section of this paper. That is, I not only solved the error propagation for my parameter of interest, but I optimized for it.
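
As a rough illustration of the idea, first-order propagation of error approximates the variance of a derived quantity $f(\theta)$ by $g^\top \Sigma g$, where $g$ is the gradient of $f$ at the fitted parameters and $\Sigma$ is their covariance matrix. The sketch below assumes a hypothetical derived quantity (a ratio of two fitted parameters) and a made-up covariance matrix; it is not the adaptive procedure from the paper mentioned above.

```python
def numerical_gradient(func, params, rel_step=1e-6):
    """Central-difference gradient of func at params."""
    grad = []
    for i, p in enumerate(params):
        h = rel_step * max(abs(p), 1.0)
        up = list(params)
        down = list(params)
        up[i] = p + h
        down[i] = p - h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad


def propagate_variance(gradient, covariance):
    """First-order variance of f(theta): g^T Sigma g (covariances included)."""
    n = len(gradient)
    return sum(
        gradient[i] * covariance[i][j] * gradient[j]
        for i in range(n)
        for j in range(n)
    )


def ratio(theta):
    """Hypothetical quantity of interest: ratio of two fitted parameters."""
    a, b = theta
    return a / b


# Hypothetical parameter estimates and covariance matrix from a fit routine.
theta_hat = [2.0, 4.0]
cov = [
    [0.04, 0.01],
    [0.01, 0.09],
]

grad = numerical_gradient(ratio, theta_hat)
var_ratio = propagate_variance(grad, cov)
print(var_ratio, var_ratio ** 0.5)
```

The off-diagonal covariance terms matter here: dropping them would change the result, which is why a fit routine that reports only standard deviations is not quite enough when parameters are correlated.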

For your problem, the total error of each dependent variable has a contribution from each of the error propagation terms. The added complication of using a "population model" depends on exactly how you obtain it, and beyond that I am not sure exactly what you are doing. However, I do not think that it changes the rules for error propagation, just what the errors themselves are.

I am going to take a guess as to what an average model is. If it means that we fit models separately and then average the results, one might instead start with a mixture model and fit that directly; one might then get a better fit, and calculation of $R^2$ is routine. For example, $s (p b_1 e^{-b_1 t} +(1-p) b_2 e^{-b_2 t})$ is a mixture model: a weighted sum of two exponential distributions with unit total area, CDF($\infty$) = 1, scaled by $s$. It is one of the most commonly used pharmacokinetic models. If you are doing autocorrelation or some other averaging, please let me know and I will change my suggestion to match it, if I can.
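
To make the "calculation of $R^2$ is routine" step concrete, here is a small sketch that evaluates the two-exponential mixture above at some parameter values and computes $R^2$ against observations. The parameter values, time points, and the synthetic "observations" (model values perturbed by a fixed few percent) are all hypothetical; an actual analysis would first estimate the parameters with a fitting routine, which is not shown.

```python
import math


def biexponential(t, s, p, b1, b2):
    """s * (p*b1*exp(-b1*t) + (1-p)*b2*exp(-b2*t)), the mixture model above."""
    return s * (p * b1 * math.exp(-b1 * t) + (1.0 - p) * b2 * math.exp(-b2 * t))


def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical "fitted" parameters and sampling times.
params = {"s": 100.0, "p": 0.3, "b1": 1.5, "b2": 0.2}
times = [0.5, 1.0, 2.0, 4.0, 8.0, 12.0]

# Synthetic observations: model values with a fixed +/- 2-3% perturbation,
# standing in for measured data.
perturbation = [1.03, 0.97, 1.02, 0.98, 1.03, 0.97]
predicted = [biexponential(t, **params) for t in times]
observed = [f * k for f, k in zip(predicted, perturbation)]

r2 = r_squared(observed, predicted)
print(r2)
```

Because the mixture is fitted as a single model, there is one set of predictions and therefore one unambiguous $R^2$, which is the appeal of this route over averaging separately fitted models.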

Another possibility is to noise reduce the data using adaptive B-splines or locally adaptive Pixon. What should be done depends on what the purpose of obtaining an average is.
