My goal is to determine whether, in a linear regression, some predictors uniquely improve the fit beyond what is already explained by all the other predictors combined. I originally tried multi-way ANOVA and partial correlation for this purpose, but I have since learned that multi-way ANOVA performs poorly under high multicollinearity: the reported significances and explained variances of individual predictors are not robust, and may therefore misrepresent the true relations in the data.
Here is a solution that came to mind (sketched in code below the steps):
- Fit the full model to the data and compute the coefficient of determination $r^2_{\text{full}}$.
- Exclude one predictor (say $X$) from the model, refit using the remaining predictors, and compute $r^2_{/X}$.
- Then the gain in explained variance uniquely due to the excluded predictor $X$ is
$$G(X) = r^2_{\text{full}} - r^2_{/X}$$
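For concreteness, here is a minimal sketch of the two-fit procedure in plain Python/numpy (the helper names `r_squared` and `gain` are mine, not standard):

```python
import numpy as np

def r_squared(X, y):
    """Coefficient of determination of an OLS fit with intercept."""
    Xd = np.column_stack([np.ones(len(y)), X])     # design matrix with intercept
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)  # ordinary least squares
    resid = y - Xd @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def gain(X, y, j):
    """G for column j: r2 of the full model minus r2 without column j."""
    return r_squared(X, y) - r_squared(np.delete(X, j, axis=1), y)
```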
Naively, this looks like a robust way to detect partial effects. My simulations show that it outperforms partial correlation on some simple noisy model data; partial correlation is known to fail to discriminate between a true partial effect and mere multicollinearity in the presence of noise.
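To give an idea of what I mean by "simple noisy model data", my toy setups look roughly like this (the correlation `rho`, the coefficient on `x1`, and the noise scale are arbitrary choices of mine):

```python
rng = np.random.default_rng(0)
n, rho = 200, 0.9
x1 = rng.normal(size=n)
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)  # highly collinear with x1
y = 1.0 * x1 + rng.normal(size=n)                         # only x1 truly matters
X = np.column_stack([x1, x2])
print(gain(X, y, 0), gain(X, y, 1))  # G for x1 typically dwarfs G for x2
```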
Questions:
- Does this approach have a name?
- Does it work in practice?
- Is there a nice procedure to test $G(X)$ for significance (against the null hypothesis that $X$ is random and can only explain variance by chance)? Permutation testing seems to work for me (a sketch of what I do is below); I'm just wondering whether there is something analogous to an F-test.
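For reference, the permutation test I have in mind shuffles the candidate column, which breaks its relation both to $y$ and to the other predictors; this is one possible formalization of the null above, not the only one:

```python
def permutation_pvalue(X, y, j, n_perm=2000, rng=None):
    """Approximate P(G >= observed G) when column j is shuffled at random."""
    rng = rng or np.random.default_rng()
    g_obs = gain(X, y, j)
    Xp = X.copy()
    exceed = 0
    for _ in range(n_perm):
        Xp[:, j] = rng.permutation(X[:, j])  # break column j's link to y and to the rest
        exceed += gain(Xp, y, j) >= g_obs
    return (exceed + 1) / (n_perm + 1)       # add-one correction for a valid p-value
```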
Note: I am only interested in applying the method with a small total number of predictors, such as 2 or 3. I am aware of the kitchen-sink regression effect, so to be clear: I do not intend to stretch this design to the extreme.