
I have 10 response variables and used 10 weighted elastic-net models to find which of the 31 predictors in my system best explain my responses.

I obtain an R-squared for each model, and most of the models have a high R-squared.

Next, I applied a threshold to my weights and kept only the predictors that are most likely to explain my responses.

I ran the analysis again and now I get R-squared values above one. I don't understand why it gives me an R-squared larger than one.

Here is the formula I use to calculate the R-squared:

$$R^2 = \frac{\operatorname{var}(X\hat{\beta})}{\operatorname{var}(y)}$$
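
For reference, here is a toy sketch in R (hypothetical data, not the actual setup) showing that this variance ratio is not the conventional $R^2 = 1 - SS_{res}/SS_{tot}$ and is not bounded by 1 once the intercept is dropped, because the decomposition $\operatorname{var}(y) = \operatorname{var}(\hat{y}) + \operatorname{var}(e)$ only holds for OLS with an intercept:

# Toy sketch (hypothetical data): with no intercept, the fitted values
# can vary more than the response itself, so var(x*beta)/var(y) > 1.
set.seed(1)
n <- 50
x <- matrix(runif(n, 1, 2), ncol = 1)          # predictor bounded away from zero
y <- 3 + rnorm(n, sd = 0.1)                    # nearly constant response

fit  <- lm(y ~ x - 1)                          # regression through the origin
beta <- coef(fit)
var(drop(x %*% beta)) / var(y)                 # variance ratio: far above 1
1 - sum(resid(fit)^2) / sum((y - mean(y))^2)   # conventional R^2: negative here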

I run the following for each of the 10 responses:

with all predictors:

res <- cv.glmnet(x=regressors, y=responses[i,], lambda=c(0.01,0.05,0.1,0.5,1,1.5,2,10,20,100), nfolds=10, family="gaussian", standardize=TRUE, type.measure="mse", intercept=FALSE, penalty.factor=w, grouped=FALSE)

and here are my r-squared values:

[1] 0.2143036 0.8983216 0.1033970 0.4073570 0.7410773 0.9009351 0.3518317 0.8386557 0.1640106 0.4902337 0.9408415 0.7705011 0.8918895 0.0604311
[15] 0.8324915 0.3142945 0.7603050 0.5791587 0.5458866 0.4644528 0.9424381 0.2226040 0.9106043 0.5826858 0.9370337 0.2573282 0.3955305 0.5008677
[29] 0.8530356 0.9427917 0.3889714
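
The extraction step is not shown above; for reference, the full loop could look like the following sketch (taking the coefficients at lambda.min is an assumption, since the question does not show how the R-squared values were pulled from the fit):

# Sketch of the R-squared loop; assumes the objects regressors,
# responses and w defined above, and coefficients taken at lambda.min.
library(glmnet)

r2 <- numeric(nrow(responses))
for (i in seq_len(nrow(responses))) {
  res <- cv.glmnet(x=regressors, y=responses[i,],
                   lambda=c(0.01,0.05,0.1,0.5,1,1.5,2,10,20,100),
                   nfolds=10, family="gaussian", standardize=TRUE,
                   type.measure="mse", intercept=FALSE,
                   penalty.factor=w, grouped=FALSE)
  # coef() returns a sparse (p+1) x 1 matrix; drop the (zero) intercept slot
  beta  <- as.matrix(coef(res, s="lambda.min"))[-1, 1]
  r2[i] <- var(drop(regressors %*% beta)) / var(responses[i,])
}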

When I reduce the number of predictors:

res <- cv.glmnet(x=regressors[,names(which(w!=1))], y=responses[i,], lambda=c(0.01,0.05,0.1,0.5,1,1.5,2,10,20,100), nfolds=10, family="gaussian", standardize=TRUE, type.measure="mse", intercept=FALSE, penalty.factor=w[names(which(w!=1))], grouped=FALSE)

and the r-squared values for all responses are:

[1] 0.23608323 0.71910789 0.04624468 0.36666693 13.04262441 0.79911136 0.34117305 16.05521440 0.24017898 0.64007613 0.73259379 0.52822347
[13] 0.36245020 1.02954292 0.62319234 1.21837174 0.48313160 0.70221289 7.40865390 2.18222146 0.41393762 1.33439668 0.72242256 0.59092254
[25] 0.62969173 0.54824267 0.46230243 0.61607441 0.44151865 0.74692996 1.21428429
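
As a side note, a bounded sanity check can be computed from the cross-validated MSE that cv.glmnet already reports (sketch; cvm, lambda and lambda.min are standard fields of the returned object, and this is a different convention from the variance ratio above):

# Out-of-sample R^2 from the cross-validated MSE at lambda.min, using
# the fit res from one iteration of the loop above. Since the MSE term
# is non-negative, this can go negative on poor fits but never exceeds 1.
mse_cv <- res$cvm[res$lambda == res$lambda.min]
r2_cv  <- 1 - mse_cv / var(responses[i,])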
  • How exactly did you calculate the $R^2$? – Richard Hardy Oct 25 '15 at 13:10
  • As usually defined, $R^2$ cannot exceed 1, so in those terms it does appear that you have a problem. But nothing you mention can be checked by us: there is no dataset here, no indication of the exact calculation procedures used, no precise code, no display of results from software. We need at least some of those to help. – Nick Cox Oct 25 '15 at 13:10
  • However, increasing the number of predictors (i.e. adding extra predictors) cannot decrease $R^2$, so your last statement is at best confused. – Nick Cox Oct 25 '15 at 13:58
  • I wanted to say that R-squared increases with the number of variables. I made a mistake in my statement – sbmm Oct 25 '15 at 14:04
  • Nick Cox is right about the limited insight any of us can provide. But note that R-squares are defined correctly only for OLS models. With the various flavors of maximum likelihood, pseudo-R-squares are employed. So, while I'm leaning towards thinking that it's likely that some regression assumption is being violated, without knowing the precise metric that's the basis for your question, it's impossible to be sure. For instance, you don't say how many data points you have. – Mike Hunter Oct 25 '15 at 14:12
  • One possible source of error is if your sample sizes are so small that you are way, way overfitting your models with 10 variables. Also, why would 10 iterations be adequate for 100 variables? Why not 1,000? Or more? – Mike Hunter Oct 25 '15 at 14:12
  • @DJohnson I have a response of size 177 and regressors of size 15000*177 in the first model. Then I select only the few regressors which, based on their weight, are more likely to be related to the corresponding response. – sbmm Oct 25 '15 at 14:16
  • @DJohnson What do you mean by 1,000 iterations? I think there is a misunderstanding here! I said to imagine that I have 10 responses and that I run the model separately for each of my responses. – sbmm Oct 25 '15 at 14:18
  • @sbmm To clarify, your data matrix is 177 by 15,000? In other words, you have 177 observations and 15,000 possible predictors? Is this correct? – Mike Hunter Oct 25 '15 at 14:43
  • yes, this is correct! – sbmm Oct 25 '15 at 14:51
  • Well, if you can break the unity barrier in probability http://stats.stackexchange.com/a/160979/78964 , I suppose you could as well with R-squared. :) – Mark L. Stone Oct 25 '15 at 16:04
  • @NickCox Would just like to mention that increasing the number of predictors cannot decrease $R^2$, but can decrease adjusted $R^2$. – Chris C Oct 25 '15 at 17:34
  • @sbmm Given such a data matrix, it seems highly likely that you're exhausting the degrees of freedom and this probably accounts for why your pseudo-Rsquared metric is greater than one. How are these 15,000 predictors scaled? Continuous? Discrete? – Mike Hunter Oct 25 '15 at 18:53
  • @Chris C Indeed; but nothing was said about adjusted $R^2$. – Nick Cox Oct 25 '15 at 19:17
  • @DJohnson They are continuous and standardized. – sbmm Oct 25 '15 at 22:01
  • @sbmm You would do well to review David Dunson's papers since he's quite interested in approaches to modeling very high dimensional data with small n. *Bayesian Tensor Regression* is one of them. Check out his Duke website: https://stat.duke.edu/~dunson/ – Mike Hunter Oct 25 '15 at 22:05
  • @DJohnson Thanks, I will check it. – sbmm Oct 25 '15 at 22:06

0 Answers