How to interpret R-squared in multiple regression with several sets of dummies and continuous variables?

Question

I have a problem with a multiple regression I performed:

model without constant term;
one dependent continuous variable;
first set of dummies: derived from 2 continuous variables. I used the median of each as a threshold to derive two binary variables, and from these two binaries I derived 4 dummies, one for each combination (10, 01, 00, 11);
second set of dummies: 3 dummies derived from one categorical variable;
two continuous variables.

This model has an R-squared of 98% (and a similar adjusted R-squared). I think this is too high, but I don't know how to interpret it correctly or how to assess whether it is valid. I know that R-squared tends to increase with the number of explanatory variables, but I don't know whether the number of dummies affects its value and its validity as an indicator of a good regression. Moreover, this model presents high VIF values, indicating collinearity: are these measures still valid or not?

I should add that I have also tested the model with a constant term (and $k-1$ and $n-1$ dummies). It has a very low R-squared (around 10%) but no collinearity problems. I would use this model if only I could separate the effects of the two reference dummies from the constant term, but I don't know how to do that.
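
To make the comparison between the two specifications concrete, here is a minimal sketch on simulated data (not the asker's data; every variable name and coefficient below is hypothetical). It assumes a 4-level and a 3-level dummy set plus two continuous regressors. It shows that the full-dummy design without a constant is rank deficient, because each complete dummy set sums to a column of ones, and that it spans the same column space as the design with a constant and reference categories dropped, so both give the same fitted values.

```python
import numpy as np

# Simulated data; all names and coefficients are hypothetical.
rng = np.random.default_rng(1)
n = 400
a = rng.integers(0, 4, size=n)   # first categorical factor, 4 levels
b = rng.integers(0, 3, size=n)   # second categorical factor, 3 levels
A = np.eye(4)[a]                 # full set of 4 dummies
B = np.eye(3)[b]                 # full set of 3 dummies
x = rng.normal(size=(n, 2))      # two continuous regressors

y = (5.0
     + A @ np.array([0.0, 0.5, -0.3, 0.8])
     + B @ np.array([0.0, 0.4, -0.2])
     + x @ np.array([0.3, -0.1])
     + rng.normal(size=n))

# Design 1: no constant, all dummies from both sets (9 columns).
X_full = np.column_stack([A, B, x])
# Design 2: constant, first level of each set dropped as reference (8 columns).
X_ref = np.column_stack([np.ones(n), A[:, 1:], B[:, 1:], x])

rank_full = np.linalg.matrix_rank(X_full)
rank_ref = np.linalg.matrix_rank(X_ref)

# lstsq returns a minimum-norm solution for the rank-deficient design.
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
beta_ref, *_ = np.linalg.lstsq(X_ref, y, rcond=None)
fitted_full = X_full @ beta_full
fitted_ref = X_ref @ beta_ref

print(X_full.shape[1], rank_full)   # number of columns vs. rank
print(X_ref.shape[1], rank_ref)
print(np.allclose(fitted_full, fitted_ref))
```

In this sketch the constant in the second design estimates the expected response for an observation in the reference level of both factors with the continuous regressors at zero, so it is a single quantity rather than a sum of two separately identifiable reference effects.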

It is usually a bad idea to have regression without a constant term unless there is strong reason to do so. — Peter Flom, Aug 13 '12 at 11:20
2 notes: first, $R^2$ doesn't mean the same thing / can't be interpreted the same way as normal when the model is fit w/o a constant term (see: [removal-of-statistically-significant-intercept-term-boosts-r2-in-linear-model](http://stats.stackexchange.com/questions/26176/26205#26205)); second, it is poor statistical practice to dichotomize a continuous variable (e.g., a median split) & use dummies in its place (see here after 'update': [how-to-choose-between-anova-and-ancova-in-a-designed-experiment](http://stats.stackexchange.com/questions/24077/24080#24080)). — gung - Reinstate Monica, Aug 13 '12 at 13:30
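
The first note can be illustrated with a small simulation. This is a sketch under stated assumptions, not the asker's model: the data are simulated, the names are hypothetical, and it assumes the software reports the uncentered $R^2 = 1 - \mathrm{SS}_{res} / \sum y_i^2$ when there is no constant, which many packages do. When the mean of $y$ is far from zero, the uncentered version can be close to 1 even though the regressors explain little of the variation around the mean.

```python
import numpy as np

# Simulated data; all names and coefficients are hypothetical.
rng = np.random.default_rng(0)
n = 500
group = rng.integers(0, 4, size=n)
D = np.eye(4)[group]             # full set of 4 dummies, no constant
x = rng.normal(size=n)           # one continuous regressor

# Response with a large mean (about 10) and weak effects.
y = 10.0 + 0.3 * (group == 1) + 0.2 * x + rng.normal(size=n)

X = np.column_stack([D, x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
ss_res = resid @ resid

# Uncentered R^2: total sum of squares measured around zero.
r2_uncentered = 1.0 - ss_res / (y @ y)

# Centered R^2: total sum of squares measured around the mean of y.
y_dev = y - y.mean()
r2_centered = 1.0 - ss_res / (y_dev @ y_dev)

print(round(r2_uncentered, 3), round(r2_centered, 3))
```

The same residuals produce both numbers; only the baseline differs. That would be consistent with a value near 98% without a constant and near 10% with one, although whether it explains the asker's results depends on how their software computes $R^2$.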
