I have a problem in which I predict several different dependent variables from the same data set. I now want to evaluate how well each dependent variable can be predicted by a regression method. The dependent variables have different scales and units, so a scale-dependent measure such as mean squared error cannot be compared across them. What is a common approach to this problem?
-
What types of variable are the dependent variables? $R^2$ jumps out at me as being ideal, but it's jumping out so strongly that I can't help but think that I must be missing something! – Ian_Fin Sep 15 '16 at 11:02
-
@Ian_Fin all are 1D variables of different scales. What else do you need to know? – languitar Sep 15 '16 at 11:05
-
@Ian_Fin I was also thinking about $R^2$ but couldn't find any explanation that clearly states that this is the right tool for the job. – languitar Sep 15 '16 at 11:06
-
Are they continuous or discrete? Effectively, what I'm trying to get at is whether you are using linear regression (where $R^2$ is clearly defined) or some other form of regression (where you may only get "pseudo" $R^2$s). – Ian_Fin Sep 15 '16 at 11:07
-
@Ian_Fin They are (apart from discretization artifacts) continuous variables, all >= 0. – languitar Sep 15 '16 at 11:10
-
$R^2$ sounds ideal then. You could effectively say that your set of IVs accounts for more variance in, e.g., DV1 than in DV2. This seems to be what you're looking for. – Ian_Fin Sep 15 '16 at 11:13
-
$R^2$ strikes me as potentially being anywhere on the spectrum from wrong to misleading to confusing to OK. To determine whether it's appropriate, information about what you mean by "how well" is needed. Can you quantify the consequences of making prediction errors, as a function of the sizes of the errors themselves? – whuber Sep 15 '16 at 15:01
-
@whuber I would already be happy with something like "within 5% of the variable's mean"; however, MAPE doesn't work as a metric because of 0 values in the data. I have a feeling for how much error is acceptable, but I cannot quantify it. Which issues do you see with $R^2$? – languitar Sep 16 '16 at 11:00
-
Some concerns are explained at http://stats.stackexchange.com/a/13317/919. – whuber Sep 16 '16 at 13:13
1 Answer
You could use $R^2$ or, if the models have different numbers of independent variables, one of its variants such as adjusted (shrunken) $R^2$, or a criterion such as AIC.
Or you could rescale the variables to some constant mean or range.
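
As a rough illustration of both suggestions, here is a minimal sketch in Python using NumPy only. The data are synthetic and the names (`r_squared`, `nrmse`, `fit_predict`, `dv_small_units`, `dv_large_units`) are made up for this example; the fit is plain least squares with an intercept, which is an assumption about the regression method. It computes $R^2$ and an RMSE divided by the standard deviation of the observed values, both of which are unit-free and so can be compared across dependent variables.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def nrmse(y_true, y_pred):
    """RMSE divided by the (population) standard deviation of y_true."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / y_true.std()


def fit_predict(X, y):
    """Least-squares fit with an intercept; returns in-sample predictions."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef


# Synthetic example: same predictors, two targets on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
signal = X @ np.array([1.0, -2.0, 0.5])
targets = {
    "dv_small_units": signal + rng.normal(scale=0.5, size=200),
    "dv_large_units": 1000.0 * signal + rng.normal(scale=3000.0, size=200),
}

scores = {}
for name, y in targets.items():
    y_hat = fit_predict(X, y)
    scores[name] = {"r2": r_squared(y, y_hat), "nrmse": nrmse(y, y_hat)}
    print(f"{name}: R^2 = {scores[name]['r2']:.3f}, "
          f"NRMSE = {scores[name]['nrmse']:.3f}")
```

With this particular normalization the two numbers carry the same information, since NRMSE equals $\sqrt{1 - R^2}$; dividing by the mean or the range instead would give a different, but still unit-free, measure. The scores here are computed in-sample, so for an actual comparison it would be safer to compute them on held-out data or via cross-validation.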

Peter Flom
-
Thanks. The models all have the exact same set of independent variables and even values as I am trying to find out what can best be predicted from these variables. – languitar Sep 15 '16 at 11:08
-
Then $R^2$ should be fine. Another option is MAD, although you would have to scale it. However, since least-squares regression minimizes squared errors, you should probably stick with a squared-error measure such as $R^2$ (unless you change the regression method). – Peter Flom Sep 15 '16 at 11:12
-
Doesn't MAD scale with the different ranges of the dependent variables? – languitar Sep 15 '16 at 11:17