Comparing regression quality for different dependent variables

Question

I have a problem where I predict several different dependent variables from the same data set. Now I want to evaluate how well each of the dependent variables can be predicted using a regression method. The dependent variables have different scales and units, so a scale-dependent measure such as mean squared error won't help here. What is a common approach for such a problem?

What types of variable are the dependent variables? $R^2$ jumps out at me as being ideal, but it's jumping out so strongly that I can't help but think that I must be missing something! — Ian_Fin, Sep 15 '16 at 11:02
@Ian_Fin all are 1D variables of different scales. What else do you need to know? — languitar, Sep 15 '16 at 11:05
@Ian_Fin I was also thinking about $R^2$ but couldn't find any explanation that clearly states that this is the right tool for the job. — languitar, Sep 15 '16 at 11:06
Are they continuous or discrete? Effectively what I'm trying to get at is whether you are using linear regression (where $R^2$ is clearly defined) or some other form of regression (where you may only get "pseudo" $R^2$s) — Ian_Fin, Sep 15 '16 at 11:07
@Ian_Fin They are (apart from discretization artifacts) continuous variables, all >= 0. — languitar, Sep 15 '16 at 11:10
$R^2$ sounds ideal then. You could effectively say that your set of IVs account for more variance in, e.g., DV1 than in DV2. This seems to be what you're looking for. — Ian_Fin, Sep 15 '16 at 11:13
$R^2$ strikes me as potentially being anywhere on the spectrum from wrong to misleading to confusing to OK. To determine whether it's appropriate, information about what you mean by "how well" is needed. Can you quantify the consequences of making prediction errors, as a function of the sizes of the errors themselves? — whuber, Sep 15 '16 at 15:01
@whuber I would already be happy with something like "5% from the average mean of the variable", however, MAPE as a metric doesn't work due to 0 values in the data. I have a feeling about which amount of error is acceptable, but I cannot quantify it. Which issues do you see with $R^2$? — languitar, Sep 16 '16 at 11:00
Some concerns are explained at http://stats.stackexchange.com/a/13317/919. — whuber, Sep 16 '16 at 13:13

score 1 · Accepted Answer · answered Sep 15 '16 at 11:07

You could use $R^2$ or, if the models have different numbers of independent variables, one of its variants such as shrunken (adjusted) $R^2$, or AIC.

Or you could rescale the variables to some constant mean or range.
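As a rough illustration of both suggestions, here is a minimal sketch in plain Python. It assumes a single predictor and ordinary least squares, and the data and helper names (`fit_line`, `r_squared`, `nrmse`, `evaluate`) are made up for the example. It fits the same predictor to three hypothetical dependent variables and reports in-sample $R^2$ alongside MSE and a range-normalized RMSE, which is one possible reading of "rescale to some constant range".

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def mse(y, y_hat):
    """Mean squared error; carries the squared units of y."""
    return mean((yi - fi) ** 2 for yi, fi in zip(y, y_hat))


def r_squared(y, y_hat):
    """1 - SS_res / SS_tot; unit-free, so it can be compared across DVs."""
    my = mean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def nrmse(y, y_hat):
    """RMSE divided by the observed range of y (assumes max(y) > min(y))."""
    return sqrt(mse(y, y_hat)) / (max(y) - min(y))


def evaluate(x, y):
    """Fit y on x and return in-sample quality measures."""
    intercept, slope = fit_line(x, y)
    y_hat = [intercept + slope * xi for xi in x]
    return {"r2": r_squared(y, y_hat), "mse": mse(y, y_hat), "nrmse": nrmse(y, y_hat)}


# Hypothetical data: one predictor, three dependent variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
dv1 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]  # close to linear in x
dv2 = [v * 1000 for v in dv1]           # same signal, different unit
dv3 = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0]    # only weakly related to x

results = {name: evaluate(x, dv) for name, dv in
           [("dv1", dv1), ("dv2", dv2), ("dv3", dv3)]}

for name, res in results.items():
    print(name, res)
```

In this toy data `dv1` and `dv2` get the same $R^2$ and the same normalized RMSE even though their MSEs differ by a factor of a million, while `dv3` scores far lower. These are in-sample figures; for a statement about predictability, the same measures would normally be computed on held-out data.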

Peter Flom

Thanks. The models all have the exact same set of independent variables and even values as I am trying to find out what can best be predicted from these variables. – languitar Sep 15 '16 at 11:08
Then $R^2$ should be fine. Another choice is MAD (though you would have to scale it), but since regression minimizes squared errors you should probably stick with a squared-error measure (unless you change the method of regression). – Peter Flom Sep 15 '16 at 11:12
Doesn't MAD scale with the different ranges of the dependent variables? – languitar Sep 15 '16 at 11:17
Ooops, yes, you are right. – Peter Flom Sep 16 '16 at 11:32
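
The point about MAD can be checked with a small sketch. It assumes that "MAD" here means the mean absolute error of the predictions, and the data and function names are hypothetical. The raw value changes with the unit of the dependent variable, whereas dividing by the mean of the dependent variable gives a unit-free number that still works when some observations are 0 (where MAPE is undefined).

```python
from statistics import mean


def mean_abs_error(y, y_hat):
    """Mean absolute error; expressed in the units of y."""
    return mean(abs(yi - fi) for yi, fi in zip(y, y_hat))


def scaled_mae(y, y_hat):
    """MAE divided by the mean of y (assumes mean(y) > 0)."""
    return mean_abs_error(y, y_hat) / mean(y)


# Hypothetical observations and predictions, in metres.
y_m = [0.0, 1.0, 2.0, 3.0]
pred_m = [0.5, 1.0, 1.5, 3.0]

# The same quantities expressed in millimetres.
y_mm = [v * 1000 for v in y_m]
pred_mm = [v * 1000 for v in pred_m]

mae_m = mean_abs_error(y_m, pred_m)
mae_mm = mean_abs_error(y_mm, pred_mm)
rel_m = scaled_mae(y_m, pred_m)
rel_mm = scaled_mae(y_mm, pred_mm)

print(mae_m, mae_mm)   # raw MAE differs by the unit factor
print(rel_m, rel_mm)   # scaled MAE is the same in both units
```

Dividing by the mean is only one possible scaling; the range or the standard deviation of the dependent variable would serve the same purpose.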
