I am using different regression models to predict an ordered categorical variable from a metric variable. For example, I want to regress Happiness (rated on a 1-5 scale) on Money (a metric variable) using ordered probit regression:

$Happiness \sim \log(Dollars)$
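
For reference, one common way to write the ordered probit model behind this formula is sketched below. The symbols $\tau_k$ (cutpoints), $\beta$ (slope) and $\Phi$ (standard normal CDF) are notation assumed here for illustration, not taken from any fitted model:

$$P(Happiness \le k \mid Dollars) = \Phi\big(\tau_k - \beta \log(Dollars)\big), \quad k = 1, \dots, 4, \qquad \tau_1 < \tau_2 < \tau_3 < \tau_4$$

Under this parameterization the model returns a full probability distribution over the five ratings for each case, obtained by differencing consecutive cumulative probabilities.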

But I came across a fundamental question: How can I compare the accuracy of different models in predicting ordered categorical data?

If I fit an ordered probit regression and cross-validate its prediction accuracy, using 80% of the data for training and 20% for validation, I can compare the probability distribution of the real data with the one the probit model provides. But this comparison ignores the order: for example, if the model predicts every rating $Y$ as the opposite rating $6-Y$ (exactly opposite), the result doesn't change!

So, what should I do? I found some visual correlation measures, but nothing quantitative.
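
To make the concern concrete, here is a minimal sketch of the difference between order-blind and order-aware measures. It assumes that "order is ignored" refers to measures that only check whether the predicted category matches the true one. The function names, the toy ratings and the two toy forecasts are all made up for illustration, and the mean absolute error and ranked probability score are shown only as examples of measures that respond to order, not as a recommended answer.

```python
from itertools import accumulate


def exact_accuracy(y_true, y_pred):
    """Share of exact matches; a miss by 1 counts the same as a miss by 4."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average distance between true and predicted rating, in rating steps."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def ranked_probability_score(y_true, probs):
    """Mean squared distance between predicted and observed cumulative
    distributions. probs[i][k] is the predicted probability of rating k + 1
    for case i. Lower is better."""
    total = 0.0
    for t, p in zip(y_true, probs):
        cum_pred = list(accumulate(p))
        # Observed cumulative distribution: a step from 0 to 1 at the true rating.
        cum_obs = [1.0 if k + 1 >= t else 0.0 for k in range(len(p))]
        total += sum((cp - co) ** 2 for cp, co in zip(cum_pred, cum_obs))
    return total / len(y_true)


# Hypothetical held-out ratings and two sets of hard predictions.
y_true = [1, 2, 4, 5, 1, 5]
near = [2, 3, 5, 4, 2, 4]        # always off by exactly one step
far = [6 - y for y in y_true]    # always the opposite rating

acc_near = exact_accuracy(y_true, near)
acc_far = exact_accuracy(y_true, far)
mae_near = mean_absolute_error(y_true, near)
mae_far = mean_absolute_error(y_true, far)

# Two hypothetical probabilistic forecasts for one case whose true rating is 5.
# Both give the true rating probability 0.2, so a measure that looks only at
# the probability of the true category cannot tell them apart.
forecast_a = [0.0, 0.0, 0.0, 0.8, 0.2]  # most mass next to the truth
forecast_b = [0.8, 0.0, 0.0, 0.0, 0.2]  # most mass at the opposite end
rps_a = ranked_probability_score([5], [forecast_a])
rps_b = ranked_probability_score([5], [forecast_b])

print("accuracy:", acc_near, acc_far)
print("mean absolute error:", mae_near, mae_far)
print("ranked probability score:", rps_a, rps_b)
```

In this toy setup the exact-match accuracy is identical for the near and the far predictions, while the two order-aware measures penalize the far predictions more heavily.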


Figure 1. Fictitious report for a probit regression, based on this logit regression

mdewey
Ho1
  • See http://stats.stackexchange.com/questions/174174/how-to-predict-using-ordered-probit-regression-and-calculate-prediction-accuracy – Frank Harrell Sep 26 '15 at 13:17
  • When you say "I can compare the probability distribution of real data with what probit provided, but in this way, the order is ignored" ... what are you doing to compare them? – Glen_b Sep 26 '15 at 23:45

0 Answers