
My true output is a ratio between 0 and 1, and I am trying to predict it using regression. I am supposed to find the area under the ROC curve between the predicted and true values, but I am not aware of how ROC curves are calculated for continuous variables. Any suggestions are welcome.

nth-attempt
  • What complicates the problem for continuous variables? – Michael R. Chernick Jul 07 '17 at 03:09
  • As Andreas Mueller said, and I quote: "roc_auc is a classification or ranking metric, not a regression metric. So it doesn't accept continuous y" - amueller [here](https://github.com/scikit-learn/scikit-learn/issues/6592) and [here](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html#sklearn.metrics.roc_auc_score) – nth-attempt Jul 07 '17 at 03:13
  • You didn't make it clear what variable(s) are used in the regression to predict the ratio. The receiver operating characteristic curve shows how the probability of correct selection (in classification) varies as a threshold is varied. In regression you can predict the accuracy of the prediction. You could pick a threshold for the accuracy and declare the result to be successful if you achieve that accuracy. I suppose you could construct an ROC curve by varying the threshold. – Michael R. Chernick Jul 07 '17 at 03:32
  • I have both categorical and quantitative variables that are being used in regression. What do you mean by threshold? Should I threshold both the true output and the predicted output and vary that threshold to get an ROC curve? – nth-attempt Jul 07 '17 at 03:44
  • No I am talking about the absolute difference between predicted and actual. – Michael R. Chernick Jul 07 '17 at 03:50
  • I am totally lost. What is the threshold then? – nth-attempt Jul 07 '17 at 03:57
  • You set the threshold to c, then compute |actual - predicted| and compare it to c. It is a success if the difference is less than c. That gives you a point on the ROC. As you vary c from 0 to infinity, the probability of success will increase from 0 to 1. – Michael R. Chernick Jul 07 '17 at 04:02
  • I got it. Thanks. But I really don't know how to interpret the ROC curve in this case. Maybe it is not a good idea to use ROC for regression. – nth-attempt Jul 07 '17 at 04:21
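
The thresholding procedure described in the comments can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `success_rate_curve` and `area_under_curve`, the sample data, and the threshold grid are all made up for the example, and the trapezoidal area is one plausible way to summarize the curve, not something the thread specifies. Because the true values lie between 0 and 1, a grid of thresholds from 0 to 1 is assumed to be enough when the predictions also stay in that range.

```python
def success_rate_curve(actual, predicted, thresholds):
    """For each threshold c, return the fraction of cases with |actual - predicted| < c."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    n = len(errors)
    return [sum(e < c for e in errors) / n for c in thresholds]


def area_under_curve(xs, ys):
    """Trapezoidal area under the points (xs, ys); xs must be in increasing order."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )


# Hypothetical data: true ratios and regression predictions, both in [0, 1].
actual = [0.10, 0.40, 0.35, 0.80, 0.65]
predicted = [0.15, 0.30, 0.40, 0.70, 0.90]

# Assumed threshold grid: c = 0.00, 0.05, ..., 1.00.
thresholds = [i / 20 for i in range(21)]

rates = success_rate_curve(actual, predicted, thresholds)
area = area_under_curve(thresholds, rates)

for c, r in zip(thresholds, rates):
    print(f"c = {c:.2f}  success rate = {r:.2f}")
print(f"area under the curve = {area:.3f}")
```

Note that this curve plots the success rate against the error tolerance c, not a true positive rate against a false positive rate, so it is not what `roc_auc_score` computes and its area should not be read as a classification AUC.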
