
Due to my background as a pure biologist, I've been struggling with a comment I received from a reviewer about the accuracy assessment used in my regression study. While I use MSE, MAE, and $R^2$ as the metrics to assess the accuracy of my regression models (Support Vector Regression and Simple Linear Regression), one reviewer asks me to report the F1 score, PR, or ROC curve for the data.

The reviewer has noted that "F1 score, PR or ROC curve are not specific to classification models only." With my limited knowledge, I cannot find any evidence of such metrics being applied in a regression study.

I would be grateful if anyone could point me to a source for such an application. R or Python packages for applying such tests to a regression study would also be much appreciated.
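
For clarity, here is a minimal sketch of the kind of evaluation I currently perform (toy data only; Python with scikit-learn):

```python
# Minimal sketch: fit SVR and simple linear regression on toy data,
# then compute MSE, MAE and R^2 on a held-out test set.
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # toy features
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("SVR", SVR()), ("Linear", LinearRegression())]:
    pred = model.fit(X_train, y_train).predict(X_test)
    print(name,
          "MSE:", mean_squared_error(y_test, pred),
          "MAE:", mean_absolute_error(y_test, pred),
          "R2:", r2_score(y_test, pred))
```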


Tofu King
  • I assume that the reviewer must think you're performing a *logistic* regression. If you include the term "logistic" in your Google search, you'll find more relevant references. – JimB Nov 18 '20 at 04:00
  • Thank you very much for your response!! The fact is that I've performed Support Vector Regression and Simple Linear Regression, and I do not have any clue about such parameters in these models. I will improve my question accordingly. – Tofu King Nov 18 '20 at 04:05
  • I hadn’t considered that, but @JimB raises a reasonable point. Are you doing a logistic regression? Even the Wikipedia article on ROC gives some references to extensions to continuous responses, so it isn’t that obscure, but it would make more sense to ask for ROC if the regression is a logistic regression. – Dave Nov 18 '20 at 04:05
  • I'm so sorry for my unclear question. I've just edited the question to include the regression models I'm considering. – Tofu King Nov 18 '20 at 04:10
  • And they confused your SVR with SVM classification... – Calimo Nov 18 '20 at 07:10
  • Maybe the reviewer expects you to bin predictions and true outputs, thus producing "classes", and then apply classification metrics to these. Perhaps they want interpretable results like "90% of observations with values between 0 and 1 were predicted correctly" (see the sketch after these comments). – Euphe Nov 18 '20 at 07:42
  • I agree with this idea. However, it would be very nice if the reviewer provided a clue on how to define each class (like low, medium, and high error classes). – Tofu King Nov 18 '20 at 08:48
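
A minimal sketch of the binning idea from the comments above; the bin edges here are purely illustrative and would need domain justification:

```python
# Sketch: discretize continuous targets and predictions into bins,
# then apply ordinary classification metrics to the resulting classes.
import numpy as np
from sklearn.metrics import f1_score, classification_report

y_true = np.array([0.2, 0.7, 1.4, 2.9, 0.9, 2.1])  # toy continuous values
y_pred = np.array([0.3, 0.5, 1.8, 2.5, 1.2, 1.9])

bins = [0.0, 1.0, 2.0]                             # "low" / "medium" / "high"
true_cls = np.digitize(y_true, bins)
pred_cls = np.digitize(y_pred, bins)

print("macro F1:", f1_score(true_cls, pred_cls, average="macro"))
print(classification_report(true_cls, pred_cls))
```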

1 Answer


F1 score, PR or ROC curve are not specific to classification models only.

I have never seen the F1 score or ROC used to evaluate a numerical prediction. I am unfamiliar with "PR".

The definition of the F1 score crucially relies on precision and recall, or positive/negative predictive value, and I do not see how it can reasonably be generalized to a numerical forecast.
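
For reference, the usual definition is

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}},$$

and both precision and recall presuppose a binary notion of a "positive" case.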

The ROC curve plots the true positive rate against the false positive rate as a threshold varies. Again, it relies on a notion of "true positive" and "false positive", and I don't see how these can be applied to numerical predictions.
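
Concretely, in terms of the confusion-matrix counts TP, FP, TN, FN:

$$\text{TPR} = \frac{TP}{TP + FN}, \qquad \text{FPR} = \frac{FP}{FP + TN}.$$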

All that is not to say that no efforts have been made to apply these concepts to numerical forecasts.
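
For instance, one such effort (and perhaps what the reviewer has in mind) is to dichotomize the continuous response at some cutoff and then draw an ordinary ROC curve against the continuous predictions. A minimal sketch in Python with scikit-learn; the cutoff of 1.0 is entirely arbitrary:

```python
# Sketch: dichotomize a continuous outcome at an (arbitrary) cutoff,
# then compute an ordinary ROC curve against the continuous predictions.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0.2, 0.7, 1.4, 2.9, 0.9, 2.1])  # toy continuous outcomes
y_pred = np.array([0.3, 0.5, 1.8, 2.5, 1.2, 1.9])  # continuous predictions

y_bin = (y_true > 1.0).astype(int)                 # arbitrary cutoff
fpr, tpr, thresholds = roc_curve(y_bin, y_pred)
print("AUC:", roc_auc_score(y_bin, y_pred))
```

Note that the resulting curve depends entirely on the chosen cutoff, which illustrates the square-peg problem below.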

It would feel a lot like hammering square pegs into round holes to me, though. I would say that there is a reason why I (we?) haven't seen this a lot: it's unintuitive, and it does not provide the information that standard error measures like the MAE or the MSE do. Honestly, if I got a paper for review that used F1/ROC to evaluate numerical predictions, I would recommend that they throw these out and use more standard error measures.

My recommendation: ask the editor to communicate to the reviewer that you need more information on applying F1 and ROC in your case. Maybe the reviewer can provide a reference or two? You may want to provide a link to this CV thread as an indication that you did do your homework and asked statistical experts (cough), and that the experts were similarly bewildered.

The best possible outcome would be if your reviewer posted their thoughts here.

Stephan Kolassa
  • Just a guess, but "PR" could simply be short for precision & recall. – nope Nov 18 '20 at 06:43
  • I politely asked the reviewer for something like a 'provision of evidence'. It seemed to make the reviewer really angry, and the reviewer did not provide any source either. Sadly, I've already received a rejection with this in mind. After reading your answer, I would like to thank you for making me feel normal again. – Tofu King Nov 18 '20 at 07:03
  • @TofuKing you need to be careful with the wording. Don't "imply" and don't invoke principles such as "provision of evidence"; that's not helpful and likely to annoy the reviewer. Show that you've done your homework, state the facts ("We know ROC/PR/F1 are defined on binary classification tasks and have been extended to multiclass classification, but we are not aware of their application in regression") and ask the reviewer for clarification ("Could the reviewer please point us to the literature they have in mind?"). – Calimo Nov 18 '20 at 07:17
  • @TofuKing I am with Calimo on being polite in your skepticism of the reviewer. We have an Academia Stack Exchange if you want to post there about how to handle a reviewer who appears to be dead wrong. I do wonder if it would help your paper to point out that support vector regression is different from the support vector machine classifier, hence your choice of performance metrics. The reviewer might be wrong, and even vindictive about having that pointed out, but if a professional is confused, some of that is on you to clarify in the article. – Dave Nov 18 '20 at 11:07
  • That said, good for you for having the courage to (politely!) tell the reviewer, "I think you're wrong." – Dave Nov 18 '20 at 11:49
  • @Dave Thank you very much for your help!! Anyway, I've just submitted my work to a new journal. In case of a similar problem (I really hope not!!), your kind suggestions would be extremely appreciated. – Tofu King Nov 19 '20 at 11:06