
NOT A DUPLICATE. These questions are related, but mine asks about a specific application of classifiers: flagging Stack Exchange posts. I want to know which of the 2 methods is more effective for THIS TYPE OF JOB. This question also differs because the true/false classifiers are run in a specific order (highest priority to lowest priority). UNMARK AS DUPLICATE!

I want to write a program that can automatically review the Stack Overflow Triage queue. It will classify each post as "Looks OK," "Requires Editing," or "Unsalvageable." If a post is unsalvageable, the program will then assign it a specific close reason.

I've come up with 2 methods to do this:

  1. Use multilabel classification to classify the text into one of the closure categories. (No category has priority over another)

  2. Go through the close reasons individually, in a hardcoded order, and decide if it belongs in the category or not. If it belongs, it flags the question. If it doesn't, it moves to the next close reason and repeats. If it hasn't been identified by the end, the question is determined to be good-quality. (The order in which the program checks the classes is important)

I don't know which way is more effective. I thought of the 2nd way because a question that is both off-topic and offensive should be flagged only as offensive. Therefore, "offensive" has a higher priority than "off-topic."
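To make the difference concrete, here is a minimal sketch of the two decision rules, assuming some classifier has already produced a score between 0 and 1 for each category. The function names, the 0.5 threshold, and the example scores are all made up for illustration; they are not from any real model or library.

```python
from typing import Dict, List, Optional

# Hypothetical per-category scores in [0, 1], e.g. from any probabilistic classifier.
Scores = Dict[str, float]


def flag_by_top_score(scores: Scores, threshold: float = 0.5) -> Optional[str]:
    """Method 1 (no priorities): pick the highest-scoring category if it clears the threshold."""
    label, best = max(scores.items(), key=lambda kv: kv[1])
    return label if best >= threshold else None


def flag_by_priority(scores: Scores, order: List[str], threshold: float = 0.5) -> Optional[str]:
    """Method 2 (priorities): walk the categories in order and stop at the first match."""
    for label in order:
        if scores.get(label, 0.0) >= threshold:
            return label
    return None


# Illustrative scores for a post that is both off-topic and offensive.
post_scores = {"off-topic": 0.9, "offensive": 0.7, "spam": 0.1}
order = ["offensive", "spam", "off-topic"]

print(flag_by_top_score(post_scores))        # off-topic
print(flag_by_priority(post_scores, order))  # offensive
```

With these made-up scores the two rules disagree: the unordered rule reports the strongest signal, while the ordered rule reports the most severe one that clears the threshold.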

Priority order:

  1. Offensive?
  2. Spam?
  3. Very Low Quality?
  4. Blatantly Off-Topic?
  5. Unclear?
  6. Recommendation?
  7. Super User/Server Fault?
  8. Opinionated?
  9. Too Broad?
  10. Vague Debug?
  11. No Repro/Typo?
  12. Requires Editing/Looks OK
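The ordered check could be sketched roughly as below, assuming each close reason has its own true/false classifier. The keyword rules are toy stand-ins for trained models, only three of the categories are shown, and `contains_any`, `CASCADE`, and `review` are hypothetical names.

```python
from typing import Callable, List, Tuple

# Each check is a hypothetical binary classifier: text -> True/False.
Check = Callable[[str], bool]


def contains_any(words: List[str]) -> Check:
    """Toy keyword rule standing in for a trained true/false classifier."""
    return lambda text: any(w in text.lower() for w in words)


# Priority order taken from the list above, abbreviated to three entries.
# The keywords are placeholders, not real detection rules.
CASCADE: List[Tuple[str, Check]] = [
    ("Offensive", contains_any(["idiot"])),
    ("Spam", contains_any(["buy now", "discount"])),
    ("Recommendation", contains_any(["best library", "recommend a"])),
]


def review(text: str,
           cascade: List[Tuple[str, Check]] = CASCADE,
           default: str = "Requires Editing/Looks OK") -> str:
    """Run the checks in priority order; the first one that fires decides the flag."""
    for label, check in cascade:
        if check(text):
            return label
    # No close reason matched, so the post falls through to the last step.
    return default


print(review("Buy now, you idiot"))           # Offensive (checked before Spam)
print(review("Can you recommend a tool?"))    # Recommendation
print(review("Why does my loop never end?"))  # Requires Editing/Looks OK
```

Because the loop returns on the first match, later classifiers never run for a post that an earlier one has already flagged, which is what makes the order matter.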

Which of the 2 algorithms gets the job done better?

clickbait
  • 2
    We don't know, for the same reason we don't know whether *any* particular predictive algorithm A is better than B for a use case X. Implement both, test them on actual data, using whatever KPI operationalizes "better" for you ([not accuracy](https://stats.stackexchange.com/q/312780/1352)), and tell us which algorithm performed better. – Stephan Kolassa Jun 29 '18 at 07:54
  • 2
    I voted to close, but certainly not as a duplicate, which I agree it isn't. I believe I put "unclear what you are asking". As per my comment, you are essentially asking us to answer the very question a statistical analysis would be needed to answer. We are not clairvoyants. – Stephan Kolassa Jun 29 '18 at 18:25
  • 2
    Sorry, I missed the label priority: it's indeed not a duplicate. I voted to reopen. I agree the question wording is specific to a narrow domain, but it could be rephrased to be more general in order to attract answers pointing to 1) other possible methods to approach this task and 2) papers that look at the same problem setting, to see what kind of results they got (results are indeed domain dependent but could help guess which approach may work on average). – Franck Dernoncourt Jul 05 '18 at 18:19
