Loss function for discretized regression ("classification")?

Question

I have a particular classification problem that I think should be solved through regression with discretization of final values.

I have a dataset for predicting monster level, which takes values in the set {1/8, 1/4, 1/2, 1, 2, 3, ..., 30}. The set of values is ordered and finite (discrete), so this is a classification problem: given data for a new monster, I want to assign it a single class (power level). But since the values are ordered, I think I should treat this as a regression problem: predict a real value for monster power, then round it to the nearest value in the set of monster levels, which discretizes the output and turns it into a classification.
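To make the rounding step concrete, here is a minimal sketch of snapping a real-valued prediction to the nearest allowed level. The names `LEVELS` and `snap_to_level` are made up for illustration, and breaking ties toward the lower level is an assumption, not something the problem dictates.

```python
# The allowed monster levels: 1/8, 1/4, 1/2, then 1 through 30.
LEVELS = [0.125, 0.25, 0.5] + [float(k) for k in range(1, 31)]


def snap_to_level(prediction):
    """Return the allowed level closest to a real-valued prediction.

    min() keeps the first minimum it finds, so an exact tie
    (e.g. 2.5, halfway between 2 and 3) resolves to the lower level.
    """
    return min(LEVELS, key=lambda level: abs(level - prediction))


# Example: raw regression outputs mapped onto the level set.
raw_outputs = [0.3, 2.4, 17.8, 31.7]
snapped = [snap_to_level(value) for value in raw_outputs]
```

Note that any output outside the range of the set is clamped to its nearest end (31.7 becomes 30 here), so this step alone cannot produce a level that is not in the list.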

Also, at prediction time, values of classes unseen during training (e.g. monsters more powerful than any in the training set) may be encountered, so while the class set is finite, it is not exhaustive.

What loss function should I use? I know functions for traditional regression like MSE, MAE or Huber loss, but I haven't seen any loss functions for discretized regression, or for classification with ordered classes.

I think you're correct that it makes sense to think of CR categories as ordinal. In that case, the key term of art is "ordinal logistic regression," which is used for ordered categories like "poor, fair, good." Clearly we can't do arithmetic on monster difficulties, but we can sort them. — Sycorax, Sep 25 '20 at 18:25
Does this answer your question? [Evaluating multiclass imbalanced problem per class](https://stats.stackexchange.com/questions/487444/evaluating-multiclass-imbalanced-problem-per-class) — Stephan Kolassa, Sep 25 '20 at 19:15
@Sycorax thanks, I'll look into it! Stephan Kolassa - no, as far as I can see it has nothing to do with my question. — qalis, Sep 25 '20 at 19:20

Sycorax · Accepted Answer · 2020-09-28T20:12:09.300

I think you're correct that it makes sense to think of CR categories as ordinal. In that case, the term of art you seek is "ordinal logistic regression," which is used for categories that can be ordered like "poor, fair, good." (By contrast, a category that isn't inherently ordered is a person's college major: History isn't "more of a BA" than Economics or any other discipline.)

We can't do arithmetic on monster difficulties, but we can sort them. What I mean by "arithmetic" is that adding up CRs produces a different encounter difficulty than if we follow the guidelines and calculations in the Dungeon Master's Guide; eight monsters with CR $\frac{1}{8}$ have a different difficulty than a single CR 1 monster. Because this part of the answer depends on knowledge of D&D 5e, as opposed to statistics, I won't elaborate on it further.
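As a rough illustration of the loss that ordinal logistic regression minimizes, here is a minimal sketch of the negative log-likelihood under the proportional-odds (cumulative logit) formulation, which is one common form of the model. It assumes a scalar score for each monster (for example a linear function of its features) and a sorted list of cutpoints separating adjacent categories; the function names, the cutpoint values, and the clipping constant are all made up for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ordinal_class_probs(score, cutpoints):
    """Class probabilities for K ordered classes from K-1 sorted cutpoints.

    Assumed model: P(y <= k) = sigmoid(cutpoints[k] - score), and
    P(y = k) is the difference of consecutive cumulative probabilities.
    """
    cumulative = [sigmoid(c - score) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


def ordinal_nll(score, cutpoints, label):
    """Negative log-likelihood of the true class index `label`."""
    p = ordinal_class_probs(score, cutpoints)[label]
    return -math.log(max(p, 1e-12))  # clip to avoid log(0)


# Example: four ordered classes, so three (hypothetical) cutpoints.
cutpoints = [-1.0, 0.0, 1.0]
probs = ordinal_class_probs(0.0, cutpoints)
```

With a high score, the loss grows the further the true class sits below the top category, which reflects the ordering without doing arithmetic on the category values themselves.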
