For a research project, I've got a set (N=507) of 20-second VR tracking data clips (6DOF for head and both hands), each from a different participant. My goal is to identify the participant using a small subset of each data stream (say 3-5 seconds of each 20-second clip). Obviously, one of the main influences on the algorithm's accuracy is how many participants it must distinguish between: it's a lot easier to get things right with 3 than with 50 or 500.

The number of participants we have is arbitrary, so I don't feel it makes sense to report accuracy at any particular level. I did notice, though, that the accuracy of my baseline model (kNN on height alone) follows a linear pattern in log-odds vs. log-participants space, sitting at a mostly constant offset above chance. I like this measure of "odds ratio above chance" because it stays relatively consistent as the number of participants varies. On the other hand, it's unnecessary as a measure if there are other ways to express the same fact, especially ones validated by previous literature.
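
To make the measure concrete: if $a$ is the observed accuracy with $k$ participants, chance accuracy is $1/k$, and what I'm plotting is the ratio of the two odds,

$$\mathrm{OR} = \frac{a/(1-a)}{(1/k)\,/\,(1-1/k)} = \frac{a}{1-a}\,(k-1).$$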

[Figure: odds ratio over chance vs. number of participants]

My first question is: are there more standard ways to report some analog to 'accuracy' across a varying number of classes?

If not, does this measure work, and why? [I ran a simulation in which each "participant" had a true mean drawn from a normal distribution and each observation of them added normal noise. As I varied how spread out the participants' means were relative to that noise, the odds-ratio-over-chance changed, but at each spread it stayed mostly consistent across different numbers of participants.]
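
Here's a minimal numpy sketch along the lines of what I ran (the names `between_sd`, `n_train`, and `n_test` and their values are just illustrative, and the classifier is a nearest-centroid stand-in for the height-only kNN):

```python
import numpy as np

# Toy simulation: each "participant" has a true mean drawn from
# N(0, between_sd^2); every observation of them adds N(0, 1) noise.
rng = np.random.default_rng(0)

def odds(p):
    return p / (1 - p)

def odds_ratio_over_chance(n_participants, between_sd, n_train=20, n_test=20):
    means = rng.normal(0.0, between_sd, n_participants)
    train = means[:, None] + rng.normal(0.0, 1.0, (n_participants, n_train))
    test = means[:, None] + rng.normal(0.0, 1.0, (n_participants, n_test))
    # Nearest-centroid classification on the 1-D feature (a stand-in for
    # the height-only kNN baseline).
    centers = train.mean(axis=1)                              # (k,)
    pred = np.abs(test[..., None] - centers).argmin(axis=-1)  # (k, n_test)
    acc = (pred == np.arange(n_participants)[:, None]).mean()
    acc = min(acc, 1 - 1e-9)  # keep the odds finite if accuracy hits 1.0
    return odds(acc) / odds(1.0 / n_participants)

for k in (3, 10, 50, 200):
    print(k, round(odds_ratio_over_chance(k, between_sd=2.0), 1))
```

At a fixed `between_sd` the printed ratios stay in the same ballpark as `k` grows, which is the consistency I'm describing.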

I am a PhD student, but machine learning isn't my primary field, so links to research papers are especially welcome!

Mark Miller
  • As accuracy is not a proper scoring rule, maybe look into [tag:scoring-rules] instead, for instance https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models – kjetil b halvorsen Nov 17 '21 at 03:23
  • @kjetilbhalvorsen This question is now over two and a half years old, and the paper I was working on is now published (https://www.nature.com/articles/s41598-020-74486-y.pdf). I think sticking with accuracy was the better move, as the majority of the paper's audience is not ML-literate and it would not have been a sufficient benefit for the cost of complexity. – Mark Miller Nov 18 '21 at 04:16

0 Answers