Apologies in advance: my background is in CS rather than statistics, so I may use some terms improperly.
I have a data set of a few thousand items scored on a 3-point scale, which my research suggests may be a Likert scale. Each rating is one of "Not Recommended" / "Recommended" / "Best". The items have not all been rated the same number of times: some have only dozens of ratings, while others are in the high hundreds.
What method(s) can I use to rank all the items in my data set? By rank, I mean in the sense of having a #1, #2, #3, etc. I believe this would be an ordinal ranking? I have no preference for whether ties are allowed. I've found the Mann-Whitney and Wilcoxon tests but I'm not sure if they're relevant or applicable.
Is there a "right" way to blend these scores into a single (weighted?) number? What kind of penalty should I apply to items with a relatively small number of ratings?
My first thought is to apply the formula [(W1*N1) + (W2*N2) - (W3*N3)] / (N1 + N2 + N3), where Wx is the label weight and Nx is the number of times that item has received that label. Then I kicked myself for playing armchair statistician. I'm assuming this is a common problem (thumbs up/thumbs down, five-star product ratings, etc.) with a common method. Unfortunately, I don't know enough about the problem domain to find the answer, hence my post here.
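To make the formula above concrete, here is a minimal Python sketch of what I have in mind. The weights, item names, and counts are all made up for illustration, and the function names (`blended_score`, `rank_items`) are just placeholders. The counts are given in the same order as N1, N2, N3 in the formula, and the minus sign on the third term is kept exactly as written.

```python
def blended_score(counts, weights):
    """Score one item as (W1*N1 + W2*N2 - W3*N3) / (N1 + N2 + N3)."""
    n1, n2, n3 = counts
    w1, w2, w3 = weights
    total = n1 + n2 + n3
    if total == 0:
        return None  # no ratings yet, so the score is undefined
    return (w1 * n1 + w2 * n2 - w3 * n3) / total


def rank_items(items, weights):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, blended_score(counts, weights)) for name, counts in items.items()]
    scored = [(name, score) for name, score in scored if score is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights (W1, W2, W3); picking these is part of my question.
WEIGHTS = (1.0, 2.0, 1.0)

# Hypothetical rating counts per item, as (N1, N2, N3).
items = {
    "item_a": (10, 5, 2),
    "item_b": (300, 200, 400),
    "item_c": (0, 12, 0),
}

ranking = rank_items(items, WEIGHTS)
for position, (name, score) in enumerate(ranking, start=1):
    print(f"#{position} {name}: {score:.3f}")
```

With these made-up numbers, item_c comes out at #1 on the strength of only 12 ratings, ahead of items with far more ratings. That is exactly the small-sample problem I don't know how to penalize properly.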