
I am testing a tool that tries to select the correct outcome. I want to use a significance test to check whether the tool is better than choosing the outcome at random.

It picks from 4 categories, and for each trial I have the correct category and the one the tool picked.

What test should I use?

Adam

1 Answer


Consider that you're interested in whether the proportion of times correct is greater than the proportion you'd reasonably get under random guessing (presumably with equal probability on each outcome).

With 4 outcomes, the chance of a correct pick by random guessing would therefore be $\frac14$. Assuming independence of trials, the number of correct guesses under the null hypothesis would be $\text{binomial}(n, \frac14)$, where $n$ is the number of trials (attempts at guessing). This leads to a binomial test (see the example at the link involving testing whether a die rolls too many 6's, which is at heart the same problem as yours with a different number of outcomes).
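As a rough illustration of the binomial test described above, here is a minimal Python sketch that computes the exact upper-tail p-value $P(X \ge k)$ for $X \sim \text{binomial}(n, \frac14)$ using only the standard library. The function name `binomial_upper_tail_p` and the counts (40 trials, 17 correct) are hypothetical and chosen purely for demonstration; substitute your own tallies.

```python
from math import comb


def binomial_upper_tail_p(k, n, p0=0.25):
    """Exact one-sided p-value P(X >= k) for X ~ Binomial(n, p0).

    k  : observed number of correct picks
    n  : number of trials
    p0 : chance of a correct pick under random guessing (1/4 for 4 categories)
    """
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Hypothetical data, for illustration only: 17 correct picks out of 40 trials.
n_trials = 40
n_correct = 17

p_value = binomial_upper_tail_p(n_correct, n_trials)
print(f"one-sided exact binomial p-value = {p_value:.4f}")
```

A small p-value here would suggest the tool picks the correct category more often than equal-probability guessing would. The sum runs over the upper tail only because the alternative of interest is "better than random".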

If $n$ is large you could use a normal approximation (leading to the typical one-sample proportions test covered in a non-mathematical introductory stats text), but more generally you can base a test directly off the binomial.
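The normal approximation mentioned above can be sketched as follows. This is an assumed, simplified version of the one-sample proportions z-test with no continuity correction; the function name `one_sample_proportion_z` and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_z(k, n, p0=0.25):
    """Return (z, one-sided p-value) for H0: p = p0 against H1: p > p0.

    Uses the normal approximation to the binomial, with the standard
    error computed under the null hypothesis. No continuity correction.
    """
    p_hat = k / n                      # observed proportion correct
    se = sqrt(p0 * (1 - p0) / n)       # standard error under H0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value


# Hypothetical data, for illustration only: 40 correct picks out of 100 trials.
z, p_value = one_sample_proportion_z(40, 100)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.5f}")
```

For small $n$ the approximation can be poor, which is one reason to prefer a test based directly on the binomial.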

From the way the question was phrased (is the tool better than random?), you presumably want a one-tailed test.

[Alternatively, in place of the normal approximation to the binomial, you could perform a chi-squared goodness of fit test (two outcomes, with probabilities 1/4 and 3/4 under the null), but this would prevent doing a one-tailed test.]
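To make the bracketed remark concrete, here is a small sketch of the two-cell chi-squared goodness-of-fit statistic (correct vs. incorrect, with null probabilities 1/4 and 3/4). The function name `chi2_gof_two_cells` and the counts are hypothetical. The p-value uses the identity that, for 1 degree of freedom, $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$.

```python
from math import erfc, sqrt


def chi2_gof_two_cells(k, n, p0=0.25):
    """Chi-squared goodness-of-fit test with two cells (correct / incorrect).

    Returns (statistic, p-value) with 1 degree of freedom.
    """
    observed = (k, n - k)
    expected = (n * p0, n * (1 - p0))
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of chi-squared with 1 df: P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts out of 100 trials: 40 correct (above chance)
# and 10 correct (below chance) are equally far from the expected 25.
stat_high, p_high = chi2_gof_two_cells(40, 100)
stat_low, p_low = chi2_gof_two_cells(10, 100)
print(stat_high, stat_low)  # the two statistics coincide
```

Because the statistic squares the deviations, a tool that does much worse than chance yields the same value as one that does equally much better, which is why this test is inherently two-sided. With two cells the statistic is simply the square of the z statistic from the normal approximation.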

Glen_b
  • It depends on what exactly "random guessing" means. Consider the case where the 4 categories are imbalanced, & you want to include that information. I might see if the OP can form a confusion matrix (cf., [How to calculate information included in R's confusion matrix](https://stats.stackexchange.com/a/253435/7290)). – gung - Reinstate Monica Dec 31 '18 at 03:52
  • I agree with the point (indeed my answer hints at this issue in the first sentence). Nevertheless, I expect (as stated in my answer) that the OP intends the random choice of outcomes to be with equal probability. – Glen_b Dec 31 '18 at 03:54
  • Presumably yes, but it's worth clarifying (IMO). – gung - Reinstate Monica Dec 31 '18 at 04:08