Because most examples in your training set are not of the diving action. This leads to class imbalance, and accuracy is a mostly meaningless metric in the presence of significant class imbalance. Here is your confusion matrix:
|                    | Actual P | Actual N | Total |
|--------------------|---------:|---------:|------:|
| Predicted positive |        9 |        4 |    13 |
| Predicted negative |        1 |      109 |   110 |
| Total              |       10 |      113 |   123 |
Note that 113 of your 123 actions are not diving, while only 10 are. Even if I just guessed "not diving" every time, I would be right 113/123 ≈ 92% of the time, so 92% accuracy is the "floor" for any non-trivial model; a model would have to be truly broken to do worse. Sure, your model gets 96% accuracy, but it is really only edging out that constant model by about 4 percentage points. The moral of the story is that accuracy is essentially the wrong metric to look at in the presence of class imbalance: not only will it be inflated, it will be inflated by a different amount depending on prevalence, which makes it very difficult to compare performance across different classes (actions, in your case).
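To make the comparison concrete, here is a minimal sketch that reproduces the arithmetic from the counts in the table above (the variable names are just for illustration):

```python
# Accuracy of your model vs. a constant "not diving" baseline,
# using the counts from the confusion matrix above.
tp, fp = 9, 4      # predicted diving:     9 correct, 4 wrong
fn, tn = 1, 109    # predicted not diving: 1 wrong, 109 correct
total = tp + fp + fn + tn                    # 123 actions

model_accuracy = (tp + tn) / total           # (9 + 109) / 123 ≈ 0.96
baseline_accuracy = (tn + fp) / total        # always guess "not diving": 113 / 123 ≈ 0.92

print(f"model:    {model_accuracy:.3f}")     # ~0.959
print(f"baseline: {baseline_accuracy:.3f}")  # ~0.919
```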
ROC AUC is a widely used metric in these cases, one reason being that it is insensitive to changes in class prevalence. Another metric (less commonly used but closer in spirit to accuracy) is the Matthews correlation coefficient. The $F_\beta$ score (for a suitable choice of $\beta$ that encodes your own precision/recall priorities) is also a good choice. Any of these should be more useful and easier to interpret than accuracy in your case.
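If you are using scikit-learn, all three metrics have ready-made implementations. A quick sketch, assuming you have per-action binary labels (`y_true`), hard predictions (`y_pred`), and predicted scores for the "diving" class (`y_score`); the toy values below are purely illustrative:

```python
from sklearn.metrics import roc_auc_score, matthews_corrcoef, fbeta_score

y_true  = [1, 1, 1, 0, 0, 0, 0, 0]                   # 1 = diving, 0 = not diving
y_pred  = [1, 1, 0, 0, 0, 1, 0, 0]                   # hard class predictions
y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.05]  # model's score for "diving"

print(roc_auc_score(y_true, y_score))        # ROC AUC: uses scores, not hard labels
print(matthews_corrcoef(y_true, y_pred))     # MCC: well behaved under class imbalance
print(fbeta_score(y_true, y_pred, beta=2.0)) # F_beta: beta > 1 weights recall more heavily
```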