Why not use more thresholds? Like:
- If probability < 0.25 then class = "weak"
- If 0.25 <= probability <= 0.75 then class = "mediocre"
- If probability > 0.75 then class = "strong"
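As a minimal sketch of what that banding might look like in practice (the probabilities below are made up purely for illustration):

```python
import numpy as np

# Hypothetical predicted probabilities for five new observations
probs = np.array([0.10, 0.60, 0.55, 0.80, 0.95])

def to_band(p):
    """Map a predicted probability to one of the three bands above."""
    if p < 0.25:
        return "weak"
    elif p <= 0.75:
        return "mediocre"
    else:
        return "strong"

bands = [to_band(p) for p in probs]
print(bands)  # ['weak', 'mediocre', 'mediocre', 'strong', 'strong']
```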
But remember that if your goal is to correctly predict (in this case, classify) new observations, you won't be able to compare the truth (2 classes) with the predictions (3 classes).
The prediction labels must always be the same as the truth labels in a classification problem.
If you accept that your model is not perfect, you can still use the estimated probabilities as a "score" for each observation and apply the 3-class definition above. But you won't be able to say, for example, that the model has an accuracy of 80%, because the number of labels differs between the predictions and the truth.
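To make that concrete, here is a small sketch (with made-up labels and probabilities) showing that accuracy still has to be computed from a 2-class prediction, typically via a single cutoff, while the 3 bands can only serve as a descriptive score:

```python
import numpy as np

# Hypothetical ground truth (2 classes) and predicted probabilities
y_true = np.array([0, 0, 1, 1, 1])
probs  = np.array([0.10, 0.60, 0.55, 0.80, 0.95])

# Accuracy requires predictions on the same 2 labels as the truth,
# so a single cutoff (here 0.5) is still needed for that comparison.
y_pred = (probs > 0.5).astype(int)
accuracy = (y_pred == y_true).mean()
print(accuracy)  # 0.8 with these made-up numbers

# The 3-band labels ("weak"/"mediocre"/"strong") can be reported alongside,
# but only as a score; they cannot be compared element-wise with the
# 2-class truth, so no accuracy can be computed from them.
```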