Neural network "classifiers" output probability scores, and when they are trained with crossentropy loss (common) or another proper scoring rule, the expected loss is minimized by the true probabilities of class membership.
However, I have read on Cross Validated and perhaps elsewhere that neural networks are notorious for being overly confident. That is, they will be happy to predict something like $P(1) = 0.9$ when they should be predicting $P(1) = 0.7$, which still favors class $1$ over class $0$ but by less.
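To make the proper-scoring-rule claim concrete, here is a minimal sketch of what I mean. It assumes a single binary event whose true probability is $0.7$ (a made-up value matching the example above), and the function name `expected_cross_entropy` is my own, not from any library. It scans candidate predictions and checks which one minimizes the expected crossentropy loss.

```python
import numpy as np

def expected_cross_entropy(p_true, q_pred):
    """Expected log loss when the label is 1 with probability p_true
    and the model always predicts probability q_pred for class 1."""
    return -(p_true * np.log(q_pred) + (1 - p_true) * np.log(1 - q_pred))

# Assumed true probability of class 1 (illustrative value only).
p_true = 0.7

# Candidate predictions 0.01, 0.02, ..., 0.99.
grid = np.linspace(0.01, 0.99, 99)
losses = expected_cross_entropy(p_true, grid)

# The prediction with the lowest expected loss.
best_q = grid[np.argmin(losses)]

loss_at_truth = expected_cross_entropy(p_true, 0.7)
loss_overconfident = expected_cross_entropy(p_true, 0.9)

print(f"best prediction on the grid: {best_q:.2f}")
print(f"expected loss at q=0.7: {loss_at_truth:.4f}")
print(f"expected loss at q=0.9: {loss_overconfident:.4f}")
```

Under these assumptions the minimizer on the grid is the true probability, and predicting $0.9$ incurs a strictly higher expected loss than predicting $0.7$, which is why the overconfidence puzzles me.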
If neural networks are optimizing a proper scoring rule like crossentropy loss, how can this be?
All that comes to mind is that the model development steps optimize improper metrics like accuracy. Sure, the model in cross validation is fitted to the training data using crossentropy loss, but the hyperparameters are tuned to get the highest out-of-sample accuracy, not the lowest crossentropy loss.
(But then I would expect the model to be less confident in its predictions, not more. Why be highly confident when a low-confidence prediction like $0.7$ yields the same correct classification as a high-confidence prediction like $0.9$?)
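As a rough illustration of why accuracy cannot tell these two predictions apart, the sketch below simulates labels with an assumed true $P(1) = 0.7$ and scores two constant predictors, one outputting $0.7$ and one outputting $0.9$. The $0.5$ decision threshold, the sample size, the seed, and the helper names `accuracy` and `log_loss` are all my own choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated binary labels with an assumed true P(1) = 0.7.
y = rng.binomial(1, 0.7, size=10_000)

def accuracy(y, q, threshold=0.5):
    """Fraction of labels matched after thresholding the probabilities."""
    return np.mean((q >= threshold).astype(int) == y)

def log_loss(y, q):
    """Average crossentropy (log loss) of predicted probabilities q."""
    return -np.mean(y * np.log(q) + (1 - y) * np.log(1 - q))

# Two constant predictors: one calibrated, one overconfident.
q_calibrated = np.full(y.shape, 0.7)
q_overconfident = np.full(y.shape, 0.9)

acc_cal = accuracy(y, q_calibrated)
acc_over = accuracy(y, q_overconfident)
ll_cal = log_loss(y, q_calibrated)
ll_over = log_loss(y, q_overconfident)

print(f"accuracy: calibrated={acc_cal:.4f}, overconfident={acc_over:.4f}")
print(f"log loss: calibrated={ll_cal:.4f}, overconfident={ll_over:.4f}")
```

Both predictors land on the same side of the threshold for every observation, so their accuracy is identical, while the log loss penalizes the overconfident one. So selecting on accuracy is indifferent between $0.7$ and $0.9$; it does not obviously push toward either.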