I am currently writing my master's thesis, a deep learning project on semantic segmentation of MRI images. My partner and I have been looking at using Dice loss instead of categorical cross-entropy, because a couple of papers state that it can give better results on segmentation tasks.
In the thread "Dice-coefficient loss function vs cross-entropy", however, it is stated that this is not necessarily true and that the claim has to be tested empirically.
I have been staring at the equation for Dice loss for quite some time now, and I do not understand why "one does not have to assign weights to samples of different samples to establish the right balance" or why "In addition, Dice coefficient performs better at class imbalanced problems by design".
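For reference, the form I have in mind is the common "soft" Dice loss for a single foreground class. This is my assumption about which variant the papers mean (some use squared terms in the denominator or add a smoothing constant), and the symbols are my own: $p_i$ is the predicted foreground probability for voxel $i$, and $g_i \in \{0, 1\}$ is the ground-truth label.

$$
L_{\text{Dice}} = 1 - \frac{2 \sum_{i=1}^{N} p_i \, g_i}{\sum_{i=1}^{N} p_i + \sum_{i=1}^{N} g_i}
$$

Note that background voxels ($g_i = 0$) never contribute to the numerator. They enter only through $\sum_i p_i$ in the denominator.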
If anyone could help me get a better intuition for why Dice loss is better than cross-entropy on class-imbalanced problems, I would be super happy.
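To make the question concrete, here is a minimal sketch of the kind of comparison I have been trying on paper. It assumes a binary problem, a soft Dice loss of the form 1 - 2Σpg / (Σp + Σg), and unweighted mean cross-entropy. The function names and the toy numbers (10 foreground voxels out of 1000) are made up for illustration and are not taken from any of the papers.

```python
import math


def mean_cross_entropy(probs, labels, eps=1e-7):
    """Unweighted binary cross-entropy, averaged over all voxels."""
    total = 0.0
    for p, g in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(g * math.log(p) + (1 - g) * math.log(1.0 - p))
    return total / len(labels)


def soft_dice_loss(probs, labels, eps=1e-7):
    """Soft Dice loss for the foreground class (assumed formulation)."""
    intersection = sum(p * g for p, g in zip(probs, labels))
    denominator = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


# Toy imbalanced case: 10 foreground voxels, 990 background voxels.
labels = [1] * 10 + [0] * 990
# A network that predicts "background" almost everywhere.
probs = [0.01] * 1000

ce = mean_cross_entropy(probs, labels)
dl = soft_dice_loss(probs, labels)
print(f"mean cross-entropy: {ce:.4f}")
print(f"soft Dice loss:     {dl:.4f}")
```

With these numbers the mean cross-entropy comes out small (roughly 0.06), because the 990 easy background voxels dominate the average, while the Dice loss stays close to 1 because the foreground overlap is almost zero. I can see this numerically, but I am not sure whether it is the whole story behind the "by design" claim.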
As an aside, this paper introduces a "generalized Dice loss" in which each class is scaled by a weight that decreases with the number of voxels belonging to that class (the paper uses the inverse of the squared class volume). In this case I absolutely understand how it combats class imbalance. https://arxiv.org/pdf/1707.03237.pdf
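For comparison, this is how I read the generalized Dice loss, as a rough sketch rather than the authors' implementation. It assumes per-class weights equal to the inverse of the squared number of ground-truth voxels in each class. The function name, the list-of-lists layout (one list of voxels per class), and the small `eps` constant are my own choices.

```python
def generalized_dice_loss(probs, onehot, eps=1e-7):
    """Generalized Dice loss over classes (sketch, assumed weighting).

    probs  : list with one list of predicted probabilities per class
    onehot : list with one list of 0/1 ground-truth labels per class
    """
    numerator = 0.0
    denominator = 0.0
    for p_c, g_c in zip(probs, onehot):
        volume = sum(g_c)  # number of voxels in this class
        weight = 1.0 / (volume * volume + eps)  # rare classes get large weights
        numerator += weight * sum(p * g for p, g in zip(p_c, g_c))
        denominator += weight * sum(p + g for p, g in zip(p_c, g_c))
    return 1.0 - 2.0 * numerator / (denominator + eps)


# Two classes: 990 background voxels, 10 foreground voxels.
background = [1] * 990 + [0] * 10
foreground = [0] * 990 + [1] * 10
onehot = [background, foreground]

# Perfect prediction versus predicting background everywhere.
perfect = [[float(g) for g in background], [float(g) for g in foreground]]
all_background = [[1.0] * 1000, [0.0] * 1000]

loss_perfect = generalized_dice_loss(perfect, onehot)
loss_all_background = generalized_dice_loss(all_background, onehot)
print(loss_perfect, loss_all_background)
```

Here the explicit weight makes the small foreground class count as much as the large background class, so predicting background everywhere is penalized heavily. That part is clear to me; it is the unweighted Dice loss that I am struggling with.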