
Currently, I am using a U-Net with skip connections to predict images from data collected 30 minutes prior. Most of each true image is zero, with values ranging over approximately [0, 50]. The network predicts low, non-zero values everywhere, apparently because this is not heavily penalized by the loss function. I am trying to figure out how to get around this problem, perhaps by creating a custom loss function — or some other way?
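A quick way to see why plain MSE tolerates this behavior: on a mostly-zero target, the best constant prediction under MSE is the (small, positive) pixel mean, not zero. A minimal numpy sketch, with synthetic data standing in for the real images:

```python
import numpy as np

# Synthetic stand-in for a mostly-zero target image with values in [0, 50]
rng = np.random.default_rng(0)
y_true = np.zeros((64, 64))
y_true[rng.random((64, 64)) < 0.02] = 40.0  # ~2% of pixels carry signal

# Under MSE, the best constant prediction is the pixel mean -- a small
# positive number, i.e. "low non-zero values everywhere".
best_const = y_true.mean()
mse_at_mean = np.mean((y_true - best_const) ** 2)
mse_at_zero = np.mean(y_true ** 2)
assert mse_at_mean < mse_at_zero  # smearing a low value beats predicting all zeros
```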

(Image: prediction similar to the true image, but with heavy shmearing)

I am also working on modifying the data so that I can use a classification scheme instead of regression; however, I would really prefer to use regression here. I am also experimenting with normalizing the data to [0, 1].

Further info: for regression, I am using MSE as the loss. The network appears to learn the training data fairly well (training loss falls), but hits a limit on the validation data. No form of regularization has stopped the overfitting: I've tried L2 regularization, L1 regularization, and dropout.
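One custom-loss option for the problem above is to upweight the pixels where the target is nonzero, so that missing real signal costs more than mispredicting background. This is only a sketch in plain numpy — the weight value and the exact-zero threshold are my assumptions, not from the post — and in a real setup it would be written against the training framework's tensor ops:

```python
import numpy as np

def weighted_mse(y_true, y_pred, w_nonzero=10.0):
    """MSE with extra weight on pixels where the target is nonzero.

    w_nonzero is a hypothetical hyperparameter: larger values make the
    network pay more for errors on the rare signal pixels.
    """
    w = np.where(y_true > 0, w_nonzero, 1.0)
    return np.sum(w * (y_true - y_pred) ** 2) / np.sum(w)

y = np.zeros((4, 4))
y[0, 0] = 50.0  # one signal pixel

miss_signal = y.copy()
miss_signal[0, 0] -= 1.0       # off by 1 on the signal pixel
miss_background = y.copy()
miss_background[1, 1] = 1.0    # off by 1 on a background pixel

# The same size-1 error costs w_nonzero times more on a signal pixel
assert np.isclose(weighted_mse(y, miss_signal),
                  10 * weighted_mse(y, miss_background))
```

In Keras or PyTorch the same weighting is a few lines using the framework's elementwise ops in a custom loss function.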

Model Summary

mmont
  • Have you tried scaling the targets to $[0,1]$ and using binary cross-entropy loss? Essentially, this makes the targets soft binary labels; this scheme can work better for some image tasks like this (canonically, MNIST autoencoders). – Sycorax Jul 08 '21 at 15:48
  • That is one of the tasks I'm working on; pretty big dataset takes some time. However, this would turn into a binary classification problem that way, right? – mmont Jul 08 '21 at 15:51
  • I don't see it as binary classification. You can take many values in the interval $[0,1]$, not just the $0$ and $1$. – Dave Jul 08 '21 at 15:52
  • No, it wouldn't; it's still a regression problem. You'd be using soft labels with the goal of predicting values in [0,1]. – Sycorax Jul 08 '21 at 15:52
  • Oh, I see. I was under the impression that binary crossentropy was used only for binary classification. I'll try this next, then. The metric, then, wouldn't be binary accuracy, right? – mmont Jul 08 '21 at 15:56
  • Even in a binary classification task, accuracy is not very informative. https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models/312787#312787 You can still measure the MSE or MAE or whatnot, but they just wouldn't be used for the loss or backprop computations. – Sycorax Jul 08 '21 at 16:01
  • And what @Sycorax writes is true, even for balanced classes. – Dave Jul 08 '21 at 16:09
  • @Sycorax this method doesn't appear to be working much better. It still has that shmeared quality, I think because the input image is sometimes shmeared that way. But it appears to over-generalize that feature to all images... In the case that this doesn't work, are there other ideas I might look into? – mmont Jul 08 '21 at 18:37
  • https://stats.stackexchange.com/questions/365778/what-should-i-do-when-my-neural-network-doesnt-generalize-well – Sycorax Jul 08 '21 at 19:10
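The soft-label scheme Sycorax suggests can be sketched in plain numpy as follows. The scaling constant 50 comes from the data range stated in the question; a real model would use a sigmoid output layer and the framework's built-in BCE loss:

```python
import numpy as np

# Scale targets from the stated [0, 50] range to soft labels in [0, 1]
y_raw = np.array([0.0, 0.0, 0.0, 10.0, 50.0])
y = y_raw / 50.0

def bce(y_true, p, eps=1e-7):
    """Binary cross-entropy with soft (continuous) targets in [0, 1]."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Still regression, not classification: per pixel, BCE is minimized when
# the prediction equals the soft label itself, so the exact targets score
# better than a smeared constant prediction.
smear = np.full_like(y, y.mean())
assert bce(y, y) < bce(y, smear)
```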

0 Answers