I'm currently working on a regression problem where the targets lie in the range 0 to 1. What would be the best pairing of output activation and loss function for this kind of problem?
The ones that I have considered are:
- Linear and L2 loss: L2 loss may lead to vanishing gradients when the targets are small (e.g. smaller than 0.1), since the errors, and therefore the gradients, become very small.
- Sigmoid and L1 loss: Should I use sigmoid for a regression problem? I'm afraid it is only suitable when the targets are exactly 0 or 1.
- Linear and L1 loss: L1 loss may not be able to deal with small differences between outputs and targets. I have also heard that models trained with L1 loss have difficulty converging.
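To make the concerns above concrete, here is a minimal NumPy sketch of the gradient each pairing sends back to the pre-activation `z`. The function names are my own, and I am assuming the L2 loss is written as `0.5 * (y - t)**2` and the L1 loss as `abs(y - t)`; a different scaling would change the numbers but not the pattern.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(y):
    # Inverse of sigmoid, used only to pick a z that gives a chosen output y.
    return np.log(y / (1.0 - y))

def grad_linear_l2(z, t):
    # y = z, loss = 0.5 * (y - t)^2  ->  dL/dz = y - t
    return z - t

def grad_linear_l1(z, t):
    # y = z, loss = |y - t|  ->  dL/dz = sign(y - t)
    return np.sign(z - t)

def grad_sigmoid_l1(z, t):
    # y = sigmoid(z), loss = |y - t|  ->  dL/dz = sign(y - t) * y * (1 - y)
    y = sigmoid(z)
    return np.sign(y - t) * y * (1.0 - y)

# Small target, with the model output already close to it.
t, y = 0.05, 0.06
print("linear + L2 :", grad_linear_l2(y, t))
print("linear + L1 :", grad_linear_l1(y, t))
print("sigmoid + L1:", grad_sigmoid_l1(logit(y), t))
```

With a target of 0.05 and an output of 0.06, the linear and L2 gradient equals the error itself (0.01), the linear and L1 gradient has magnitude 1 no matter how close the output is, and the sigmoid and L1 gradient is scaled by `y * (1 - y)`, which is about 0.056 here.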
Are there any other activation and loss functions that I could use? My experience is limited.