I have the same question as in this post: Dropout: scaling the activation versus inverting the dropout, but for alpha dropout. Do I need to apply the scale factor of $p$ at prediction time (as opposed to during training)?