I'm trying to implement DropConnect. Am I supposed to use the same drop masks during backpropagation as in the forward pass?
1 Answer
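Yes. Backpropagation computes the gradient of the loss for the forward pass that was actually run, so it has to use the mask sampled in that pass.
If a connection is blocked by the mask, its weight contributes nothing to the loss for that pass, so the gradient with respect to that weight is zero.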
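
To make this concrete, here is a minimal sketch for a single linear layer, assuming a squared-error loss and no rescaling of the kept weights. The function names `dropconnect_forward` and `dropconnect_backward`, the `keep_prob` parameter, and the toy numbers are all made up for illustration, not taken from any library. The mask sampled in the forward pass is returned and then reused in the backward pass:

```python
import random

def dropconnect_forward(W, x, keep_prob, rng):
    """Sample one Bernoulli mask entry per weight and compute y = (M * W) @ x."""
    mask = [[1.0 if rng.random() < keep_prob else 0.0 for _ in row] for row in W]
    y = [sum(m * w * xj for m, w, xj in zip(mrow, wrow, x))
         for mrow, wrow in zip(mask, W)]
    return y, mask

def dropconnect_backward(W, x, mask, grad_y):
    """Backpropagate grad_y = dL/dy using the SAME mask as the forward pass."""
    # dL/dW[i][j] = mask[i][j] * grad_y[i] * x[j]  (zero for dropped weights)
    grad_W = [[m * g * xj for m, xj in zip(mrow, x)]
              for mrow, g in zip(mask, grad_y)]
    # dL/dx[j] = sum_i mask[i][j] * W[i][j] * grad_y[i]
    grad_x = [sum(mask[i][j] * W[i][j] * grad_y[i] for i in range(len(W)))
              for j in range(len(x))]
    return grad_W, grad_x

rng = random.Random(0)  # fixed seed so the sketch is reproducible
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]  # toy weights (illustrative)
x = [1.0, 2.0, 3.0]
target = [0.0, 1.0]

y, mask = dropconnect_forward(W, x, keep_prob=0.5, rng=rng)
grad_y = [yi - ti for yi, ti in zip(y, target)]  # loss = 0.5 * sum((y - t)^2)
grad_W, grad_x = dropconnect_backward(W, x, mask, grad_y)
```

Sampling a fresh mask in the backward pass would give gradients for a different network than the one that produced `y`. A new mask should only be drawn for the next forward pass.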

dontloo