
When initialising a deep network before training, what statistical properties of the activations and of the gradients are desirable?
