Problem: How is a weight $W_i$ calculated when using Xavier initialization?
From what I understand, Xavier initialization calculates the standard deviation, but I'm not sure how that is used to calculate a specific weight value.
According to the references, $W$ is the "initialization distribution for the neuron in question". What does that mean? How does that decide what the value will be?
For the current layer, let $e$ be the number of input connections and $s$ the number of output connections; then: $\mathrm{Var}(W) = \frac{2}{e + s}$
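To make the question concrete, here is a minimal sketch of how I think the quantity $\frac{2}{e + s}$ would turn into actual weight values, assuming it is the variance of a zero-mean distribution from which every weight is drawn independently. The function names `xavier_normal` and `xavier_uniform` and the layer sizes are hypothetical, chosen only for illustration; NumPy is used for the sampling.

```python
import numpy as np

def xavier_normal(e, s, rng):
    """Draw an (e, s) weight matrix from N(0, 2 / (e + s))."""
    std = np.sqrt(2.0 / (e + s))  # standard deviation = sqrt(variance)
    return rng.normal(loc=0.0, scale=std, size=(e, s))

def xavier_uniform(e, s, rng):
    """Draw an (e, s) weight matrix from U(-a, a) with a = sqrt(6 / (e + s)).

    A uniform distribution on (-a, a) has variance a**2 / 3,
    which again equals 2 / (e + s).
    """
    limit = np.sqrt(6.0 / (e + s))
    return rng.uniform(low=-limit, high=limit, size=(e, s))

# Hypothetical layer: e = 300 input connections, s = 100 output connections.
rng = np.random.default_rng(0)
e, s = 300, 100
W_normal = xavier_normal(e, s, rng)
W_uniform = xavier_uniform(e, s, rng)

target_std = np.sqrt(2.0 / (e + s))
print("target std: ", target_std)
print("normal std: ", W_normal.std())
print("uniform std:", W_uniform.std())
```

Under this reading, a single $W_i$ is just one random draw, so the formula fixes only the spread of the weights and not any individual value.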
References:
http://philipperemy.github.io/xavier-initialization/
http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization
https://prateekvjoshi.com/2016/03/29/understanding-xavier-initialization-in-deep-neural-networks/