Questions tagged [weight-initialization]

4 questions
4
votes
0 answers

Are Batch Normalization and Kaiming Initialization addressing the same issue (Internal Covariate Shift)?

In the original Batch Norm paper (Ioffe and Szegedy 2015), the authors define Internal Covariate Shift as "the change in the distributions of internal nodes of a deep network, in the course of training". They then present Batch Norm as a solution…
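As a quick illustration of the two techniques this question contrasts, here is a minimal Keras sketch (hypothetical, not from the question itself): He/Kaiming initialization only sets the weight variance at time zero, while BatchNorm re-normalizes activations on every mini-batch throughout training.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(784,)),
    # Kaiming/He init: weight variance 2/fan_in, derived for ReLU,
    # so activations keep a stable scale at initialization.
    layers.Dense(256, kernel_initializer="he_normal"),
    # BatchNorm: normalizes the layer's pre-activations on every
    # mini-batch, counteracting distribution drift during training.
    layers.BatchNormalization(),
    layers.ReLU(),
    layers.Dense(10, kernel_initializer="he_normal"),
])
```

The contrast the question gets at: the initializer acts once, before training begins, whereas BatchNorm keeps acting at every step.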
1
vote
0 answers

Guide to self-starter estimators (parameter initialization) for "simple" functions

Background: I have a collection of functions with trainable parameters that I am implementing as Keras model classes, which enables immediate use of a variety of objective functions, optimizers, and training-related methods (e.g. early stopping…
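For concreteness, here is a hedged sketch of the "self-starter" idea for one simple function, y = a·exp(−b·x): derive rough initial values for the parameters from the data itself (here via a log-linear least-squares fit), then hand them to the optimizer as a starting point. All names below are hypothetical illustrations, not the asker's code.

```python
import numpy as np

def self_start_exp_decay(x, y, eps=1e-8):
    """Return data-driven initial guesses (a0, b0) for y = a * exp(-b * x)."""
    # log y = log a - b * x, so a straight-line fit to (x, log y)
    # yields both starting values at once.
    ly = np.log(np.clip(y, eps, None))
    slope, intercept = np.polyfit(x, ly, deg=1)
    return float(np.exp(intercept)), float(-slope)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 50)
y = 2.5 * np.exp(-1.3 * x) + 0.01 * rng.normal(size=50)

a0, b0 = self_start_exp_decay(x, y)  # rough values to seed gradient training
```

This mirrors the self-starting estimators in R's `nls`: a cheap closed-form approximation supplies the initialization, and gradient descent refines it.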
1
vote
1 answer

What does the output of an ANN with zero-initialized weights represent?

In class we discussed that if the weights of an ANN (a standard feed-forward NN in a binary classification setting [0,1]) are all initialized to zero, the ANN fails to break symmetry and therefore the units in each layer develop identically. My…
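A small NumPy sketch makes the failure mode concrete (assumptions: one hidden layer, sigmoid units, a single sigmoid output, none of which come from the question itself): with all-zero weights, every hidden unit computes the same value, and the network's output is sigmoid(0) = 0.5 for every input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))               # batch of 4 inputs, 3 features

W1 = np.zeros((3, 5)); b1 = np.zeros(5)   # hidden layer, all zeros
W2 = np.zeros((5, 1)); b2 = np.zeros(1)   # output layer, all zeros

h = sigmoid(x @ W1 + b1)     # every hidden unit outputs 0.5
y_hat = sigmoid(h @ W2 + b2) # output is 0.5 regardless of the input

print(h)      # all columns identical: symmetry is never broken
print(y_hat)  # [[0.5], [0.5], [0.5], [0.5]]
```

Because identical units receive identical gradients, the symmetry also survives every gradient update, so the 0.5 output, the "maximally uncertain" prediction under a sigmoid, persists in structure even as training proceeds.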
0
votes
0 answers

Correct weight initialization for pre-activation convolutions & pre-activation depthwise separable convolutions?

My CNN architecture uses pre-activation ordering, i.e. BatchNorm -> ReLU -> Conv. Which weight initialization should I use for the convolutions? I'm under the impression that the standard ReLU initialization scheme, HeNormal, is designed for Conv -> ReLU…
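For reference, here is a minimal Keras sketch of the pre-activation blocks the question asks about, with HeNormal as one candidate initializer. Whether HeNormal is exactly right for this ordering is the open question; the sketch (with hypothetical helper names) only shows where the initializer plugs in for both the plain and the depthwise separable variant.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def preact_conv(x, filters, kernel_size=3):
    """Pre-activation convolution block: BN -> ReLU -> Conv."""
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    return layers.Conv2D(filters, kernel_size, padding="same",
                         kernel_initializer="he_normal")(x)

def preact_sepconv(x, filters, kernel_size=3):
    """Pre-activation depthwise separable block: BN -> ReLU -> SepConv."""
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    return layers.SeparableConv2D(filters, kernel_size, padding="same",
                                  depthwise_initializer="he_normal",
                                  pointwise_initializer="he_normal")(x)

inputs = keras.Input(shape=(32, 32, 16))
outputs = preact_conv(preact_sepconv(inputs, 32), 64)
model = keras.Model(inputs, outputs)
```

One observation in favor of reusing HeNormal here: inside a deep pre-activation stack, each convolution still receives rectified inputs from the preceding ReLU, which is the regime He et al.'s variance derivation assumes.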