I am looking into VGGNet. The networks are structured using only Conv, ReLU, and Pooling layers. How is regularization done in VGGNet?
1 Answer
The answer is in the paper you referenced:
> The training was regularised by weight decay (the L2 penalty multiplier set to $5\cdot10^{−4}$) and dropout regularisation for the first two fully-connected layers (dropout ratio set to 0.5).
Moreover, using convolutional layers may be considered a form of regularization in itself (weight sharing). Also, in the rest of Section 3 they discuss the impact of weight initialization on performance, and describe data augmentation by scaling.
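To make the two mechanisms from the quote concrete, here is a minimal NumPy sketch of both: weight decay adds $\lambda W$ to the gradient of a fully-connected weight matrix, and (inverted) dropout zeroes each unit with probability 0.5 and rescales the survivors. The matrix sizes, learning rate, and gradient values are hypothetical toy choices; only the multipliers $5\cdot10^{-4}$ and 0.5 come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

weight_decay = 5e-4   # L2 penalty multiplier from the VGG paper
dropout_p = 0.5       # dropout ratio used on the first two FC layers
lr = 0.01             # hypothetical learning rate for illustration

# Toy fully-connected weight matrix and a fake data-loss gradient.
W = rng.standard_normal((4096, 10)) * 0.01
grad_data = rng.standard_normal(W.shape) * 0.01

# Weight decay: the penalty (wd/2)*||W||^2 contributes wd*W to the gradient.
grad = grad_data + weight_decay * W
W_new = W - lr * grad

# Inverted dropout on an FC activation at training time: zero roughly half
# the units, rescale the rest so the expected activation is unchanged.
h = rng.standard_normal(4096)
mask = (rng.random(h.shape) >= dropout_p) / (1.0 - dropout_p)
h_dropped = h * mask
```

At test time dropout is simply disabled (the activation is used as-is); the inverted scaling during training is what keeps the expected magnitudes matched between the two phases.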

Jan Kukacka