
I had an idea but couldn't find anything about it on the internet. Is it common or beneficial to increase or decrease alpha (as in alpha * L2_norm) for each consecutive layer of a neural net? For example, when detecting edges in images we don't really need to regularize the early layers much, since small features (lines, circles, etc.) are fairly common, but big features (specific faces) are not.
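
To make the idea concrete, here is a rough sketch of what I mean, assuming the Keras API (`kernel_regularizer` with `regularizers.l2`); the alpha values below are made up, only to illustrate regularizing more strongly with depth:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Hypothetical per-layer L2 coefficients: small for early layers that learn
# generic features, larger for later, more task-specific layers.
alphas = [1e-5, 1e-4, 1e-3]

model = tf.keras.Sequential([
    layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1),
                  kernel_regularizer=regularizers.l2(alphas[0])),
    layers.Conv2D(64, 3, activation='relu',
                  kernel_regularizer=regularizers.l2(alphas[1])),
    layers.Flatten(),
    layers.Dense(10, activation='softmax',
                 kernel_regularizer=regularizers.l2(alphas[2])),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```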

Mariusz
    Good question. I actually mentioned this possibility in my answer [here](http://stats.stackexchange.com/questions/236259/applying-l1-l2-and-tikhonov-regularization-to-neural-nets-possible-misconcepti), but had no examples. A recent reference [says](http://www.deeplearningbook.org/contents/regularization.html) "In the context of neural networks, it is sometimes desirable to use a separate penalty with a different α coefficient for each layer of the network." But it does not give any examples. – GeoMatt22 Oct 25 '16 at 21:26

0 Answers