I am training a 6-layer deep neural network with the following architecture:
<bound method Module.parameters of Model2(
  (layer1): Linear(in_features=4800, out_features=8000, bias=True)
  (layer2): Linear(in_features=8000, out_features=5000, bias=True)
  (layer3): Linear(in_features=5000, out_features=2000, bias=True)
  (layer4): Linear(in_features=2000, out_features=200, bias=True)
  (layer5): Linear(in_features=200, out_features=20, bias=True)
  (layer6): Linear(in_features=20, out_features=52, bias=True)
)>
The inputs are images of size 60 × 80. I am using the ReLU activation function, cross-entropy as the loss function, and stochastic gradient descent (SGD) as the optimizer. I also set the weight-decay parameter (i.e., L2 regularization). I no longer see overfitting, but the accuracy, which was 61%, is now 26%!
Can anyone explain the reason?
(I set the regularization hyperparameter to 0.01. I tried changing it, but the accuracy did not improve.)
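Here is roughly how my setup looks in PyTorch (a minimal sketch: the layer sizes and weight_decay=0.01 match what I described above, but the learning rate, batch size, and the flattening step are placeholders, not my exact values):

```python
import torch
import torch.nn as nn

class Model2(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4800, 8000)
        self.layer2 = nn.Linear(8000, 5000)
        self.layer3 = nn.Linear(5000, 2000)
        self.layer4 = nn.Linear(2000, 200)
        self.layer5 = nn.Linear(200, 20)
        self.layer6 = nn.Linear(20, 52)

    def forward(self, x):
        x = x.view(x.size(0), -1)       # flatten 60 * 80 images into 4800 features
        x = torch.relu(self.layer1(x))
        x = torch.relu(self.layer2(x))
        x = torch.relu(self.layer3(x))
        x = torch.relu(self.layer4(x))
        x = torch.relu(self.layer5(x))
        return self.layer6(x)           # raw logits for CrossEntropyLoss

model = Model2()
criterion = nn.CrossEntropyLoss()
# weight_decay is the L2-regularization strength I mentioned (0.01);
# lr=0.01 is a placeholder, not my actual learning rate
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.01)

# one dummy training step to show the shape of the loop
images = torch.randn(4, 60, 80)           # fake batch of 4 images
labels = torch.randint(0, 52, (4,))       # fake class labels (52 classes)
optimizer.zero_grad()
logits = model(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```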