I'm using the Deeplearning.net DBN tutorial to train on my data set. I normalize the feature set to zero mean and unit variance. However, I can only get the network to predict 2 out of 5 classes, even though the data contain all 5 classes distributed fairly evenly.
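For reference, the normalization is roughly the following (a minimal numpy sketch; standardize and its array arguments are placeholder names of mine, not from the tutorial):

import numpy as np

def standardize(train_x, *others):
    # zero-mean, unit-variance scaling, fit on the training set only
    mean = train_x.mean(axis=0)
    std = train_x.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    return [(x - mean) / std for x in (train_x,) + others]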
I have tried some variations, such as different hidden-layer depths/sizes and different back-prop depths. I may get two different predicted classes (e.g. 1 and 4), but still only 2 classes are ever predicted. What could be going wrong? I also noticed that the output of the first hidden layer is strictly 1s and 0s; does that indicate something?
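To quantify that 0/1 behaviour, I count how many activations sit at the sigmoid extremes (a small diagnostic sketch; saturation_fraction is my own helper, and acts stands for the layer-output array printed below):

import numpy as np

def saturation_fraction(acts, eps=1e-3):
    # fraction of sigmoid outputs pinned within eps of 0 or 1
    pinned = (acts < eps) | (acts > 1.0 - eps)
    return pinned.mean()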
(For comparison, an MLP trained on the same data predicts all five classes.)
DBN results:
test_DBN(dataset=ds, hidden_layers_sizes=[1000, 1000, 100], pretraining_epochs=5, training_epochs=1000, batch_size=50)
predicting results for test set
hidden layer input __str__ = [[-0.51135188 -0.50633603 -0.50996745 ..., -0.50814176 -0.4967888
-0.49966851]
[-0.59209806 -0.58584201 -0.58815384 ..., 1.96783125 -0.4967888
-0.49966851]
[-0.56051141 -0.56283319 -0.56722915 ..., -0.50814176 -0.4967888
-0.49966851]
...,
[-0.37065783 -0.36497939 -0.36608201 ..., -0.50814176 2.01297021
-0.49966851]
[-0.57074374 -0.57057875 -0.56801659 ..., -0.50814176 -0.4967888
-0.49966851]
[-0.37277099 -0.36748531 -0.36866945 ..., -0.50814176 -0.4967888
-0.49966851]]
hidden layer input __str__ = [[ 1. 0. 0. ..., 1. 1. 1.]
[ 1. 1. 1. ..., 1. 1. 1.]
[ 1. 0. 0. ..., 1. 1. 1.]
...,
[ 0. 1. 1. ..., 0. 0. 0.]
[ 1. 0. 1. ..., 1. 1. 1.]
[ 0. 1. 1. ..., 0. 0. 0.]]
hidden layer input __str__ = [[ 9.99524236e-01 1.67467550e-11 9.19133151e-13 ..., 4.11743919e-08
2.21909352e-10 9.99998987e-01]
[ 9.99984503e-01 1.42381011e-11 4.89908288e-12 ..., 5.71135916e-10
1.36941370e-11 9.99997199e-01]
[ 9.99506652e-01 1.33770659e-11 5.76966534e-13 ..., 7.96842485e-08
4.56842064e-10 9.99999166e-01]
...,
[ 9.53848839e-01 5.07281601e-01 8.29560697e-01 ..., 2.49304697e-02
1.01453438e-01 2.21684471e-01]
[ 9.99396384e-01 1.12449763e-11 9.92904760e-13 ..., 8.20390369e-08
5.41795719e-10 9.99999166e-01]
[ 9.53848839e-01 5.07281601e-01 8.29560697e-01 ..., 2.49304697e-02
1.01453438e-01 2.21684471e-01]]
logistic regression input __str__ = [[ 9.99537826e-01 1.45091311e-04 1.00000000e+00 ..., 5.16466935e-05
6.20799938e-07 2.51326132e-15]
[ 9.99747097e-01 1.07736792e-04 1.00000000e+00 ..., 4.25282269e-05
7.78833453e-07 4.57793441e-15]
[ 9.99485135e-01 1.51106768e-04 1.00000000e+00 ..., 5.23381459e-05
5.98172107e-07 2.19557864e-15]
...,
[ 1.00000000e+00 1.00000000e+00 1.58925980e-11 ..., 6.89105073e-05
5.14958592e-05 1.00000000e+00]
[ 9.99469340e-01 1.53716363e-04 1.00000000e+00 ..., 5.22819246e-05
5.94462563e-07 2.14297620e-15]
[ 1.00000000e+00 1.00000000e+00 1.58925980e-11 ..., 6.89105073e-05
5.14958592e-05 1.00000000e+00]]
# predictions
[4 4 4 0 4 4 4 0 4 4 4 4 4 4 4 0 0 0 4 4 0 4 4 0 4 4 4 4 0 0 0 4 0 4 0 4 0
4 4 4 4 4 4 4 4 4 4 0 4 4 4 0 4 4 4 4 0 4 4 4 4 0 4 4 4 4 4 4 4 4 4 4 4 4
...
]
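Counting the predicted labels makes the collapse explicit (y_pred stands for the prediction array above; np.bincount gives per-class counts):

import numpy as np

def class_counts(y_pred, n_classes=5):
    # histogram of predicted labels; the DBN run above only fills bins 0 and 4
    return np.bincount(np.asarray(y_pred), minlength=n_classes)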
MLP results:
Optimization complete. Best validation score of 30.000000 % obtained at iteration 26300, with test performance 69.125000 %
predicting results for test set
[0 1 0 1 1 1 1 1 1 2 1 1 4 1 4 0 1 0 2 4 1 2 1 3 1 3 1 0 1 1 1 1 3 1 0 1 3
1 1 3 1 0 1 1 0 1 0 3 1 1 1 1 1 1 4 1 0 1 1 4 4 3 2 4 1 2 1 0 0 1 1 2 4 1
...
]