I'm using the Deeplearning.net DBN tutorial to train on my data set. I normalize the feature set to zero mean and unit variance. However, I can only get the network to predict 2 out of 5 classes, even though the data contain all 5 classes distributed fairly evenly.
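For reference, the normalization is roughly the following (a minimal numpy sketch; standardize and its array arguments are placeholder names of mine, not from the tutorial):

import numpy as np

def standardize(train_x, *others):
    # zero-mean, unit-variance scaling, fit on the training set only
    mean = train_x.mean(axis=0)
    std = train_x.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    return [(x - mean) / std for x in (train_x,) + others]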
I have tried some variations, such as different hidden-layer depths/sizes and different back-prop depths. I may get two different predicted classes (e.g. 1 and 4), but still only 2 classes are ever predicted. What could be going wrong? I also noticed that the output of the first hidden layer is strictly 1s and 0s; does that indicate something?
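To quantify that 0/1 behaviour, I count how many activations sit at the sigmoid extremes (a small diagnostic sketch; saturation_fraction is my own helper, and acts stands for the layer-output array printed below):

import numpy as np

def saturation_fraction(acts, eps=1e-3):
    # fraction of sigmoid outputs pinned within eps of 0 or 1
    pinned = (acts < eps) | (acts > 1.0 - eps)
    return pinned.mean()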
(For comparison, an MLP trained on the same data predicts all five classes.)
DBN results:
test_DBN(dataset=ds, hidden_layers_sizes=[1000, 1000, 100], pretraining_epochs=5, training_epochs=1000, batch_size=50)
predicting results for test set
hidden layer input __str__ = [[-0.51135188 -0.50633603 -0.50996745 ..., -0.50814176 -0.4967888
-0.49966851]
[-0.59209806 -0.58584201 -0.58815384 ..., 1.96783125 -0.4967888
-0.49966851]
[-0.56051141 -0.56283319 -0.56722915 ..., -0.50814176 -0.4967888
-0.49966851]
...,
[-0.37065783 -0.36497939 -0.36608201 ..., -0.50814176 2.01297021
-0.49966851]
[-0.57074374 -0.57057875 -0.56801659 ..., -0.50814176 -0.4967888
-0.49966851]
[-0.37277099 -0.36748531 -0.36866945 ..., -0.50814176 -0.4967888
-0.49966851]]
hidden layer input __str__ = [[ 1. 0. 0. ..., 1. 1. 1.]
[ 1. 1. 1. ..., 1. 1. 1.]
[ 1. 0. 0. ..., 1. 1. 1.]
...,
[ 0. 1. 1. ..., 0. 0. 0.]
[ 1. 0. 1. ..., 1. 1. 1.]
[ 0. 1. 1. ..., 0. 0. 0.]]
hidden layer input __str__ = [[ 9.99524236e-01 1.67467550e-11 9.19133151e-13 ..., 4.11743919e-08
2.21909352e-10 9.99998987e-01]
[ 9.99984503e-01 1.42381011e-11 4.89908288e-12 ..., 5.71135916e-10
1.36941370e-11 9.99997199e-01]
[ 9.99506652e-01 1.33770659e-11 5.76966534e-13 ..., 7.96842485e-08
4.56842064e-10 9.99999166e-01]
...,
[ 9.53848839e-01 5.07281601e-01 8.29560697e-01 ..., 2.49304697e-02
1.01453438e-01 2.21684471e-01]
[ 9.99396384e-01 1.12449763e-11 9.92904760e-13 ..., 8.20390369e-08
5.41795719e-10 9.99999166e-01]
[ 9.53848839e-01 5.07281601e-01 8.29560697e-01 ..., 2.49304697e-02
1.01453438e-01 2.21684471e-01]]
logistic regression input __str__ = [[ 9.99537826e-01 1.45091311e-04 1.00000000e+00 ..., 5.16466935e-05
6.20799938e-07 2.51326132e-15]
[ 9.99747097e-01 1.07736792e-04 1.00000000e+00 ..., 4.25282269e-05
7.78833453e-07 4.57793441e-15]
[ 9.99485135e-01 1.51106768e-04 1.00000000e+00 ..., 5.23381459e-05
5.98172107e-07 2.19557864e-15]
...,
[ 1.00000000e+00 1.00000000e+00 1.58925980e-11 ..., 6.89105073e-05
5.14958592e-05 1.00000000e+00]
[ 9.99469340e-01 1.53716363e-04 1.00000000e+00 ..., 5.22819246e-05
5.94462563e-07 2.14297620e-15]
[ 1.00000000e+00 1.00000000e+00 1.58925980e-11 ..., 6.89105073e-05
5.14958592e-05 1.00000000e+00]]
# predictions
[4 4 4 0 4 4 4 0 4 4 4 4 4 4 4 0 0 0 4 4 0 4 4 0 4 4 4 4 0 0 0 4 0 4 0 4 0
4 4 4 4 4 4 4 4 4 4 0 4 4 4 0 4 4 4 4 0 4 4 4 4 0 4 4 4 4 4 4 4 4 4 4 4 4
...
]
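Counting the predicted labels makes the collapse explicit (y_pred stands for the prediction array above; np.bincount gives per-class counts):

import numpy as np

def class_counts(y_pred, n_classes=5):
    # histogram of predicted labels; the DBN run above only fills bins 0 and 4
    return np.bincount(np.asarray(y_pred), minlength=n_classes)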
MLP results:
Optimization complete. Best validation score of 30.000000 % obtained at iteration 26300, with test performance 69.125000 %
predicting results for test set
[0 1 0 1 1 1 1 1 1 2 1 1 4 1 4 0 1 0 2 4 1 2 1 3 1 3 1 0 1 1 1 1 3 1 0 1 3
1 1 3 1 0 1 1 0 1 0 3 1 1 1 1 1 1 4 1 0 1 1 4 4 3 2 4 1 2 1 0 0 1 1 2 4 1
...
]