I need to develop a network that can read the result of rolling a die. I have a dataset consisting of a synthetic collection of such images together with the corresponding target values. Each image is 32x32 pixels and is represented as a vector of length 32x32 = 1024.
So far, I've built a training set by selecting at random 820 images with target 1, 820 with target 2, and so on. (The whole dataset contains 6x8200 images.)
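For reference, this balanced selection looks roughly like the sketch below (the variable names `images` and `targets` are mine; here `images` is the 49200x1024 data matrix and `targets` holds the face values 1 to 6):

```matlab
% Build a balanced training set: 820 randomly chosen images per face value.
% Assumes images is 49200x1024 and targets is 49200x1 with values 1..6.
train_x = []; train_y = [];
for face = 1:6
    idx  = find(targets == face);            % all images showing this face
    pick = idx(randperm(numel(idx), 820));   % 820 of them, chosen at random
    train_x = [train_x; images(pick, :)];
    onehot  = zeros(820, 6); onehot(:, face) = 1;
    train_y = [train_y; onehot];             % one-hot targets, as nntrain expects
end
```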
Then I tried to follow the DBN implementation described in the README of this GitHub repository, as follows:
rand('state',0);                          % fix the random seed for reproducibility

% set up and pre-train the DBN (a stack of RBMs)
dbn.sizes = [1000 1000 1000];             % three hidden layers of 1000 units each
opts.numepochs = 5;
opts.batchsize = 120;
opts.momentum  = 0;
opts.alpha     = 1;                       % learning rate
dbn = dbnsetup(dbn, train_x, opts);
dbn = dbntrain(dbn, train_x, opts);
%figure; visualize(dbn.rbm{1}.W');        % visualize the first RBM's weights

% unfold the DBN into a feed-forward NN with 6 outputs (one per die face)
nn = dbnunfoldtonn(dbn, 6);
nn.activation_function = 'sigm';

% fine-tune the NN with backpropagation
opts.numepochs = 5;
opts.batchsize = 120;
[nn, L] = nntrain(nn, train_x, train_y, opts);

%[er, bad] = nntest(nn, train_x, train_y);

% evaluate on the test set
nn.testing = 1;
nn = nnff(nn, test_x, zeros(size(test_x,1), nn.size(end)));
nn.testing = 0;
[dummy, labels]   = max(nn.a{end}, [], 2);   % predicted class = argmax of outputs
[dummy, expected] = max(test_y, [], 2);      % true class from the one-hot targets
bad = find(labels ~= expected);
er  = numel(bad) / size(test_x, 1);
net = nn.a{end}';
numRight = size(test_y, 1) - numel(bad);
fprintf('Accuracy: %.2f%%\n', numRight / size(test_y, 1) * 100);
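To see exactly how the predictions collapse, I also print a confusion matrix after the evaluation above (my own addition, not part of the toolbox; it uses the `expected` and `labels` vectors computed there):

```matlab
% 6x6 confusion matrix: rows = true face, columns = predicted face
conf = zeros(6, 6);
for k = 1:numel(expected)
    conf(expected(k), labels(k)) = conf(expected(k), labels(k)) + 1;
end
disp(conf);   % a single dominant column means the net predicts one class only
```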
The configuration is based on the one used on page 22 of Rasmus Berg Palm's master's thesis, even though that work uses a different dataset.
I've used this implementation before on the dataset from the thesis (MNIST) and obtained good results, but on this problem I'm getting a strange result: the predictions are almost always the same class, e.g. it predicts all test elements as 2, or all test elements as 5, and so on.
I believe this is due to my configuration rather than a problem with my implementation or anything code-wise (but maybe it is). Is there anything obviously wrong with these parameter values that I should change?
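One sanity check I've been meaning to run (my own idea, not something from the thesis): sigmoid RBMs expect inputs in [0,1], and the toolbox's MNIST example divides the raw pixels by 255 before calling `dbntrain`, so I want to verify my data is scaled the same way and that the classes really are balanced:

```matlab
% sigmoid units expect inputs in [0,1]; the toolbox's MNIST example
% does train_x = double(train_x)/255 before training
assert(all(train_x(:) >= 0 & train_x(:) <= 1), ...
    'train_x must be scaled to [0,1] for sigmoid RBMs');
disp(sum(train_y));   % should show 820 for each of the 6 columns
```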