
The code below builds a simple rule-based classification data set:

# Data preparation: three random binary input columns
data = data.frame(A = round(runif(100)), B = round(runif(100)), C = round(runif(100)))
# Y is the classification output column:
# 1 when (A, B, C) is (1, 1, 0), (0, 1, 1) or (0, 0, 0), otherwise 0
data$Y = ifelse((data$A == 1 & data$B == 1 & data$C == 0) |
                (data$A == 0 & data$B == 1 & data$C == 1) |
                (data$A == 0 & data$B == 0 & data$C == 0), 1, 0)
# Shuffle the rows of the data set
data = data[sample(nrow(data)), ]
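Because the three inputs are binary, the rule behind Y can be listed in full as a truth table. A minimal sketch in base R (the name grid is only illustrative and is not part of the original code):

```r
# All 8 possible combinations of the three binary inputs
grid <- expand.grid(A = 0:1, B = 0:1, C = 0:1)

# Same rule as in the data preparation step above
grid$Y <- as.integer((grid$A == 1 & grid$B == 1 & grid$C == 0) |
                     (grid$A == 0 & grid$B == 1 & grid$C == 1) |
                     (grid$A == 0 & grid$B == 0 & grid$C == 0))

print(grid)
```

Only 3 of the 8 input combinations map to Y = 1, so the target is a deterministic function of the inputs with no noise.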

I split the data set into training and test sets so that I can validate my results on the test set. I then tried building a simple neural net, choosing the number of neurons in the hidden layer by looping over candidate values (as mentioned here)

i.e. $$N_h = \frac{N_s}{\alpha \, (N_i + N_o)}$$ where,

$N_i$ = number of input neurons.
$N_o$ = number of output neurons.
$N_s$ = number of samples in training data set.
$\alpha$ = an arbitrary scaling factor, usually 2-10.
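As a rough illustration of the formula, the sketch below lists the hidden-layer sizes it suggests. It assumes a hypothetical 70/30 split of the 100 rows (so 70 training samples), three inputs (A, B, C) and one output (Y); rounding down to a whole neuron is also an assumption, since the formula itself does not say how to round.

```r
# Assumed sizes: 70 training rows is a hypothetical split, not taken from the question
N_s <- 70        # samples in the training set (assumption)
N_i <- 3         # input neurons: A, B, C
N_o <- 1         # output neurons: Y
alpha <- 2:10    # scaling factors to loop over

# N_h = N_s / (alpha * (N_i + N_o)), one value per alpha
N_h <- N_s / (alpha * (N_i + N_o))

# Round down to whole neurons (assumption), keep at least 1, drop duplicates
candidates <- sort(unique(pmax(1, floor(N_h))))
print(candidates)
```

With these assumed sizes the formula only ever proposes a handful of small hidden layers, so the loop over $\alpha$ tests far fewer distinct networks than the nine values of $\alpha$ suggest.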

Code attached here. It gave poor, overfitted results. But when I built a simple random forest on the same data set, I got train and test errors of $0$.

Please help me understand why the neural net fails in a simple case where a random forest reaches 100% accuracy.

Note: I used only one hidden layer (assuming one hidden layer is enough to solve simple classification problems) and iterated over the number of neurons in it.

Also, please help me choose better parameters for the neural net so that it classifies the data better.

Kartheek Palepu
    I have edited the question to make it more of a statistical question. Please let me know if I need to make any further changes. I would love to know the answer. – Kartheek Palepu Jul 25 '16 at 07:30

1 Answer


I have found the answer to my question here.

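It was actually a small bug in my code: the prediction step passed the wrong columns to the network. length(ncol(test)) is always 1, so test[-length(ncol(test))] drops the first column (A) and feeds B, C and Y to the network instead of the inputs A, B and C.

We just need to replace the line

testPred = round(compute(nn, test[-length(ncol(test))])$net.result)

with one that drops the last column (Y) and keeps the three inputs, assuming the network was trained on A, B and C:

testPred = round(compute(nn, test[-ncol(test)])$net.result)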
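To see which columns each index expression selects, here is a small base-R check. The data frame is a hypothetical two-row stand-in that only assumes the same column order as the question (A, B, C, Y); it does not need the neuralnet package.

```r
# Hypothetical stand-in for the test set, same column order as in the question
test <- data.frame(A = c(1, 0), B = c(1, 1), C = c(0, 1), Y = c(1, 1))

# ncol(test) is a single number, so its length is always 1
len <- length(ncol(test))

# Negative index 1: drops the FIRST column, so Y leaks into the inputs
dropped_first <- names(test[-length(ncol(test))])

# Positive index 4: keeps ONLY the last column
only_last <- names(test[ncol(test)])

# Negative index 4: drops the last column and keeps the three inputs
inputs_only <- names(test[-ncol(test)])

print(dropped_first)
print(only_last)
print(inputs_only)
```

Of the three expressions, only the negative index on ncol(test) yields exactly the input columns, which is what the prediction step needs if the network was trained on A, B and C.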
Kartheek Palepu