I'm using the nnet package in R to build an ANN that predicts real estate prices for condos (a personal project). I am new to this and don't have a math background, so please bear with me.
I have input variables that are both binary and continuous. For example, some binary variables that were originally yes/no were converted to 1/0 for the neural net. Other variables, such as Sqft, are continuous.
I have normalized all values to be on a 0-1 scale. Maybe Bedrooms and Bathrooms shouldn't be normalized, since their range is only 0-4?
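For reference, here is a minimal sketch of the kind of 0-1 (min-max) scaling described above. It is only an illustration: the toy data frame, the `min_max` helper, and the choice to take the min and max from the training rows are all assumptions, not the actual preprocessing used for the condo data. A small-range count such as Bedrooms can be scaled in exactly the same way as Sqft, and columns that are already 0/1 are left alone.

```r
# Hypothetical toy data; column names mirror the post, values are made up.
condos <- data.frame(
  Sqft      = c(450, 800, 1200, 2000),
  Bedrooms  = c(0, 1, 2, 4),
  Bathrooms = c(1, 1, 2, 3),
  Doorman   = c(1, 0, 1, 0)   # already 0/1, so it is not rescaled
)

# Rescale x linearly so that lo maps to 0 and hi maps to 1.
min_max <- function(x, lo = min(x), hi = max(x)) {
  (x - lo) / (hi - lo)
}

train_rows <- 1:3   # stand-in for the data[1:700, ] training slice
to_scale   <- c("Sqft", "Bedrooms", "Bathrooms")

scaled <- condos
for (col in to_scale) {
  # Take the range from the training rows only, then apply it to every row.
  lo <- min(condos[train_rows, col])
  hi <- max(condos[train_rows, col])
  scaled[[col]] <- min_max(condos[[col]], lo, hi)
}

scaled
```

Because the range comes from the training rows, held-out rows can land slightly outside 0-1 (row 4 here does); that is expected with this approach.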
Do these mixed inputs present a problem for the ANN? I've gotten okay results, but upon closer examination the weights the ANN has chosen for certain variables don't seem to make sense. My code is below, any suggestions?
ANN <- nnet(Price ~ Sqft + Bedrooms + Bathrooms + Parking2 + Elevator +
Central.AC + Terrace + Washer.Dryer + Doorman + Exercise.Room +
New.York.View,data[1:700,], size=3, maxit=5000, linout=TRUE, decay=.0001)
UPDATE: Based on the comments below regarding breaking out the binary inputs into separate fields for each value class, my code now looks like:
ANN <- nnet(Price ~ Sqft + Studio + X1BR + X2BR + X3BR + X4BR + X1Bath
+ X2Bath + X3Bath + X4Bath + Parking.Yes + Parking.No + Elevator.Yes + Elevator.No
+ Central.AC.Yes + Central.AC.No + Terrace.Yes + Terrace.No + Washer.Dryer.Yes
+ Washer.Dryer.No + Doorman.Yes + Doorman.No + Exercise.Room.Yes + Exercise.Room.No
+ New.York.View.Yes + New.York.View.No + Health.Club.Yes + Health.Club.No,
data[1:700,], size=12, maxit=50000, decay=.0001)
The code above uses 12 hidden nodes, but I've tried sizes from 3 to 25 and every one gives worse results than the parameters in the original code I posted. I've also tried it with both linout=TRUE and linout=FALSE.
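A comparison across hidden-layer sizes like the one described could be sketched roughly as follows. This runs on synthetic stand-in data, not the condo data; the column names, the train/test split, the sizes tried, and the use of held-out RMSE as the yardstick are all assumptions made for illustration. Since nnet starts from random initial weights, a seed is set so that the runs are repeatable.

```r
library(nnet)

set.seed(1)

# Synthetic stand-in data (made up); inputs are already on a 0-1 scale.
n <- 200
toy <- data.frame(
  Sqft    = runif(n),
  Doorman = rbinom(n, 1, 0.5)
)
toy$Price <- 0.6 * toy$Sqft + 0.2 * toy$Doorman + rnorm(n, sd = 0.05)

train <- toy[1:150, ]
test  <- toy[151:200, ]

# Fit one network per hidden-layer size and score it on the held-out rows.
sizes <- c(3, 6, 12)
rmse <- sapply(sizes, function(s) {
  fit <- nnet(Price ~ Sqft + Doorman, data = train, size = s,
              linout = TRUE, decay = 1e-4, maxit = 500, trace = FALSE)
  pred <- as.numeric(predict(fit, newdata = test))
  sqrt(mean((test$Price - pred)^2))
})
names(rmse) <- sizes

rmse
```

Scoring on rows the network never saw keeps the comparison between sizes from simply rewarding the model that memorizes the training set best.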
My guess is that I need to feed the data to nnet in a different way because it's not interpreting the binary input properly. Either that, or I need to give it different parameters.
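To make the encoding question concrete, here is a small sketch of how base R's `model.matrix()` expands factor columns into 0/1 indicator columns. As far as I understand, nnet's formula interface builds its inputs this way, but treat that as an assumption rather than a statement about the package internals. The toy data and factor levels are invented. The sketch also shows that a hand-made Yes/No pair carries the same information twice, since one column is always 1 minus the other; whether that redundancy explains the worse results is not something this sketch establishes.

```r
# Hypothetical toy data; names and values are made up for illustration.
condos <- data.frame(
  Sqft     = c(450, 800, 1200, 2000),
  Bedrooms = factor(c("Studio", "1BR", "2BR", "1BR"),
                    levels = c("Studio", "1BR", "2BR")),
  Doorman  = factor(c("Yes", "No", "Yes", "No"), levels = c("No", "Yes"))
)

# model.matrix() turns each factor into 0/1 indicator columns. With an
# intercept in the formula, the first level of each factor is dropped
# and acts as the baseline.
X <- model.matrix(~ Sqft + Bedrooms + Doorman, data = condos)
colnames(X)

# Hand-made Yes/No columns, as in the updated call: exact complements.
Doorman.Yes <- as.numeric(condos$Doorman == "Yes")
Doorman.No  <- as.numeric(condos$Doorman == "No")
all(Doorman.Yes + Doorman.No == 1)
```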
Any ideas?