Context: I want to predict dependent variables such as Age, Siblings, Children, etc. (which are not categorical, but are bounded and integer-valued) from a dataset using a neural network. I'm experimenting with a simple network and plan to train it with gradient descent for starters.

While RMSE is a natural choice of loss to minimize, the predictions are currently float values. I could round the predictions (or take the ceiling/floor) and then compute the loss (again using RMSE), but is that a correct approach?

Ambareesh

1 Answer

Consider the number of children. If, for a given instance, one and two children are equally probable outcomes, then the expected-value prediction is 1.5 children. This is what the (R)MSE will pull you towards.
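
As a rough numerical check of the 1.5-children example, here is a minimal sketch in Python. It assumes NumPy is available; the candidate grid and all variable names are illustrative choices, not anything from the question. It evaluates the expected squared error of each candidate prediction and compares the minimizer with its rounded version.

```python
import numpy as np

# Two equally probable outcomes for one instance: 1 or 2 children.
outcomes = np.array([1.0, 2.0])
probs = np.array([0.5, 0.5])

def expected_squared_error(prediction):
    """Expected squared error of a single point prediction."""
    return float(np.sum(probs * (outcomes - prediction) ** 2))

# Illustrative grid of candidate predictions between 0 and 3 (step 0.01).
candidates = np.linspace(0.0, 3.0, 301)
losses = np.array([expected_squared_error(c) for c in candidates])

# The minimizer of the expected squared error is the mean of the outcomes.
best_prediction = float(candidates[np.argmin(losses)])

# Rounding the minimizer to an integer raises the expected squared error.
rounded_prediction = float(np.round(best_prediction))
loss_at_best = expected_squared_error(best_prediction)
loss_at_rounded = expected_squared_error(rounded_prediction)

print(best_prediction, loss_at_best)
print(rounded_prediction, loss_at_rounded)
```

Under these assumptions the grid search lands on the mean, and the rounded prediction has a strictly larger expected squared error, which is why rounding before computing the loss works against what (R)MSE is optimizing.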

If you want an unbiased prediction of a count (or binary) variable, you will usually get a non-integer value.

Whether this is a problem for you depends on what you plan on doing with your prediction. For instance, you may actually want a quantile prediction, which would indeed be integer.
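
To illustrate the quantile point, here is a minimal sketch in Python, assuming NumPy and a made-up distribution over the number of children (the probabilities below are hypothetical). Quantile predictions are commonly obtained by minimizing the pinball (quantile) loss rather than the squared error; the sketch shows that its minimizer lands on an integer outcome while the mean does not.

```python
import numpy as np

# Hypothetical distribution over the number of children (assumption).
outcomes = np.array([0.0, 1.0, 2.0, 3.0])
probs = np.array([0.30, 0.30, 0.25, 0.15])

def pinball_loss(y, q, tau):
    """Pinball (quantile) loss of prediction q for outcome y at level tau."""
    diff = y - q
    return np.maximum(tau * diff, (tau - 1.0) * diff)

def expected_pinball_loss(q, tau):
    """Expected pinball loss of prediction q under the distribution above."""
    return float(np.sum(probs * pinball_loss(outcomes, q, tau)))

def best_quantile_prediction(tau):
    """Grid search for the prediction minimizing the expected pinball loss."""
    candidates = np.linspace(0.0, 4.0, 401)  # illustrative grid, step 0.01
    losses = [expected_pinball_loss(c, tau) for c in candidates]
    return float(candidates[int(np.argmin(losses))])

mean_prediction = float(np.sum(probs * outcomes))  # what (R)MSE targets
median_prediction = best_quantile_prediction(0.5)  # 50% quantile
upper_prediction = best_quantile_prediction(0.9)   # 90% quantile

print(mean_prediction, median_prediction, upper_prediction)
```

In a neural network, the same idea would amount to swapping the squared-error loss for the pinball loss at the chosen level; whether that suits you depends, as above, on what the prediction will be used for.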

Stephan Kolassa