I have a dataset of about 600 observations, each with around 100 attributes, and I want to predict one of those attributes. Since the attribute I want to predict can only take non-negative integer values, I looked for ways to model count data and found several options, such as Poisson regression and negative binomial regression.
For my first try I used negative binomial regression in R:
# glm.nb() comes from the MASS package
library(MASS)
# Keep only the columns used in the model
dataset <- test_observations[, c(5:8, 54)]
# Fit the negative binomial model
fm_nbin <- glm.nb(NumberOfIncidents ~ ., data = dataset[10:600, ])
I then wanted to see what the predicted values look like:
#Create data to test prediction
newdata <- dataset[1:10, ]
#Do the prediction
predict(fm_nbin, newdata, type="response")
Now the problem is the output looks like this:
1 2 3 4 5 6 7 8 9 10
0.2247337 0.2642789 0.2205408 0.2161833 0.1794224 0.2081522 0.2412996 0.2074992 0.2213011 0.2100026
I expected the predicted values to be integers, since I thought that was the whole purpose of using a negative binomial regression. What am I missing here?
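Since I cannot share the real data, here is a small self-contained sketch that shows the same behaviour on simulated counts. Everything in it (the variable names, the seed, and the parameters passed to rnbinom) is made up for illustration and has nothing to do with my actual dataset.

```r
# Reproducible sketch on simulated data (all names and parameters are hypothetical)
library(MASS)

set.seed(1)
n <- 200
x <- rnorm(n)
# Simulated counts whose mean depends on x through a log link
y <- rnbinom(n, size = 2, mu = exp(0.5 + 0.3 * x))
sim <- data.frame(x = x, y = y)

# Fit on rows 11..n and predict for the first 10 rows
fit <- glm.nb(y ~ x, data = sim[11:n, ])
pred <- predict(fit, newdata = sim[1:10, ], type = "response")
print(pred)
```

The observed y values are integers, but the values returned by predict() are not, just as with my real data.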
Furthermore, I would like to evaluate my predictions in terms of mean squared error and mean absolute error, as well as a correlation coefficient. However, I couldn't find a way to get these easily, without doing all the calculations manually. Is there any built-in function for this?
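For reference, the manual calculation I would like to avoid looks roughly like the sketch below. The helper name eval_metrics and the two example vectors are my own illustrative assumptions, not part of any package or of my data.

```r
# Manual computation of the three metrics (eval_metrics is a hypothetical helper)
eval_metrics <- function(actual, predicted) {
  err <- actual - predicted
  c(
    mse = mean(err^2),            # mean squared error
    mae = mean(abs(err)),         # mean absolute error
    cor = cor(actual, predicted)  # Pearson correlation coefficient
  )
}

# Illustrative values only
actual    <- c(0, 1, 0, 2, 0)
predicted <- c(0.2, 0.8, 0.3, 1.5, 0.1)
metrics <- eval_metrics(actual, predicted)
print(metrics)
```

With my model, the call would be something like eval_metrics(newdata$NumberOfIncidents, predict(fm_nbin, newdata, type = "response")).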