I am trying to create a regression model for this variable (Y) based on 2 categorical variables. So, I created dummy variables to replace them. These dummy variables (i.e. int_collab, Q1, Q2, Q3) take values of 1 and 0.
Y is a double, with values ranging from 0 to 348.19, and about 10% of its values are 0. It seems to follow a Poisson distribution.
But, when I model it:
glm(Y ~ int_collab + Q1 + Q2 + Q3, data = capdata,
family=poisson(link="log"))
It returned Inf for the AIC value. I am guessing something is wrong. Is it because I am not supposed to use Poisson to model this variable?
I have been reading online, but it confused me even further. It seems I should consider using either Negative Binomial or Quasi-Poisson (as the variance is greater than the mean), etc. Any pointers on which distribution would be more suitable would be much appreciated!
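
For reference, here is a minimal sketch in R that reproduces the symptom on simulated data. The data frame sim, the coefficients and the gamma-distributed response are all made up for illustration and are not my capdata. As far as I can tell, the Poisson AIC is computed from dpois(), which returns a log-density of -Inf for non-integer values, so a non-integer Y would be enough to produce an infinite AIC. The quasipoisson fit is included only for comparison, not as a recommendation.

```r
set.seed(42)
n <- 500

# Simulated stand-ins for the real predictors (hypothetical data, not capdata)
quarter <- sample(1:4, n, replace = TRUE)
sim <- data.frame(
  int_collab = rbinom(n, 1, 0.5),
  Q1 = as.integer(quarter == 1),
  Q2 = as.integer(quarter == 2),
  Q3 = as.integer(quarter == 3)
)

# Non-negative, continuous response with roughly 10% exact zeros (assumed shape)
mu <- exp(1 + 0.5 * sim$int_collab + 0.3 * sim$Q1)
sim$Y <- rgamma(n, shape = 2, rate = 2 / mu)
sim$Y[runif(n) < 0.1] <- 0

# Poisson fit: R warns about non-integer values, and the AIC comes out infinite
fit_pois <- suppressWarnings(
  glm(Y ~ int_collab + Q1 + Q2 + Q3, data = sim, family = poisson(link = "log"))
)
AIC(fit_pois)

# Quasi-Poisson fit: same mean model, dispersion estimated from the data
fit_quasi <- glm(Y ~ int_collab + Q1 + Q2 + Q3, data = sim,
                 family = quasipoisson(link = "log"))
summary(fit_quasi)$dispersion
coef(fit_quasi)
```

In this sketch the two fits give the same coefficient estimates. The quasi-Poisson fit has no full likelihood, so its AIC is NA rather than Inf.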