The random variable on which I am seeking to fit a GLM is the number of times a patient has a blood glucose level measurement above a specified threshold before they are escalated onto a stronger therapy. My goal is to examine the fitted model to determine if there are any factors which are associated with an increased number of measurements before therapy escalation.
The type of hypothesis I wish to test is: being male is associated with having more blood glucose measurements above the escalation threshold than being female.
Initially I fitted a Poisson regression model and interpreted the coefficient associated with being male as the average increase in the number of measurements before escalation associated with gender = male.
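For reference, here is a minimal sketch of how a Poisson coefficient reads under the usual log link. The counts and variable names are made up for illustration and are not from the study. With a single binary predictor the maximum-likelihood fit has a closed form, so no fitting library is needed: the intercept is the log of the female mean, and the male coefficient is the log of the ratio of the two means. This makes exp(coefficient) a multiplicative effect on the expected count rather than an additive one.

```python
import math

# Hypothetical counts of above-threshold measurements before escalation.
counts_female = [2, 3, 1, 4, 2]
counts_male = [4, 5, 3, 6, 6]

mean_female = sum(counts_female) / len(counts_female)
mean_male = sum(counts_male) / len(counts_male)

# Closed-form MLE for log E[y] = b0 + b1 * male with one binary predictor:
# the fitted group means equal the sample group means.
b0 = math.log(mean_female)
b1 = math.log(mean_male / mean_female)

rate_ratio = math.exp(b1)               # multiplicative effect, male vs female
additive_gap = mean_male - mean_female  # difference in expected counts

print(f"b1 = {b1:.3f}, exp(b1) = {rate_ratio:.3f}, difference = {additive_gap:.3f}")
```

In this toy data the male mean is twice the female mean, so the coefficient is log(2), while the additive difference in expected counts is a separate quantity.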
However, it seems this process is not strictly a Poisson process, because I am not counting events in a fixed time interval. Because the patients have medical histories of different lengths, I have to adjust for this by including year of treatment start as a predictor variable in the model.
y = the vector of counts of above-threshold measurements observed in patients 1 through n
X = the matrix of predictors, a.k.a. regressors (n x m), where m is the number of explanatory variables, including gender and year of treatment start.
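Written out, the model would look roughly as follows, assuming the canonical log link. Here x_i denotes the i-th row of X and beta_0 is an intercept. The offset term log t_i is an assumption added for illustration only: t_i would be a per-patient exposure (for example, length of follow-up before escalation), which is one common way to handle unequal observation windows instead of, or alongside, a year-of-treatment-start predictor. Setting t_i = 1 for every patient recovers the model without an offset.

```latex
y_i \mid x_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \log t_i + \beta_0 + x_i^{\top}\beta, \qquad i = 1, \dots, n
```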
I did consider using a negative binomial GLM, which would allow for the fact that the variance does not equal the mean for this random variable. Is there a better technique?
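As a quick first check on whether the Poisson equal mean and variance assumption is plausible, the sample variance of the counts can be compared with the sample mean. The sketch below uses made-up counts and is only a rough marginal check: it ignores the predictors, so a model-based diagnostic (such as the Pearson chi-square statistic divided by the residual degrees of freedom) would be more informative once a model has been fitted.

```python
import statistics

# Hypothetical counts of above-threshold measurements per patient.
counts = [0, 1, 1, 2, 2, 3, 5, 8, 12, 16]

mean_count = statistics.mean(counts)
var_count = statistics.variance(counts)  # sample variance (n - 1 denominator)

# Under a Poisson model this ratio should be close to 1;
# values well above 1 suggest overdispersion.
dispersion_ratio = var_count / mean_count

print(f"mean = {mean_count:.2f}, variance = {var_count:.2f}, ratio = {dispersion_ratio:.2f}")
```

A ratio far above 1, as in this toy data, is the kind of pattern that would motivate a negative binomial or quasi-Poisson model over a plain Poisson one.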