I have event data that is a natural fit for Poisson regression, but I'm not sure how to discretize it into bins in a canonical or "best" way. One intuition is that I should use time units "small" enough that each bin contains either zero or one event, but I'd like to know whether there is any mathematical justification for this. Here is a very simple example to illustrate my confusion about binning the event data.
Consider the following data and two candidate Poisson arrival models (the numerical value given for each model is its expected number of arrivals, "lambda", in that interval):
t = 0, data = 56, model_1 = 54, model_2 = 40
t = 1, data = 40, model_1 = 38, model_2 = 56
t = 2, data = 24, model_1 = 26, model_2 = 10
t = 3, data = 8, model_1 = 10, model_2 = 26
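A few simple calculations show that the preferred model depends on how the intervals are aggregated (writing "A > B" for "A fits the data better than B"):
1. model_1 > model_2 when everything is aggregated into one interval
2. model_2 > model_1 when the data are aggregated into two intervals (t = 0-1 and t = 2-3)
3. model_1 > model_2 when all four original intervals are kept
I think one could construct examples where the preference switches back and forth to arbitrary depth in this fashion.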
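For concreteness, here is a minimal sketch of one way to check the three comparisons. It assumes "better" means a higher Poisson log-likelihood, sum over bins of k*log(lambda) - lambda - log(k!), which is my assumption about the scoring rule. The helper names `aggregate` and `poisson_loglik` are made up for illustration.

```python
import math

data = [56, 40, 24, 8]
model_1 = [54, 38, 26, 10]
model_2 = [40, 56, 10, 26]

def aggregate(values, n_bins):
    """Sum adjacent entries so that `values` collapses into n_bins equal-width bins."""
    size = len(values) // n_bins
    return [sum(values[i:i + size]) for i in range(0, len(values), size)]

def poisson_loglik(counts, rates):
    """Poisson log-likelihood of independent counts given expected counts (lambdas)."""
    return sum(
        k * math.log(lam) - lam - math.lgamma(k + 1)  # lgamma(k + 1) = log(k!)
        for k, lam in zip(counts, rates)
    )

# Assumed scoring rule: the model with the higher log-likelihood is "better".
results = {}
for n_bins in (1, 2, 4):
    counts = aggregate(data, n_bins)
    ll_1 = poisson_loglik(counts, aggregate(model_1, n_bins))
    ll_2 = poisson_loglik(counts, aggregate(model_2, n_bins))
    results[n_bins] = (ll_1, ll_2)
    winner = "model_1" if ll_1 > ll_2 else "model_2"
    print(f"{n_bins} bin(s): ll_1 = {ll_1:.4f}, ll_2 = {ll_2:.4f}, preferred: {winner}")
```

With this scoring rule the preferred model flips from model_1 (one bin) to model_2 (two bins) and back to model_1 (four bins). The margins in the one-bin and two-bin cases are small, since both models nearly match the aggregated totals.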
Questions:
1. What is the proper way to compare two Poisson regression models on a test set?
2. What is the canonical way to discretize the event process for that comparison?
3. What is the mathematical justification for that "canonical way" (if it exists)?
4. Is the discretization in which each bin contains either zero or one event in any way canonical or preferred, and if so, what is the mathematical reasoning?