Suppose I have a response vector and a factorial design (for simplicity, assume it’s a one-way ANOVA with two treatments). A few Generalized Linear Models (Poisson, Negative Binomial, etc.) are fitted to the data. This is done separately for each of K experimental units. Each unit has a distinct response vector, but the design matrix is the same across units.
There are several ways to decide which particular GLM (i.e., which response distribution) to use for each unit. For example, the choice could be determined by AIC separately for each unit. However, such unit-specific model selection amounts to spending a larger number of effective parameters, so using the same GLM for all units may work better overall.
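To make the per-unit AIC comparison concrete, here is a minimal sketch for one unit, assuming a one-way design with two treatments. It exploits the fact that, in such a design, the fitted cell means are just the sample cell means for both the Poisson and the NB2 model, so only the NB dispersion needs numerical optimization. All function and variable names are illustrative, not from the question.

```python
import numpy as np
from scipy.special import gammaln
from scipy.optimize import minimize_scalar

def poisson_loglik(y, mu):
    # Poisson log-likelihood evaluated at the MLE cell means.
    return np.sum(y * np.log(mu) - mu - gammaln(y + 1))

def nb_loglik(y, mu, alpha):
    # NB2 log-likelihood: Var(y) = mu + alpha * mu^2, with r = 1/alpha.
    r = 1.0 / alpha
    return np.sum(gammaln(y + r) - gammaln(r) - gammaln(y + 1)
                  + r * np.log(r / (r + mu)) + y * np.log(mu / (r + mu)))

def aic_per_unit(y, groups):
    # groups: integer treatment labels; fitted means are the per-cell means
    # for both families, so we only profile out the NB dispersion alpha.
    mu = np.array([y[groups == g].mean() for g in np.unique(groups)])[groups]
    k = len(np.unique(groups))                      # one mean per cell
    aic_pois = -2 * poisson_loglik(y, mu) + 2 * k
    res = minimize_scalar(lambda a: -nb_loglik(y, mu, a),
                          bounds=(1e-6, 100.0), method="bounded")
    aic_nb = 2 * res.fun + 2 * (k + 1)              # one extra parameter
    return {"poisson": aic_pois, "negbin": aic_nb}

rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 20)
y = rng.poisson(5.0, size=40)                       # equidispersed null data
print(aic_per_unit(y, groups))
```

The unit-specific strategy picks whichever AIC is smaller; the shared-GLM strategy would instead sum each family's AIC (or log-likelihood) across all K units before comparing.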
The problem is that there is no treatment effect in any of the units, so I can’t construct a ROC curve: that would require both true positives and true negatives, and the former are absent here. What I can and will do is check whether the test size is preserved, i.e. a “good” strategy should reject the null for about 5% of the units when testing for the treatment effect at the cutoff p-value = 0.05.
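The test-size check can be sketched as a small simulation. This is a minimal illustration, assuming the data are Poisson under the null and the treatment effect is tested with a likelihood-ratio test (chi-squared, 1 df); the names and the choice of test are mine, not from the question.

```python
import numpy as np
from scipy.stats import chi2

def poisson_lrt_pvalue(y, groups):
    # LRT: common mean vs. one mean per treatment (Poisson one-way layout).
    def loglik(y_, mu):
        return np.sum(y_ * np.log(mu) - mu)        # constant term cancels
    mu0 = y.mean()
    mu1 = np.where(groups == 0, y[groups == 0].mean(), y[groups == 1].mean())
    stat = 2 * (loglik(y, mu1) - loglik(y, mu0))
    return chi2.sf(stat, 1)

rng = np.random.default_rng(1)
K, n = 2000, 20                                    # units, obs per treatment
groups = np.repeat([0, 1], n)
pvals = np.array([poisson_lrt_pvalue(rng.poisson(5.0, 2 * n), groups)
                  for _ in range(K)])
print((pvals < 0.05).mean())                       # should be near 0.05
```

In the actual study, the per-unit model (and hence the test) would be whatever each selection strategy chooses, and the rejection fraction across the K units is compared with the nominal 5%.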
I am wondering whether it’s possible to perform a more direct, out-of-sample test. Note that the predicted response equals the corresponding cell mean regardless of which GLM is used. Therefore, I will have to compare not the accuracy of predicting a future response, but the accuracy of predicting the variance of the response (or some other statistic).
For example, suppose for each unit I have 20 observations per treatment. I use 10 observations per treatment to fit the candidate GLMs; each GLM produces an estimate of the response variance in each cell. Using the other half of the sample, I compute the observed variance in each cell and a discrepancy between the observed and predicted values. The discrepancies are then summed across all units.
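The split-half variance check above can be sketched as follows for a single unit, assuming the discrepancy is the squared difference between the model-implied cell variance and the held-out sample variance. The NB dispersion estimate here is a simple method-of-moments plug-in; everything in this sketch is an assumption of mine, not part of the question.

```python
import numpy as np

def predicted_variance(y_fit, model):
    # Per-cell variance implied by the fitted model, from the training half.
    # Poisson: Var = mu.  NB2: Var = mu + alpha * mu^2.
    mu = y_fit.mean()
    if model == "poisson":
        return mu
    alpha = max((y_fit.var(ddof=1) - mu) / mu**2, 0.0)  # moment estimate
    return mu + alpha * mu**2

def discrepancy(y, groups, model, rng):
    # Random half-split within each cell; sum squared variance errors.
    total = 0.0
    for g in np.unique(groups):
        yc = rng.permutation(y[groups == g])
        half = len(yc) // 2
        v_pred = predicted_variance(yc[:half], model)
        v_obs = yc[half:].var(ddof=1)
        total += (v_pred - v_obs) ** 2
    return total

rng = np.random.default_rng(2)
groups = np.repeat([0, 1], 20)
y = rng.poisson(5.0, 40)
for model in ("poisson", "negbin"):
    print(model, discrepancy(y, groups, model, rng))
```

Summing `discrepancy` over the K units would give the overall out-of-sample score for each model-selection strategy; averaging over repeated random splits would reduce the split-to-split noise.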
If you have seen something like that before, please provide suggestions and references.