
I am looking to compare the fit of a zero-inflated mixture model and a Poisson mixture model. The random effects in both models are different.

Comparing the fitted values of both models ignores the complexity of the models, and using model selection methods such as DIC, AIC, etc., is not a straightforward exercise due to the difference in mixture distributions, so I am wondering if there is a better way?

Rob

1 Answer


Probably what you will need to use is the Parametric Bootstrap Cross-fitting Method. Here is the basic procedure:

  1. Fit each model to the data. Estimate the models' parameters and extract your favorite measure of goodness of fit. We will call the model with the higher value for this GoF measure $A$ and the other model $B$. Calculate the difference $d$ between the two measures of GoF, and store that value. (Be sure you are clear about whether higher or lower numbers of your GoF measure indicate a better fit--i.e., whether $d>0$ implies $A$ is better or worse.)
  2. Using the fitted parameters for $A$ from step 1, generate a large number of synthetic datasets (say 1000). With each of these datasets, fit both of your models, extract their GoF measures, compute $d$ and store it.
  3. Using the fitted parameters for $B$ from step 1, generate another set of (1000) synthetic datasets. With these datasets, again fit your models, compute the $d$s and store them.
  4. You now know what the sampling distribution of $d$ looks like when the true model is $A$ and when the true model is $B$. Determine the cutpoint, $d_\text{cut}$, that optimally differentiates between the models. If you want, you can bring prior knowledge to bear by differentially weighting the alternatives.
  5. Compare your found $d$ from step 1 to $d_\text{cut}$ and select the corresponding model.
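The steps above can be sketched in code. For illustration this uses two simple stand-ins for the OP's mixture models (a Poisson and a geometric distribution, both fitted by maximum likelihood) and the log-likelihood as the GoF measure; the cutpoint rule (midpoint of the two bootstrap medians, i.e. equal prior weight on the alternatives) is one simple choice among several:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def fit_poisson(x):
    # MLE for Poisson: lambda_hat = sample mean; GoF = log-likelihood
    lam = x.mean()
    return lam, stats.poisson(lam).logpmf(x).sum()

def fit_geometric(x):
    # Geometric on {0,1,...} parameterised as nbinom(n=1, p); MLE p = 1/(1+mean)
    p = 1.0 / (1.0 + x.mean())
    return p, stats.nbinom(1, p).logpmf(x).sum()

def pbcm(x, n_boot=200):
    # Step 1: fit both models, compute observed GoF difference d
    lam, ll_a = fit_poisson(x)
    p, ll_b = fit_geometric(x)
    d_obs = ll_a - ll_b  # d > 0 favours the Poisson model (A)

    def boot_d(sampler):
        ds = np.empty(n_boot)
        for i in range(n_boot):
            xb = sampler()
            ds[i] = fit_poisson(xb)[1] - fit_geometric(xb)[1]
        return ds

    n = len(x)
    # Steps 2-3: sampling distribution of d under each fitted model
    d_a = boot_d(lambda: rng.poisson(lam, n))
    d_b = boot_d(lambda: rng.negative_binomial(1, p, n))
    # Step 4: cutpoint; midpoint of medians = equal weight on the alternatives
    d_cut = (np.median(d_a) + np.median(d_b)) / 2.0
    # Step 5: select the model on the observed side of the cutpoint
    return ("poisson" if d_obs > d_cut else "geometric"), d_obs, d_cut

x = rng.poisson(3.0, 200)
choice, d_obs, d_cut = pbcm(x)
print(choice)
```

With Poisson-generated data the observed $d$ falls well on the Poisson side of the cutpoint. For the OP's actual models, the two `fit_*` functions and samplers would be replaced by the zero-inflated and Poisson mixture fits, keeping the bootstrap logic unchanged.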

I demonstrate this approach here. (There is another description of PBCM in this answer: Measures of model complexity.) Here is the reference:

  • Wagenmakers, E.-J., Ratcliff, R., Gomez, P., & Iverson, G. J. (2004). Assessing model mimicry using the parametric bootstrap. Journal of Mathematical Psychology, 48, pp. 28-50.
gung - Reinstate Monica
  • I don't understand how this would help here. Model mimicry tells you how much each model can mimic the other, but as far as I understand it is not helpful in determining model fit. – Henrik Dec 08 '13 at 17:49
  • Perhaps I'm misunderstanding something, @Henrik. It seems to me the question is how to compare the fit of two models that are not nested, and that differ in complexity and in structure / nature. As a result, raw measures of fit can't tell you which model is better, and you cannot perform a nested model test or use information criteria (see #2 in my answer [here](http://stats.stackexchange.com/questions/22902//22905#22905)) to select 1 of the models. You can always use PBCM to compare the fit of 2 models, though. – gung - Reinstate Monica Dec 08 '13 at 18:00
  • Hmmm... re-reading your answer, I see that you focus on the question of how to determine which model is more complex. I am thinking in terms of how to incorporate model complexity in the process of using goodness of fit to select a model. This would be the "data informed" version. – gung - Reinstate Monica Dec 08 '13 at 18:05
  • Yep, that is the problem. Whatever comes out from the model mimicry analysis cannot be combined with the fit values (in the AIC sense). You could only show that one model is, in the mimicry sense, more flexible than the other. But if this runs counter to your fits it would help the OP. So perhaps it is something to use given the lack of alternatives so far. (+1) – Henrik Dec 08 '13 at 18:10