Model with truncated count response

Question

I am trying to select the correct model for analyzing data from a behavioral sciences experiment. The experiment consists of eight trials for each participant, so it yields an integer score from 0 to 8 for each participant. Each trial within the experiment can be conceived of as a binary choice. The ideal model will give a prediction equation for the estimated score, using values for a small number of binary categorical explanatory variables.

I am inclined to assume a Poisson distribution for the response variable, with a log link, or a negative binomial distribution if overdispersion is an issue. However, the true distribution is truncated, since any score above 8 is not possible.

Is such an approach on the right track? Would it be wiser to treat each trial as a separate outcome, then do a binomial regression with a random effect for participant? (I have seen a similar question on this site, but would appreciate any advice beyond what is given there.)

Is the response variable continuous? If not, are the numbers cardinal (like counts), or ordinal (in order, like a Likert response scale), or just numbers assigned to different responses that aren't ordered? — jbowman, Feb 29 '12 at 03:06
@jbowman The numbers represent the number of times a particular response type is observed, so they are cardinal. — , Feb 29 '12 at 03:24
@jbowman To give more context, each participant is asked to arrange eight rubber toys on a table; there are a small number of possible strategies for arranging each toy. The response variable is the number of animals for which a particular type of strategy is employed. — , Feb 29 '12 at 12:26

score 1 · Accepted Answer · answered Feb 29 '12 at 12:56

I'd suggest using a generalized linear model for binomial data, i.e. grouped binary data, with the score 0-8 as the outcome and 8 as the binomial denominator.

As the 8 trials for the same participant aren't independent it's very possible you'll have over- (or under-) dispersion. The simplest way to deal with that within a GLM is to estimate the scale parameter from the data rather than fixing it at the theoretical value.
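To make the suggestion concrete, here is a minimal sketch in plain Python. It is not the output of any particular package. The scores, the single binary predictor `x`, and the helper name `fit_binomial_glm` are all made up for illustration. It fits logit(p) = b0 + b1*x to the grouped data by Newton's method, then estimates the scale parameter as the Pearson chi-square divided by the residual degrees of freedom, instead of fixing it at 1.

```python
import math

TRIALS = 8  # binomial denominator: eight trials per participant

# Hypothetical data: one score (0-8) per participant and one binary predictor.
scores = [0, 8, 1, 7, 2, 8, 5, 8, 3, 8, 8, 6]
x = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]


def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_binomial_glm(y, n, x, iters=25):
    """Fit logit(p) = b0 + b1*x to grouped binomial data by Newton's method."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            p = expit(b0 + b1 * xi)
            w = n * p * (1.0 - p)   # binomial variance = Fisher weight
            r = yi - n * p          # raw residual
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g for the 2x2 information matrix H
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_binomial_glm(scores, TRIALS, x)

# Pearson chi-square over residual degrees of freedom estimates the scale.
pearson = 0.0
for yi, xi in zip(scores, x):
    p = expit(b0 + b1 * xi)
    pearson += (yi - TRIALS * p) ** 2 / (TRIALS * p * (1.0 - p))
phi = pearson / (len(scores) - 2)

print("predicted score, x=0:", TRIALS * expit(b0))
print("predicted score, x=1:", TRIALS * expit(b0 + b1))
print("estimated scale:", phi)
```

An estimated scale well above 1 points to overdispersion, and the model-based standard errors would then be multiplied by the square root of that estimate. In practice a statistics package does all of this for you. In R, for instance, `glm(cbind(score, 8 - score) ~ x, family = quasibinomial)` fits the same kind of model.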

onestop


Thanks, this is helpful. I think this would mean that the logit for a response of 5 (for example) would be 5/(8-5) = 5/3 (the binomial denominators cancel out). – Feb 29 '12 at 15:54
I mean log(5/3) – Feb 29 '12 at 16:03
Yes, but this should be calculated by your statistical software rather than by you. – onestop Feb 29 '12 at 16:41
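For anyone checking the arithmetic in the comments above, here is a small sketch of the empirical logit for a score out of 8. The helper name `empirical_logit` is hypothetical. The denominators cancel, so a score of 5 gives log(5/3).

```python
import math


def empirical_logit(score, trials=8):
    """Log-odds of the observed proportion score/trials."""
    p = score / trials
    return math.log(p / (1.0 - p))  # equals log(score / (trials - score))


print(empirical_logit(5))  # same value as math.log(5 / 3)
```

Note that this quantity is undefined for scores of 0 or 8, which is one reason to let the fitted model work on the logit scale rather than transforming the raw scores yourself.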
