
I am trying to select the correct model for analyzing data from a behavioral sciences experiment. The experiment consists of eight trials for each participant, and each experiment generates an integer score from 0 to 8 for each participant. Each trial within the experiment can be conceived of as a binary choice. The ideal model will give a prediction equation for the estimated score, using values for a small number of binary categorical explanatory variables.

I am inclined to assume a Poisson distribution for the response variable, with a log link, or a negative binomial distribution if overdispersion is an issue. However, the true distribution is truncated, since any score above 8 is not possible.

Is such an approach on the right track? Would it be wiser to treat each trial as a separate outcome, then do a binomial regression with a random effect for participant? (I have seen a similar question on this site, but would appreciate any advice beyond what is given there.)

  • Is the response variable continuous? If not, are the numbers cardinal (like counts), or ordinal (in order, like a Likert response scale), or just numbers assigned to different responses that aren't ordered? – jbowman Feb 29 '12 at 03:06
  • @jbowman The numbers represent the number of times a particular response type is observed, so they are cardinal. –  Feb 29 '12 at 03:24
  • @jbowman To give more context, each participant is asked to arrange eight rubber toys on a table; there are a small number of possible strategies for arranging each toy. The response variable is the number of animals for which a particular type of strategy is employed. –  Feb 29 '12 at 12:26

1 Answer

I'd suggest using a generalized linear model for binomial data, i.e. grouped binary data, with the score 0-8 as the outcome and 8 as the binomial denominator.

As the 8 trials for the same participant aren't independent, it's very possible you'll have over- (or under-) dispersion. The simplest way to deal with that within a GLM is to estimate the scale parameter from the data rather than fixing it at its theoretical value of 1.
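To make the scale-parameter idea concrete, here is a minimal sketch in plain Python. It assumes a single binary explanatory variable, in which case the binomial GLM's fitted probabilities are just the pooled group proportions, so no iterative fitting is needed. The data, the group names and the function names are all invented for illustration; they are not from the experiment in the question.

```python
import math

# Hypothetical scores out of 8 trials (illustrative data only, not real results).
N_TRIALS = 8
scores = {
    "control": [2, 3, 5, 1, 4, 3],
    "treatment": [6, 7, 4, 8, 5, 6],
}

def fit_group_probabilities(scores, n_trials):
    """Binomial MLE per group: total successes divided by total trials."""
    return {g: sum(ys) / (n_trials * len(ys)) for g, ys in scores.items()}

def pearson_dispersion(scores, probs, n_trials):
    """Pearson chi-square divided by the residual degrees of freedom."""
    chi2 = 0.0
    n_obs = 0
    for g, ys in scores.items():
        p = probs[g]
        binomial_var = n_trials * p * (1 - p)  # variance if dispersion were 1
        for y in ys:
            chi2 += (y - n_trials * p) ** 2 / binomial_var
            n_obs += 1
    n_params = len(scores)  # one fitted probability per group
    return chi2 / (n_obs - n_params)

def logit(p):
    return math.log(p / (1 - p))

probs = fit_group_probabilities(scores, N_TRIALS)
phi = pearson_dispersion(scores, probs, N_TRIALS)

# Coefficients on the logit scale, as a GLM with a dummy variable would report them.
intercept = logit(probs["control"])
effect = logit(probs["treatment"]) - intercept

# Predicted scores on the original 0 to 8 scale.
predicted = {g: N_TRIALS * p for g, p in probs.items()}

print(f"dispersion estimate: {phi:.3f}")
print(f"intercept: {intercept:.3f}, treatment effect: {effect:.3f}")
print(f"predicted scores: {predicted}")
```

A value of `phi` well above 1 would indicate overdispersion, and well below 1 underdispersion. In a quasi-binomial fit the coefficient estimates are unchanged, while the standard errors are multiplied by the square root of `phi`. With several explanatory variables you would rely on your statistical software's GLM routine rather than the group-proportion shortcut used here.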

onestop
  • Thanks, this is helpful. I think this would mean that the logit for a response of 5 (for example) would be 5/(8-5) = 5/3 (the binomial denominators cancel out). –  Feb 29 '12 at 15:54
  • I mean log(5/3) –  Feb 29 '12 at 16:03
  • Yes, but this should be calculated by your statistical software rather than by you. – onestop Feb 29 '12 at 16:41
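The arithmetic in the comments above can be checked with a short sketch. The helper names are made up for illustration. It also shows one reason to leave this to the software: the empirical logit is undefined for a score of 0 or 8, whereas a fitted GLM handles such observations through the likelihood.

```python
import math

N_TRIALS = 8

def empirical_logit(score, n_trials=N_TRIALS):
    """log(successes / failures); undefined when score is 0 or n_trials."""
    if score <= 0 or score >= n_trials:
        raise ValueError("empirical logit is undefined for 0 or n_trials successes")
    return math.log(score / (n_trials - score))

def predicted_score(eta, n_trials=N_TRIALS):
    """Map a linear predictor on the logit scale back to an expected score."""
    return n_trials / (1 + math.exp(-eta))

eta = empirical_logit(5)          # log(5/3), the denominators of 8 cancel
round_trip = predicted_score(eta)  # back on the 0 to 8 scale

print(f"logit for a score of 5: {eta:.4f}")
print(f"score recovered from that logit: {round_trip:.4f}")
```

The second function is the prediction equation the question asks for: once the software has estimated the coefficients, the linear predictor is passed through the inverse logit and multiplied by 8.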