
Imagine an experimental setup as follows: 30 subjects each make three preference judgements: they choose a color for each button shape. So, the raw subject data could look like:

Subject  Shape  Color
Sub1     Tri    Red
Sub1     Cir    Green
Sub1     Squ    Green
Sub2     Tri    Blue

...etc. You could aggregate the data to get something like this:

     Red  Green  Blue
Tri   30      0     0
Cir   10     10    10
Squ   10     20     0
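For concreteness, the aggregation step could be sketched as follows. This is a minimal illustration using only the Python standard library; the `records` list holds just the four raw rows shown above (not the full data set), and the names `records`, `aggregate`, `SHAPES`, and `COLORS` are hypothetical.

```python
from collections import Counter

# Illustrative raw records (subject, shape, color): only the four rows
# shown in the question, not the full 30-subject data set.
records = [
    ("Sub1", "Tri", "Red"),
    ("Sub1", "Cir", "Green"),
    ("Sub1", "Squ", "Green"),
    ("Sub2", "Tri", "Blue"),
]

SHAPES = ["Tri", "Cir", "Squ"]
COLORS = ["Red", "Green", "Blue"]


def aggregate(rows):
    """Count how often each (shape, color) pair occurs, ignoring the subject."""
    counts = Counter((shape, color) for _, shape, color in rows)
    return {s: {c: counts[(s, c)] for c in COLORS} for s in SHAPES}


table = aggregate(records)
for shape in SHAPES:
    # One row of the shape-by-color table per line.
    print(shape, [table[shape][c] for c in COLORS])
```

Note that aggregating this way discards the subject identifier, so the table alone no longer shows which three judgements came from the same person.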

Now, to me, it seems incorrect to analyze this table with a chi-square test, because the cells are dependent: each row total is fixed at 30 (e.g., if 20 subjects pick red and 8 pick green for a shape, then only 2 can pick blue).

The hypothesis I want to falsify is roughly one of marginal homogeneity: are certain shapes preferred in certain colors? What test should I use to examine this?

kjetil b halvorsen
Nicholas Root
    Personally, I would use multinomial regression, which can be conveniently implemented using a Poisson log-link glm and including a row factor in the linear model to condition on the row totals (sometimes called the "Poisson trick"). BTW, fixed row totals are not a problem for Pearson's chi-squared test, because it effectively conditions on the row and column totals. The more serious reasons to avoid the chi-squared test are the three zero counts and the fact that it tests only one overall hypothesis rather than breaking it down into individual comparisons. – Gordon Smyth Sep 25 '19 at 22:37
    The simplest method though would be to compare one colour to the two others (e.g., red vs others) and do pairwise exact binomial tests to compare red preference between the shapes. Then do the same for green vs the others and blue vs the others. – Gordon Smyth Sep 25 '19 at 22:41
    I agree within-subject dependence violates chi-squared model assumptions. In order to take this into account in a multinomial regression, you'd probably need a random effect. This... https://stats.stackexchange.com/questions/395114/random-effects-for-a-mixed-multinomial-logistic-regression-in-r ...seems very related. – Christian Hennig Sep 25 '19 at 23:28
    Right, it's really a case of multinomial regression, unless you want to condense the categories, or ignore the dependence. – Sal Mangiafico Sep 25 '19 at 23:29
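
The one-colour-versus-the-rest comparisons suggested in the comments could be sketched roughly as below. Two assumptions are worth flagging: the comment mentions exact binomial tests, whereas this sketch swaps in a two-sided Fisher exact test on each 2x2 table (one possible reading of "exact" for comparing two shapes), and the helper name `fisher_exact_two_sided` is hypothetical. Like the chi-squared test, it treats the 90 judgements as independent, so it does not address the within-subject dependence raised above, and it applies no multiple-comparison correction across the nine tests.

```python
from itertools import combinations
from math import comb

# Aggregated counts from the question (rows: shape; columns: Red, Green, Blue).
COLORS = ["Red", "Green", "Blue"]
table = {
    "Tri": [30, 0, 0],
    "Cir": [10, 10, 10],
    "Squ": [10, 20, 0],
}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # given the observed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


results = {}
for j, color in enumerate(COLORS):
    for s1, s2 in combinations(table, 2):
        # Collapse to "this colour" vs "any other colour" for the two shapes.
        a, c = table[s1][j], table[s2][j]
        b, d = sum(table[s1]) - a, sum(table[s2]) - c
        results[(color, s1, s2)] = fisher_exact_two_sided(a, b, c, d)

for key, p in results.items():
    print(key, f"p = {p:.3g}")
```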
