
I am working with the following data:

Fixed effects:

  • GROUP (each subject only in one group)

  • CONDITION (within-subject variable with 2 conditions: (1) baseline, the same for all groups, and (2) experimental, different for each group)

  • GENDER (of the speaker from the stimuli)

  • GENDER_S (of the subject)

Random Effects:

  • Subject

  • Item (the audio stimuli)

Response Variable: accent ratings converted to z-scores

Research Question: Will the accent ratings differ between the groups in the experimental condition, and if so, between which groups? (All the groups listened to the same audio BUT with different visual stimuli.)

I have 100 responses per participant (40 in one condition and 60 in the other), so 80 x 100 = 8000 responses in total. The model including all of these fixed effects seems to be the best one based on AIC/BIC (I am prioritizing BIC in my case), so I decided to include both GENDER variables in the final model.

Since I was at first interested only in the group:condition interaction (and in which groups differ, if any), I used a mixed-design ANOVA, which showed that the interaction is NOT significant.

I then moved to an LMM and chose "the best" model, which seems to include the genders as well. I tried Anova() on the single model and got SIGNIFICANT group:condition interaction. However, when I did pairwise comparison with glht() and Holm's adjustment there are no significant pairs for group:condition only (averaging on the gender variable). The same happens when I fit a new model with group and condition as the only fixed effects. The outcome is also the same if I use NO adjustment at all.
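
For reference, Holm's step-down adjustment itself can be sketched in a few lines. This is a minimal Python illustration of the procedure, not the multcomp implementation; the function name `holm_adjust` and the raw p-values are hypothetical and are not taken from my data.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for four pairwise contrasts (illustration only).
raw = [0.04, 0.01, 0.03, 0.20]
adj = holm_adjust(raw)
print([round(p, 4) for p in adj])  # [0.09, 0.04, 0.09, 0.2]
```

In this made-up example, three raw p-values are below 0.05 but only one stays below 0.05 after the adjustment. In my case the unadjusted comparisons are already non-significant, so the adjustment alone does not seem to explain the discrepancy.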

Any ideas what one should do in a situation like this? Is it OK to ignore the ANOVA and use the pairwise comparisons, since I am interested in which groups differ in the experimental condition (and I need to make sure that they don't differ in the baseline condition)?

Karvin
  • Is this contrast between significant anova and non-significant post-hoc tests the problem? ***"I tried Anova() on the single model and got SIGNIFICANT group:condition interaction. However, when I did pairwise comparison with glht() and Holm's adjustment there are no significant pairs for group:condition only (averaging on the gender variable)."*** For this see https://stats.stackexchange.com/a/352612/164061 and the links. – Sextus Empiricus Oct 27 '18 at 07:21
  • Possible duplicate of [R Tukey HSD Anova: Anova significant, Tukey not?](https://stats.stackexchange.com/questions/352583/r-tukey-hsd-anova-anova-significant-tukey-not) and [How can I get a significant overall ANOVA but no significant pairwise differences with Tukey's procedure?](https://stats.stackexchange.com/questions/16665/how-can-i-get-a-significant-overall-anova-but-no-significant-pairwise-difference) – Sextus Empiricus Oct 27 '18 at 07:42
  • Thank you! I am just wondering: this seems to be more related to Tukey than to Holm (which I used), or am I looking at it the wrong way? – Karvin Oct 27 '18 at 07:46
  • It is just a different adjustment, but the principle remains the same. For the same anova result you can have different distributions of the pairwise differences which may or may not turn out to be significant in pairwise comparisons. – Sextus Empiricus Oct 27 '18 at 07:48
  • Thank you Martijn. Any idea what the common practice is in a situation like this? Ignore the ANOVA? – Karvin Oct 27 '18 at 07:49
  • Why did you do the post-hoc tests? 1) To find out whether there is a significant result, or 2) just to find out which groups have the strongest difference? If the answer is the second case, then I would ignore the significance levels in the post-hoc test (What is the meaning of significance anyway? Is it an arbiter for whether there is truly a difference or whether your test is precise enough?). – Sextus Empiricus Oct 27 '18 at 07:58
  • 2) because that was my initial research question, what I really need to see is if participants in one of the groups are rating accent lower/higher than in the control group/other groups. The condition variable has 2 levels: baseline (which was the same for all groups) and experiment (which was different for each group - different stimuli shown with audio). It seems that if I ignore the post-hoc I can only say that there is an interaction but I cannot say where and which groups are different. – Karvin Oct 27 '18 at 08:03
  • Could you edit your question and clearly specify the following trio: your research question, your hypothesis, as well as the goal of the research (what can we do with it). – Sextus Empiricus Oct 27 '18 at 10:34
  • Oh, you are right, I was not clear about that. I'm sorry; I have edited it. – Karvin Oct 27 '18 at 23:42
  • The problem with many pairwise comparisons is that you lose a lot of power. How do you define a difference, or do you have some expectation of what the differences may look like? E.g. you could also compare every group with the mean of all groups, giving you only $n$ comparisons instead of $\frac{1}{2}n(n-1)$. – Sextus Empiricus Oct 28 '18 at 07:08
  • Thank you Martijn for your help. One of my groups continues with the same stimuli as in the baseline (so this group is first of all the baseline for the other groups, and was coded as such for the lmer() model; I simply changed the name so that it comes first alphabetically). So first of all I suspect that the groups will differ from this control group. I also think there may be differences among the other groups. Two saw Asian faces (one video, one picture), two saw Caucasian faces (one video, one picture). – Karvin Oct 31 '18 at 01:32
  • For now I was checking the groups pairwise with glht(), there are 5 groups and 2 conditions, not sure if that is a lot - this is a basic model before I even care about the gender. I have 16 subjects in each group - 100 responses per subject, so 1600 data points per group. – Karvin Oct 31 '18 at 01:32
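
The comparison counts discussed in these comments can be tallied directly. The following is a minimal Python sketch, assuming the 5 groups and 2 conditions described above. The variable names are illustrative, and the Bonferroni threshold is shown only as a simple yardstick for how the per-comparison level shrinks (the question itself used Holm's adjustment).

```python
from math import comb

n_groups = 5       # from the comments: 5 groups of 16 subjects each
n_conditions = 2   # baseline and experimental
alpha = 0.05       # conventional level, assumed here for illustration

comparison_sets = {
    # every group against every other group: n(n-1)/2
    "all pairs of groups": comb(n_groups, 2),                      # 10
    # every non-control group against the control group: n - 1
    "each group vs control": n_groups - 1,                         # 4
    # every group against the mean of all groups: n
    "each group vs grand mean": n_groups,                          # 5
    # every group-by-condition cell against every other cell
    "all pairs of cells": comb(n_groups * n_conditions, 2),        # 45
}

for label, k in comparison_sets.items():
    # Bonferroni splits alpha evenly over the k comparisons.
    print(f"{label}: {k} comparisons, per-comparison alpha = {alpha / k:.4f}")
```

The fewer and more targeted the contrasts, the less the per-comparison threshold shrinks, which is the power argument made above.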
