I have a statistical question.

I have data from an experiment with two conditions (dichotomous IV: 'condition'). I also want to use a second IV that is metric ('hh'). My DV is also metric ('attention.hh'). I have already run a multiple regression model with an interaction between my IVs. For this, I centered the metric IV as follows:

hh.cen <- as.numeric(scale(data$hh, scale = FALSE))

With these variables I ran the following analysis:

model.hh <- lm(attention.hh ~ hh.cen * condition, data = data)
summary(model.hh)

The results are as follows:

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        0.04309    3.83335   0.011    0.991
hh.cen             4.97842    7.80610   0.638    0.525
condition          4.70662    5.63801   0.835    0.406
hh.cen:condition -13.83022   11.06636  -1.250    0.215

However, the theory behind my analysis tells me that I should expect a quadratic relation between my metric IV (hh) and the DV (but only in one condition).

Looking at the plot, one could at least suspect such a relation. [Figure: plot not available]

Of course I want to test this statistically. However, I'm unsure whether and when to center the metric IV, because the interaction still matters (the quadratic relation is expected in only one condition).

Do I first center the variable and then compute the quadratic term or the other way round? Do I compute the interaction with both the linear and the quadratic term or only one of them?

My gut feeling would suggest something like this:

hh.sqr <- data$hh * data$hh

sqr.model.hh <- lm(attention.hh ~ hh.sqr + hh.cen * condition, data = data)
summary(sqr.model.hh)

In a nutshell, I want to test whether there is a quadratic relation in one of my conditions. However, I am not sure which terms I have to include in the model (hh.sqr * condition, hh.cen * condition, or both).

Frank Harrell
Mathias
  • Centering gets in the way of understanding the model, and doesn't help anyway. Centering doesn't affect predicted values from the model, and tests of effect combine linear + quadratic terms (2 d.f. test) which is unchanged by centering. Same for interaction effect (2 d.f., interact a variable with both linear and quadratic terms). – Frank Harrell Apr 29 '16 at 14:21
  • Ok, maybe I posted the question the wrong way. The point is that I expect a quadratic relation in one condition, but not in the other, and that's exactly what I want to test. However, I do not know which interaction terms I have to include in the model. Only sqr * condition? Or both hh * condition and sqr * condition? – Mathias Apr 29 '16 at 14:29
  • What you described is an interaction between one simple variable and a 2-column variable; you need to interact the simple variable with both linear and quadratic terms and summarize evidence for interaction (shape change) with a 2 d.f. "chunk" test. – Frank Harrell Apr 29 '16 at 14:32
  • okay, thanks for the comment for the moment. Give me some time to read this :) http://stats.stackexchange.com/questions/27429/what-are-chunk-tests – Mathias Apr 29 '16 at 14:34
  • @FrankHarrell So, the chunk test for linear and quadratic terms would provide evidence for whether or not *both* terms are zero. In the case where one term is zero but the other is not, how would this test help in understanding which was which? – Mike Hunter Apr 29 '16 at 15:06
  • @DJohnson: Thanks for this interesting question. I've never heard of chunk tests before, so I'm still struggling with the idea behind those and whether this helps me. – Mathias Apr 29 '16 at 16:09
  • Another thing that's not clear to me is how chunk tests differ from contrasts. – Mike Hunter Apr 29 '16 at 16:55
  • It is fruitless to try to find out which one is significant, and besides creating a multiplicity problem and inflating type I error, further testing is unreliable. You can get all the inference you need without that other step. – Frank Harrell Apr 29 '16 at 17:01
  • A chunk test is just a contrast consisting of more than one effect. – Frank Harrell Apr 29 '16 at 18:29
  • @FrankHarrell Why fruitless? That response might hold in an academic setting but in most applied contexts, people want -- even demand -- unambiguous answers. – Mike Hunter May 01 '16 at 20:23
  • What makes you think that such 1 d.f. tests are unambiguous? It is precisely because they are ambiguous (especially the test of the linear term is meaningless) that I recommend against it. Academic settings have the same needs as you on this point. Also, testing the nonlinear term and changing the model ruins $p$-values and confidence intervals. The best result comes from just looking at the plotted quadratic fit, and its confidence band (even better: simultaneous confidence bands). – Frank Harrell May 01 '16 at 21:03
  • @FrankHarrell You've misread and misunderstood my point. – Mike Hunter May 03 '16 at 13:53
  • Kindly elaborate – Frank Harrell May 03 '16 at 20:24
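
The approach described in the comments above can be sketched roughly as follows. This is only an illustration on simulated data, not the asker's data or a definitive analysis: the data frame `sim`, its size, and all coefficients are assumptions made up for the example. It fits a model in which condition interacts with both the linear and the quadratic term of hh, tests the two interaction terms jointly (a 2 d.f. "chunk" test, done here as a nested model comparison with `anova()`), and then checks that centering hh before squaring changes neither the fitted values nor that test.

```r
# Simulated data: every name and number below is made up for illustration.
set.seed(1)
n <- 200
sim <- data.frame(
  condition = rep(c(0, 1), each = n / 2),
  hh        = runif(n, 0, 1)
)
# Curvature in hh only when condition == 1 (the pattern the question expects).
sim$attention.hh <- 2 + 1.5 * sim$hh -
  8 * sim$condition * (sim$hh - 0.5)^2 + rnorm(n, sd = 0.5)

# Full model: condition interacts with BOTH the linear and the quadratic term.
full <- lm(attention.hh ~ (hh + I(hh^2)) * condition, data = sim)

# Reduced model: one common quadratic shape, no interaction with condition.
reduced <- lm(attention.hh ~ hh + I(hh^2) + condition, data = sim)

# 2 d.f. chunk test: hh:condition and I(hh^2):condition tested jointly.
chunk <- anova(reduced, full)
print(chunk)

# Centering check: refit both models with hh centered before squaring.
sim$hh.cen  <- sim$hh - mean(sim$hh)
full.cen    <- lm(attention.hh ~ (hh.cen + I(hh.cen^2)) * condition, data = sim)
reduced.cen <- lm(attention.hh ~ hh.cen + I(hh.cen^2) + condition, data = sim)
chunk.cen   <- anova(reduced.cen, full.cen)

# Fitted values and the chunk-test F statistic agree up to rounding error.
print(all.equal(unname(fitted(full)), unname(fitted(full.cen))))
print(all.equal(chunk$F[2], chunk.cen$F[2]))
```

The second row of the `anova()` table carries the joint test, with `Df` equal to 2. Individual coefficients do change under centering, which is one reason the joint test and the plotted fit are easier to interpret than any single 1 d.f. test.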

0 Answers