
Hi, I am trying to find out whether there are confounding and interaction effects in a regression model, but I am not sure how to do it.

The first step is to create the model in the following way:

model <- glm(data$enf ~ data$smoke * data$coffe * data$trat, family = "binomial")

To check for interactions I ran an anova test on the model:

anova(model)

Response: data$enf

Terms added sequentially (first to last)


                                 Df Deviance Resid. Df Resid. Dev
NULL                                                89    123.653
data$smoke                        1   42.790        88     80.863
data$coffe                        1    0.001        87     80.863
data$trat                         1    0.086        86     80.777
data$smoke:data$coffe             1    0.069        85     80.708
data$smoke:data$trat              1   14.197        84     66.511
data$coffe:data$trat              1    0.000        83     66.511
data$smoke:data$coffe:data$trat   1    0.211        82     66.299

I am not sure whether running anova on the model is enough to detect interaction between variables, and I don't know how to interpret the output.

And I don't know how to check for confounding.

Robert Long
Lololo
  • Note that `anova()` in this case can give surprising results. With the default R packages it does a sequential test, adding in predictors one at a time so that the results can differ depending on the order that the predictors were entered into the model. That might not be what you want. See [this page](https://stats.stackexchange.com/q/20452/28500) for the different types of ANOVA analyses. – EdM Jan 01 '20 at 21:59
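
The order dependence EdM describes can be sketched on simulated data (made-up variables, not the asker's dataset): when two predictors are correlated, the sequential (Type I) test attributes deviance to whichever term enters first.

```r
# Illustrative sketch: sequential anova() results depend on term order
# when predictors are correlated. Simulated data, hypothetical variables.
set.seed(1)
n  <- 200
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, plogis(2 * x1 - 1))  # x2 is correlated with x1
y  <- rbinom(n, 1, plogis(1.5 * x1))    # only x1 actually affects y

m12 <- glm(y ~ x1 + x2, family = "binomial")
m21 <- glm(y ~ x2 + x1, family = "binomial")

anova(m12)  # deviance attributed to x1 first; x2 adds almost nothing
anova(m21)  # entered first, x2 now absorbs part of x1's deviance
```

The fitted coefficients are identical in both models; only the sequential deviance decomposition changes with the order of terms.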

1 Answer


To see if there is a statistically significant interaction, you need to inspect the p-values, and the sizes, of the estimated interaction coefficient(s). In R you can obtain these from

> summary(model)

An interaction occurs when the effect of one variable on the outcome changes at different values of another variable, where that "variable" could itself be another interaction. anova(model) isn't going to help you here.
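
As a hedged sketch (simulated data; the names smoke/coffe/trat mirror the question but the values are made up), the interaction rows of the summary table carry the estimates and Wald p-values you would inspect:

```r
# Simulated example: fit the three-way interaction model and read the
# coefficient table. Row names like "smoke:trat" are the interaction terms;
# the "Pr(>|z|)" column holds the Wald p-value for each term.
set.seed(42)
n     <- 300
smoke <- rbinom(n, 1, 0.5)
coffe <- rbinom(n, 1, 0.5)
trat  <- rbinom(n, 1, 0.5)
# the true model here contains a smoke:trat interaction
eta <- -1 + 1.2 * smoke + 0.1 * coffe + 1.5 * smoke * trat
enf <- rbinom(n, 1, plogis(eta))
dat <- data.frame(enf, smoke, coffe, trat)

model <- glm(enf ~ smoke * coffe * trat, data = dat, family = "binomial")
coef(summary(model))  # estimates, standard errors, z values, p-values
```

With the full three-way expansion there are eight rows: the intercept, three main effects, three two-way interactions, and the three-way interaction.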

Confounding is an entirely different problem. Let's say you have a main exposure, say coffee drinking (since this seems to be involved in your data) and an outcome, say lung cancer. A simple bivariate analysis may find a positive association, leading to the possible claim that drinking coffee causes lung cancer. However, if we introduce another variable, smoking, we may find that the association between coffee and cancer disappears and we find a positive association of smoking with lung cancer. This is because smoking is associated with coffee drinking and lung cancer. In order for a variable to be a confounder it must be a cause, or a proxy of a cause, for both the outcome and the exposure.
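
That coffee/smoking/cancer story can be sketched with simulated data (entirely made up, for illustration only): smoking causes both coffee drinking and cancer, while coffee has no direct effect, and the crude coffee association shrinks once smoking is adjusted for.

```r
# Toy confounding sketch: smoking -> coffee and smoking -> cancer,
# with no direct coffee -> cancer effect. Simulated data.
set.seed(7)
n      <- 5000
smoke  <- rbinom(n, 1, 0.4)
coffee <- rbinom(n, 1, plogis(-1 + 2 * smoke))  # smokers drink more coffee
cancer <- rbinom(n, 1, plogis(-3 + 2 * smoke))  # coffee plays no causal role

crude    <- glm(cancer ~ coffee,         family = "binomial")
adjusted <- glm(cancer ~ coffee + smoke, family = "binomial")

coef(crude)["coffee"]     # spurious positive association with cancer
coef(adjusted)["coffee"]  # shrinks toward zero after adjusting for smoking
```

Comparing the crude and adjusted coefficients for the exposure like this is a common informal check for confounding, alongside the substantive causal reasoning described above.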

Robert Long