Consider a data frame ("df") with three variables (Happiness, Smoke, Depression), where (1) Happiness (DV) is a continuous measure of happiness on a 1-10 scale, (2) Smoke (IV1) is a categorical variable indicating whether the person smokes (yes/no), and (3) Depression (IV2) is a continuous measure of depression on a 1-10 scale.
Happiness <- c(1, 2, 5, 6, 2, 7, 7, 3, 8, 9)
Smoke <- c("yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no")
Depression <- c(6, 8, 2, 1, 10, 4, 5, 1, 2, 3)
df <- data.frame(Happiness, Smoke, Depression)
Suppose I want to test whether the Smoke x Depression interaction predicts Happiness (in other words, whether the interaction between the two independent variables predicts the dependent variable). So I fit this model:
summary(lm(data = df, Happiness ~ Smoke*Depression))
which gives me this:
Call:
lm(formula = Happiness ~ Smoke * Depression, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-2.0000 -1.0460 -0.3788  0.8905  3.7826

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)           5.6154     2.4745   2.269   0.0637 .
Smokeyes             -1.3110     3.2748  -0.400   0.7028
Depression            0.5769     0.9489   0.608   0.5655
Smokeyes:Depression  -0.7943     1.0011  -0.793   0.4578
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.164 on 6 degrees of freedom
Multiple R-squared:  0.6098,  Adjusted R-squared:  0.4147
F-statistic: 3.125 on 3 and 6 DF,  p-value: 0.1092
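To make individual entries of this output easier to refer to, the coefficient table and the overall F-test can be pulled out of the summary object. This is only a minimal sketch that rebuilds the same data; the names `fit`, `coefs`, `fstat`, and `p_overall` are introduced here for illustration and are not part of the original code.

```r
# Rebuild the example data so the block runs on its own
Happiness  <- c(1, 2, 5, 6, 2, 7, 7, 3, 8, 9)
Smoke      <- c("yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no")
Depression <- c(6, 8, 2, 1, 10, 4, 5, 1, 2, 3)
df <- data.frame(Happiness, Smoke, Depression)

fit <- lm(Happiness ~ Smoke * Depression, data = df)

# Coefficient table as a matrix with columns
# "Estimate", "Std. Error", "t value", "Pr(>|t|)"
coefs <- coef(summary(fit))
print(coefs[, "Pr(>|t|)"])   # one t-test p-value per coefficient

# The "p-value" on the last line of the summary comes from the overall F-test
fstat <- summary(fit)$fstatistic   # named vector: value, numdf, dendf
p_overall <- as.numeric(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                           lower.tail = FALSE))
print(p_overall)
```

The `Pr(>|t|)` column holds one p-value per row of the coefficient table, while `p_overall` is the single p-value of the F-test for the model as a whole.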
I need help interpreting this result.
- Is it correct to use Smoke*Depression, rather than Smoke + Depression or Smoke:Depression, if I want to see the main effect of each independent variable as well as their interaction?
- Do the values under Pr(>|t|) give the significance of the main effect of each variable?
- If so, how do I test the main effect of non-smokers? (i.e., why is there only "Smokeyes" and no "Smokeno"?)
- What does Smokeyes:Depression indicate? I suspect it is the interaction between Smoke and Depression. If so, how does its Pr(>|t|) differ from the p-value on the last line of the output (0.1092)?
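As a way of making the questions above concrete, the terms that each formula creates can be listed from the design matrix, and the model can be refit with "yes" as the reference level of Smoke. This is a hedged sketch rather than a definitive answer; `terms_star`, `terms_plus`, `terms_colon`, `Smoke2`, `fit_star`, `fit_long`, and `fit_yes` are hypothetical names introduced here.

```r
# Rebuild the example data so the block runs on its own
Happiness  <- c(1, 2, 5, 6, 2, 7, 7, 3, 8, 9)
Smoke      <- c("yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no")
Depression <- c(6, 8, 2, 1, 10, 4, 5, 1, 2, 3)
df <- data.frame(Happiness, Smoke, Depression)

# Design-matrix column names show which terms each formula produces
terms_star  <- colnames(model.matrix(Happiness ~ Smoke * Depression, data = df))
terms_plus  <- colnames(model.matrix(Happiness ~ Smoke + Depression, data = df))
terms_colon <- colnames(model.matrix(Happiness ~ Smoke:Depression, data = df))
print(terms_star)
print(terms_plus)
print(terms_colon)

# a*b is shorthand for a + b + a:b, so these two fits have identical coefficients
fit_star <- lm(Happiness ~ Smoke * Depression, data = df)
fit_long <- lm(Happiness ~ Smoke + Depression + Smoke:Depression, data = df)
print(all.equal(coef(fit_star), coef(fit_long)))

# "no" is the default reference level because it sorts first alphabetically.
# Assumption for illustration: refit with "yes" as the reference instead.
df$Smoke2 <- relevel(factor(df$Smoke), ref = "yes")
fit_yes <- lm(Happiness ~ Smoke2 * Depression, data = df)
print(coef(fit_yes))
```

With the default treatment contrasts, the first level of a factor is absorbed into the intercept, which is why only "Smokeyes" appears in the original output; after `relevel()`, the coefficient is labelled for the "no" level instead.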
Any help would be much appreciated. Thank you!