I am using logistic regression to analyze some categorical data (a binary response variable and categorical, mostly binary, predictor variables). My model is something like A ~ B, with the hypothesis that the respondent's B has some explanatory power over the choice of A. When I run this regression, only the intercept has a statistically significant p-value.
However, I have another variable, C, that assesses some pre-existing conditions for each respondent. When I run a logit regression on A ~ B + C, C has a very low p-value (statistically significant). That is to say, the pre-existing preferences of each respondent, as reflected by C, appear to have an effect on their choice of A.
My question, then, is whether it is appropriate to add an interaction term for B*C to my regression in this case. When I run the logit regression A ~ B * C (equivalent to A ~ B + C + B:C), B, C, and the interaction term B:C are all highly statistically significant (low p-values). Is this statistically valid? Does it make sense for a variable to become statistically significant when an interaction term is added to the model?
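
To make the pattern concrete, here is a minimal Python sketch. The counts are made up for illustration and are not my data, and it assumes both B and C are binary. Under that assumption A ~ B and A ~ B * C are both saturated models, so the logit coefficients can be read straight off the cell counts as differences of log-odds, with Wald standard errors given by the square root of the sum of reciprocal cell counts. The names counts, log_odds, wald and pooled are purely illustrative. In practice one would fit the models with a proper GLM routine instead.

```python
import math

# Hypothetical counts, NOT real data: (B, C) -> (n with A=1, n with A=0).
counts = {
    (0, 0): (20, 80),
    (1, 0): (50, 50),
    (0, 1): (80, 20),
    (1, 1): (50, 50),
}


def log_odds(cell):
    """Log-odds of A=1 within one cell of (yes, no) counts."""
    yes, no = cell
    return math.log(yes / no)


def wald(estimate, cells):
    """Return (estimate, standard error, two-sided normal p-value).

    For a saturated logit model, the variance of a log-odds contrast is
    the sum of 1/count over every cell count that enters the contrast.
    """
    se = math.sqrt(sum(1.0 / n for cell in cells for n in cell))
    p_value = math.erfc(abs(estimate / se) / math.sqrt(2))
    return estimate, se, p_value


def pooled(b):
    """Collapse the table over C for one level of B."""
    yes = sum(counts[(b, c)][0] for c in (0, 1))
    no = sum(counts[(b, c)][1] for c in (0, 1))
    return yes, no


# A ~ B: the coefficient of B compares B=1 with B=0, ignoring C.
b_alone = wald(log_odds(pooled(1)) - log_odds(pooled(0)),
               [pooled(0), pooled(1)])

# A ~ B * C: each "main effect" is now an effect at the reference
# level (0) of the other variable, not an overall effect.
l00, l10, l01, l11 = (log_odds(counts[k])
                      for k in [(0, 0), (1, 0), (0, 1), (1, 1)])
b_given_c0 = wald(l10 - l00, [counts[(0, 0)], counts[(1, 0)]])
c_given_b0 = wald(l01 - l00, [counts[(0, 0)], counts[(0, 1)]])
interaction = wald(l11 - l10 - l01 + l00, list(counts.values()))

rows = [
    ("A ~ B      : B", b_alone),
    ("A ~ B * C  : B", b_given_c0),
    ("A ~ B * C  : C", c_given_b0),
    ("A ~ B * C  : B:C", interaction),
]
for name, (est, se, p) in rows:
    print(f"{name:<18} coef={est:+.3f}  se={se:.3f}  p={p:.3g}")
```

In these made-up counts the effect of B has opposite signs at the two levels of C, so it cancels when C is ignored. Once the interaction is included, the coefficient of B measures the effect of B at C = 0 only, which is a different quantity from the coefficient of B in A ~ B.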