I don't have SAS or the dataset with me, so I reconstructed this table from memory. This is roughly what I got:

[image: output table, not reproduced here]

After deciding to keep the variables $age$ and $risk$ in my model, I created their interaction term. According to the yellow table, I suppose I should not drop this term, since doing so would significantly affect my model (if I have interpreted it correctly). However, each class variable has several levels, and the test statistic for the $old*standard$ level is not significant.

Question 1: Does it make sense to drop this particular level of the interaction and run the regression on the others?

Question 2: If that does make sense (which I doubt), do I create dummy variables for the reference levels as well? How would I interpret the intercept then?

3x89g2
  • Generally it doesn't make sense to drop levels of a factor in this fashion. Even if you're looking to collapse a categorical variable into a smaller set of categories, I wouldn't do it this way. Sometimes a level of a factor won't differ much from the baseline (reference) category. That on its own isn't a good reason to drop it from the model. – Glen_b Aug 12 '14 at 01:42
  • See also Frank Harrell's answer [here](http://stats.stackexchange.com/questions/21762/can-a-factor-be-changed-to-binomial-levels-to-achieve-model-validation-and-extra) – Glen_b Aug 12 '14 at 01:48
  • Are there young people in your data? Keep in mind that most statistical software drops observations with a missing value on any of the variables in the model. I ask because the estimates of exactly 0 and the N/As for the $p$-values make me suspect that something like that is going on. – Maarten Buis Aug 12 '14 at 07:12
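
As a rough illustration of the intercept question and of what a single interaction level measures, here is a minimal sketch in Python (not SAS). Everything in it is an assumption for illustration only: the two-level factors, the choice of `young` and `preferred` as reference levels, and the response values are all made up and are not the data from the question. It uses the fact that in a saturated linear model with reference-level (dummy) coding, the least-squares coefficients can be read off the cell means.

```python
from statistics import mean

# Made-up responses for each (age, risk) cell; levels and values are
# hypothetical and only serve to illustrate the coding.
cells = {
    ("young", "preferred"): [10.0, 12.0],
    ("young", "standard"): [14.0, 16.0],
    ("old", "preferred"): [20.0, 22.0],
    ("old", "standard"): [25.0, 27.0],
}
cell_mean = {cell: mean(values) for cell, values in cells.items()}

# Reference-level coding with young and preferred as the (assumed) references.
# Intercept: mean response in the reference cell (young, preferred).
intercept = cell_mean[("young", "preferred")]
# Main effects: differences from the reference cell along one factor.
b_old = cell_mean[("old", "preferred")] - intercept
b_standard = cell_mean[("young", "standard")] - intercept
# Interaction: how far the (old, standard) cell departs from additivity.
b_old_standard = cell_mean[("old", "standard")] - (intercept + b_old + b_standard)


def fitted(age, risk):
    """Fitted value from the dummy-coded model for one (age, risk) cell."""
    old = 1 if age == "old" else 0
    standard = 1 if risk == "standard" else 0
    return (intercept + b_old * old + b_standard * standard
            + b_old_standard * old * standard)


for cell in cells:
    print(cell, fitted(*cell), cell_mean[cell])
```

Under these assumptions the reference levels get no dummy variables of their own: they are absorbed into the intercept, which is the expected response for the reference cell. The coefficient on a single interaction level only says how much that cell departs from the additive prediction relative to the reference levels, which is why a small or non-significant value for one level is not, by itself, a reason to remove that level.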
