I have the following group that was created based on an exposure:

Exposed group:

Patient_ID   Exposure   comorbidity1   comorbidity2   comorbidity3   comorbidity4   Age   gender   income   outcome
ptA          1          1              0              1              1              22    M        0        1
ptB          1          0              1              1              1              73    F        1        0
ptC          1          0              0              0              1              55    M        2        0
...

Then I selected an unexposed comparison group: for each person in the exposed group, I chose four age- and sex-matched individuals from the entire population who were not exposed to the factor of interest.

Unexposed Group:

Patient_ID   Exposure   comorbidity1   comorbidity2   comorbidity3   comorbidity4   Age   gender   income   outcome
ptA_match1   0          0              0              1              1              22    M        0        0
ptA_match2   0          1              1              0              0              22    M        1        1
ptA_match3   0          1              0              0              1              22    M        1        1
ptA_match4   0          1              1              1              0              22    M        1        0
...

I ran a logistic glm model as follows in R:

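    library(magrittr)  # provides the %>% pipe

    # Stack the exposed and matched unexposed groups, then fit the logistic model
    matched <- rbind(exposed, unexposed)
    glm(outcome ~ Exposure +
        comorbidity1 + comorbidity2 + comorbidity3 +
        comorbidity4 + Age + gender + income,
        family = "binomial", data = matched) %>% summary()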

My model gives very significant p-values (<2e-16) for both age and sex. The reason I matched for age and sex was to remove the confounding effect of those two variables. Am I using the right model? I am not entirely sure how to explain age/sex being significant confounders after matching.
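To make the question concrete, here is a minimal simulated sketch of the same design, not my real data. The cohort size, the coefficients in `lin_pred`, and the exact 1:4 copying of age and sex are all assumptions chosen for illustration. It shows the pattern I am asking about: age is balanced across exposure groups by construction, yet it can still be a strong predictor of the outcome in the fitted model.

```r
set.seed(1)

# Hypothetical exposed cohort (sizes and distributions are assumptions).
n_exposed <- 500
exposed <- data.frame(
  Exposure = 1,
  Age      = sample(20:80, n_exposed, replace = TRUE),
  gender   = sample(c("M", "F"), n_exposed, replace = TRUE)
)

# Four unexposed matches per exposed person, copying age and sex exactly.
unexposed <- exposed[rep(seq_len(n_exposed), each = 4), ]
unexposed$Exposure <- 0
rownames(unexposed) <- NULL

matched <- rbind(exposed, unexposed)

# Assumed data-generating model: the outcome depends on exposure, age and sex.
lin_pred <- -4 + 0.5 * matched$Exposure + 0.05 * matched$Age +
  0.6 * (matched$gender == "M")
matched$outcome <- rbinom(nrow(matched), 1, plogis(lin_pred))

# Matching balances age across the exposure groups ...
age_balance <- tapply(matched$Age, matched$Exposure, mean)
print(age_balance)

# ... but age and sex can still be associated with the outcome itself.
fit <- glm(outcome ~ Exposure + Age + gender, family = binomial, data = matched)
print(summary(fit)$coefficients)
```

In this sketch the age and sex coefficients describe their association with the outcome, which matching on exposure status does not remove; matching only makes their distribution the same in the exposed and unexposed groups.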

kjetil b halvorsen
  • Hi, there is an interesting answer to a similar question here: https://stats.stackexchange.com/questions/505639/including-matched-variables-in-regression-models – Ray Oct 25 '21 at 09:56
  • The matching method you used forces a fixed ratio of people who were exposed to people who were not exposed. You could have just chosen an uncontrolled sample of unexposed but you used a fixed ratio for each sex-age group combination. This assured that your estimates for the effects of exposed are uncorrelated with estimates for sex, for age, and for the combination. A different sampling method, where the subjects with the adverse outcome are matched to a fixed number without the outcome, with the same age and sex, will account for or remove the effects of age and sex. – David Smith Oct 25 '21 at 20:56
  • Age and sex might be correlated, and their effects on the outcome might not be separable, but, somewhat oddly, exposure is not correlated with the age-sex interaction. – David Smith Oct 25 '21 at 21:11
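The point made in the comments about a fixed matching ratio can be checked directly. The following is a rough sketch on toy data (the cohort of 200 people and the exact 1:4 matching are assumptions, not the asker's data): with a fixed ratio in every age-by-sex cell, the proportion exposed is the same in each cell, so exposure is uncorrelated with age, sex, and their combination.

```r
set.seed(2)

# Toy exposed group (hypothetical size and distributions).
exposed <- data.frame(
  Exposure = 1,
  Age      = sample(20:80, 200, replace = TRUE),
  gender   = sample(c("M", "F"), 200, replace = TRUE)
)

# Exactly four unexposed matches per exposed person, same age and sex.
unexposed <- exposed[rep(seq_len(nrow(exposed)), each = 4), ]
unexposed$Exposure <- 0
rownames(unexposed) <- NULL

matched <- rbind(exposed, unexposed)

# Proportion exposed within every age-by-sex cell (NA where a cell is empty).
prop_exposed <- tapply(matched$Exposure,
                       list(matched$Age, matched$gender), mean)
print(range(prop_exposed, na.rm = TRUE))

# Correlation between exposure and age in the matched sample.
age_cor <- cor(matched$Exposure, matched$Age)
print(age_cor)
```

On real data with inexact or incomplete matching, the cell proportions would vary somewhat, so this check is also a quick way to see how well the matching worked.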
