This is for a case-control study. I need to get a p-value and an odds ratio with confidence intervals from my glm, but I'm unsure of the best approach. I have the glm set up as follows:
lroverall <- glm(diagnosis ~ variant + location, data = overall, family = binomial)
Diagnosis (case/control), variant (yes/no), and location (A,B,C) are all categorical variables taken from my 'overall' dataset.
summary(lroverall) gives the output:
Call:
glm(formula = diagnosis ~ variant + location, family = binomial,
    data = overall)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.42270  -0.73877   0.00005   0.00005   2.67713

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.5603     0.1727   3.244 0.001178 **
variantyes   -1.2194     0.5367  -2.272 0.023095 *
locationA    -1.2050     0.2045  -5.892 3.82e-09 ***
locationB    -4.1156     1.0288  -4.000 6.32e-05 ***
locationC    -0.9249     0.2524  -3.664 0.000248 ***
For the p-value, does it make sense to take the Pr(>|z|) for the variant (0.023)? Does this effectively measure the association between diagnosis and variant while accounting for (removing?) the effect of location? Or would I want a p-value for the overall model, or should I use a different test?
Similarly, is it appropriate to take the odds ratio for the variant (2.95e-01, i.e. about 0.30), calculated as below?
exp(cbind("Odds ratio" = coef(lroverall), confint.default(lroverall, level = 0.95)))
              Odds ratio        2.5 %       97.5 %
(Intercept) 1.751193e+00 1.248321e+00 2.456640e+00
variantyes  2.954030e-01 1.031654e-01 8.458547e-01
locationA   2.996777e-01 2.007040e-01 4.474587e-01
locationB   1.631541e-02 2.172174e-03 1.225467e-01
locationC   3.965552e-01 2.417924e-01 6.503760e-01
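
As a sanity check on where these numbers come from, the variantyes row can be reproduced by hand from the coefficient table. This is only a minimal sketch: it assumes confint.default() is giving Wald (normal-approximation) intervals, i.e. estimate ± z × standard error on the log-odds scale, which are then exponentiated. The names est, se, z, or_variant and ci_variant are my own, and the two input numbers are copied (already rounded) from the summary output.

```r
# Estimate and standard error for variantyes, copied from summary(lroverall)
est <- -1.2194
se  <- 0.5367

# Wald interval on the log-odds scale, then exponentiate to the odds-ratio scale
z <- qnorm(0.975)                           # roughly 1.96 for a 95% interval
or_variant <- exp(est)                      # odds ratio for variant = yes vs no
ci_variant <- exp(est + c(-1, 1) * z * se)  # lower and upper 95% limits

round(c(or_variant, ci_variant), 3)
```

If that assumption holds, the three values should agree with the variantyes row above up to the rounding of the copied inputs. As far as I understand, confint(lroverall) (without .default) uses profile likelihood instead, so its limits may differ slightly from these.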