This question is about interpreting logistic regression output in R. Here is my sample data frame:
  Drugpairs             AdverseEvent  Y    N
1 Rebetol + Pegintron   Nausea       29 1006
2 Rebetol + Pegintron   Anaemia      21 1014
3 Rebetol + Pegintron   Vomiting     14 1021
4 Ribavirin + Pegasys   Nausea        5  238
5 Ribavirin + Pegasys   Anaemia      12  231
6 Ribavirin + Pegasys   Vomiting      1  242
7 Ribavirin + Pegintron Nausea       15  479
8 Ribavirin + Pegintron Anaemia       7  487
9 Ribavirin + Pegintron Vomiting      9  485
This describes how many times a particular drug pair did (Y = yes) or did not (N = no) cause a given adverse medical event. I ran a logistic regression on this dataset in R using the following commands:
mod.form = "cbind(Y,N) ~ Drugpairs * AdverseEvent"
glmhepa.out = glm(mod.form, family=binomial(logit), data=hepatitis.df)
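For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the data frame from the table above and refits the same model. It assumes the two text columns are stored as factors (hence `stringsAsFactors = TRUE`), which is a guess about how `hepatitis.df` was originally read in; the name `coef.table` is introduced only for illustration.

```r
# Rebuild the sample data; the counts are copied from the table above.
# Assumption: Drugpairs and AdverseEvent are factors in the original data.
hepatitis.df <- data.frame(
  Drugpairs = rep(c("Rebetol + Pegintron",
                    "Ribavirin + Pegasys",
                    "Ribavirin + Pegintron"), each = 3),
  AdverseEvent = rep(c("Nausea", "Anaemia", "Vomiting"), times = 3),
  Y = c(29, 21, 14, 5, 12, 1, 15, 7, 9),
  N = c(1006, 1014, 1021, 238, 231, 242, 479, 487, 485),
  stringsAsFactors = TRUE
)

# Same model as above, written as a formula object instead of a string.
mod.form <- cbind(Y, N) ~ Drugpairs * AdverseEvent
glmhepa.out <- glm(mod.form, family = binomial(logit), data = hepatitis.df)

# Coefficient table only (hypothetical helper name).
coef.table <- summary(glmhepa.out)$coefficients
print(round(coef.table, 4))
```

With 9 rows and 9 coefficients the model is saturated, so the fitted proportions equal the observed ones.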
The summary output was as follows (only showing the coefficients table):
                                                    Estimate Std. Error z value
(Intercept)                                          -3.8771     0.2205 -17.586
DrugpairsRibavirin + Pegasys                          0.9196     0.3691   2.491
DrugpairsRibavirin + Pegintron                       -0.3652     0.4399  -0.830
AdverseEventNausea                                    0.3307     0.2900   1.140
AdverseEventVomiting                                 -0.4123     0.3479  -1.185
DrugpairsRibavirin + Pegasys:AdverseEventNausea      -1.2360     0.6131  -2.016
DrugpairsRibavirin + Pegintron:AdverseEventNausea     0.4480     0.5457   0.821
DrugpairsRibavirin + Pegasys:AdverseEventVomiting    -2.1191     1.1013  -1.924
DrugpairsRibavirin + Pegintron:AdverseEventVomiting   0.6678     0.6157   1.085
I understand that the coefficients are on the log-odds scale. I am curious, however, why there is no coefficient for AdverseEventAnaemia, and why there is no coefficient for any combination of the drugs with the adverse event anaemia. (The last 4 rows are the interaction effects of drug pair and adverse event.)
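One check that may be relevant here is which factor level R is using as the baseline. The sketch below is stand-alone and assumes R's default treatment contrasts (`contr.treatment`) are in effect; the names `adverse.event` and `adverse.event.nausea.ref` are hypothetical and only mirror the AdverseEvent column.

```r
# Hypothetical stand-alone factor with the same three levels as AdverseEvent.
adverse.event <- factor(c("Nausea", "Anaemia", "Vomiting"))

# Levels are sorted alphabetically by default, so Anaemia comes first.
print(levels(adverse.event))

# Under treatment contrasts the first level gets a row of zeros and no
# column of its own, i.e. no separate coefficient in the model.
print(contrasts(adverse.event))

# A different baseline can be chosen with relevel(), e.g. Nausea.
adverse.event.nausea.ref <- relevel(adverse.event, ref = "Nausea")
print(levels(adverse.event.nausea.ref))
```

If that is what is happening in my model, the missing Anaemia terms would be absorbed into the intercept and the other coefficients would be differences from that baseline.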