
I am trying to fit a logistic regression model. The data are as follows:

Y: binary (0,1)

The independent variables are as follows:

X1: DrugA: categorical variable. Did the patient take Drug A (yes or no)?

X2: DrugA_Conc: continuous variable. The concentration of Drug A, which can be zero.

X3: DrugB: categorical variable. Did the patient take Drug B (yes or no)?

Please note that in the input dataset, patients belong to one of 3 groups:

  1. Took neither Drug A nor Drug B (placebo)
  2. Took Drug A but NOT Drug B (Drug A alone)
  3. Took Drug A and Drug B (combination)

Please see the code below that generates a representation of the independent variables dataset:

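# Patients who took Drug A: 3 concentrations crossed with Drug B (no/yes)
DrugA <- data.frame(DrugA = factor(1))
DrugA_Conc <- data.frame(DrugA_Conc = seq(from = 0, to = 100, length.out = 3))
DrugB <- data.frame(DrugB = factor(c(0, 1)))
mergd <- merge(DrugA, DrugA_Conc)    # no shared columns, so this is a cross join
mergd2 <- merge(mergd, DrugB)
# Placebo patients: no Drug A (concentration 0) and no Drug B
DrugA_2 <- data.frame(DrugA = factor(0))
DrugA_Conc_2 <- data.frame(DrugA_Conc = rep(0, length.out = 3))
DrugB_2 <- data.frame(DrugB = factor(0))
mergd3 <- merge(DrugA_2, DrugA_Conc_2)
mergd4 <- merge(mergd3, DrugB_2)
# Stack both parts into one dataset
mergd5 <- rbind(mergd2, mergd4)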

My question is the following: how can I fit a logistic model in R to predict the effect of DrugA_Conc on Y with and without DrugB, and obtain predictions for all 4 possible scenarios, that is, the 3 listed above plus:

  4. Did NOT take Drug A but took Drug B (Drug B alone).

I tried the following code in R, but it did not work:

m1 <- glm(Y~ DrugB+DrugA*DrugA_Conc, data=all, family="binomial")
m2 <- glm(Y~ DrugB+I(DrugA*DrugA_Conc), data=all, family="binomial")

Please note that I am not trying to evaluate the interaction between DrugA and DrugA_Conc. Rather, I am trying to produce predictions for the 4 possible drug combination scenarios (Drug A alone, both drugs, neither drug, Drug B alone) from a dataset that contains only the first 3 scenarios. Also, can the same code be used if I take the log of DrugA_Conc, i.e. log(DrugA_Conc)?

In SAS, I found that a trick (multiplying DrugA*DrugA_Conc) can be used to code the model as follows, which allows predictions for all 4 scenarios (Figure 1):

model Y(event='1') = DrugB DrugA*DrugA_Conc/

Figure 1: SAS output

Best regards,

Malek Ik
  • Is there a specific reason to use the indicator for drug A if the drug A concentration variable includes 0? I think it would be easy enough to recreate your scenarios with 2 variables, drugA_concentration and drugB. Then your interaction could be ‘DrugA_Conc*DrugB’ – Tomas Bencomo Apr 04 '20 at 01:16
  • Question is closely related to (and a possible duplicate of) [this question](https://stats.stackexchange.com/questions/372257/). – Ben Apr 04 '20 at 02:06
  • Thank you @Ben-ReinstateMonica!!! Your link is on point. Based on that, I tried the following models: m1 – Malek Ik Apr 04 '20 at 20:58
  • See my answer below. – Ben Apr 04 '20 at 22:56

1 Answer


Your question concerns the use of nested variables in a regression model, which is discussed in general in this related question. In your case you have indicator variables DrugA and DrugB with continuous nested variables DrugA_Conc and DrugB_Conc respectively. Moreover, in your particular case, your drug concentration variables fully determine the initial indicators, so the indicator variables are functions of the concentration variables.

In such cases, because of the functional relationship between the variables, inclusion of both DrugA and DrugA_Conc will mean that you have linearly dependent explanatory variables. Ordinarily, when dealing with nested variables, one ensures that the nested variable only enters the model through an interaction term with the required condition for it to be a meaningful variable. Thus, to use the concentration of Drug A in your model, you would do this through the interaction term DrugA:DrugA_Conc. However, in your case the initial baseline variable DrugA is fully determined by the nested variable DrugA_Conc, and so the interaction becomes redundant, and is equivalent to the baseline model term DrugA_Conc.
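The redundancy of the interaction term can be checked numerically. The following is only a minimal sketch on made-up values: the vector `DrugA_ind` and the rule "indicator is 1 whenever the concentration is above zero" are assumptions for illustration, not something taken from your dataset.

```r
# Hypothetical toy values: the indicator is derived from the concentration
DrugA_Conc <- c(0, 0, 25, 50, 100)
DrugA_ind  <- as.numeric(DrugA_Conc > 0)  # assumed rule: 1 if any Drug A, else 0

# Column that the interaction DrugA:DrugA_Conc would contribute
interaction_col <- DrugA_ind * DrugA_Conc

# TRUE here means the interaction column carries nothing beyond DrugA_Conc
identical(interaction_col, DrugA_Conc)
```

If your data contain rows where Drug A was taken but the recorded concentration is zero, the indicator is no longer a function of the concentration and this check would not apply as written.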

In view of this, I would suggest that your model only include DrugA_Conc (or equivalently DrugA:DrugA_Conc) and not DrugA. If you include the latter then it will show up as contributing nothing extra to the model, since it is a function of a model term that is already included.
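As a rough sketch of what this suggestion might look like in R, the code below fits the model on simulated data and predicts all four scenarios. Everything about the data is an assumption made for illustration: the outcome `Y` is simulated with arbitrary coefficients, and the names `dat`, `fit` and `newdat` are hypothetical. Only `glm()`, `predict()` and `expand.grid()` are standard R functions.

```r
set.seed(1)
n <- 300

# Hypothetical data with only three observed groups: placebo, A alone, A + B
group <- sample(c("placebo", "A_alone", "combo"), n, replace = TRUE)
dat <- data.frame(
  DrugA_Conc = ifelse(group == "placebo", 0, runif(n, 1, 100)),
  DrugB      = factor(ifelse(group == "combo", 1, 0), levels = c(0, 1))
)

# Simulated binary outcome; the coefficients are arbitrary assumptions
eta   <- -1 + 0.02 * dat$DrugA_Conc + 0.8 * (dat$DrugB == 1)
dat$Y <- rbinom(n, 1, plogis(eta))

# Additive logistic model with DrugA_Conc and DrugB only (no DrugA indicator)
fit <- glm(Y ~ DrugB + DrugA_Conc, data = dat, family = binomial)

# Prediction grid covering all four scenarios, including Drug B alone
# (DrugA_Conc = 0, DrugB = 1), which never occurs in the fitted data
newdat <- expand.grid(
  DrugA_Conc = c(0, 50),
  DrugB      = factor(c(0, 1), levels = c(0, 1))
)
newdat$pred <- predict(fit, newdata = newdat, type = "response")
newdat
```

The prediction for Drug B alone is an extrapolation: it relies on the additive structure of the model, since no such patients are in the data. Note also that `log(0)` is `-Inf` in R, so using `log(DrugA_Conc)` directly would fail for the zero-concentration rows. A shifted transform such as `log1p(DrugA_Conc)` is one possible workaround, though whether it suits your data is a modeling choice.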

Ben