In my dataset, there's a binary response, some factors, and some covariates. In particular, there are some covariates that are always present when factor1=="A" and these covariates are always missing (NA) when factor1=="B". This missingness is structural, not random; it doesn't make sense for the covariate to have a value when factor1=="B".
To fit this as a logistic regression, I centered the covariates about the mean, and then set the covariate to 0 wherever it had been NA. I reason that in this way, the effect of the covariate on the response will be nullified whenever factor1=="B", which is appropriate.
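To make the procedure concrete, here is a minimal sketch of the recoding step in plain Python. The toy data, the helper names `center_and_zero_fill` and `linear_predictor`, and the coefficient values are all hypothetical and only for illustration; the mean is assumed to be taken over the observed (non-missing) values. It shows only how the covariate is recoded and why its coefficient drops out of the log-odds for factor1=="B" rows. It does not fit a model.

```python
import math

# Hypothetical toy data: the covariate is structurally missing (NaN)
# whenever factor1 == "B".
factor1 = ["A", "A", "B", "A", "B"]
covariate = [2.0, 4.0, math.nan, 6.0, math.nan]


def center_and_zero_fill(values):
    """Center on the mean of the observed values, then replace NaN with 0."""
    observed = [v for v in values if not math.isnan(v)]
    mean = sum(observed) / len(observed)
    return [0.0 if math.isnan(v) else v - mean for v in values]


def linear_predictor(intercept, beta_b, beta_cov, level, x):
    """Log-odds for one row: intercept + factor1 effect + covariate effect."""
    return intercept + beta_b * (level == "B") + beta_cov * x


x_filled = center_and_zero_fill(covariate)
print(x_filled)  # [-2.0, 0.0, 0.0, 2.0, 0.0]

# For a "B" row the recoded covariate is 0, so the covariate coefficient
# has no influence on the log-odds, whatever its value (made-up numbers).
eta_1 = linear_predictor(0.5, 1.0, 3.0, factor1[2], x_filled[2])
eta_2 = linear_predictor(0.5, 1.0, -7.0, factor1[2], x_filled[2])
print(eta_1 == eta_2)  # True
```

Note that an "A" row whose covariate equals the mean also gets the value 0 after centering, so in this sketch it is the factor1 term, not the covariate, that distinguishes such a row from a "B" row.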
Is this a reasonable approach to fitting these data?
What is the name for this approach, and where (web, textbook) is there a good discussion?