Why do you want to do this? If you have a good reason, there should be no problem. A better question (until you tell us your reason) is what can be achieved by doing this? Let us look at this in a simpler model (the issues will be the same for your more complicated model).
Let the response $y$ be binary, with two predictors: $x_1$ is continuous and $x_2$ is 0/1, that is, also binary. What happens if you estimate two models, one for the subset of data with $x_2=0$ and another for the subset with $x_2=1$?
Let's suppose that the model for the complete data is
$$
\text{logit}(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2
$$
where the last term represents an interaction between $x_1$ and $x_2$ (in more concrete terms, this means that the slope of the continuous predictor $x_1$ is different for the two groups coded by $x_2$). Now, for the subsetted data, it will not be possible to estimate $\beta_2$ and $\beta_{12}$, because $x_2$ is constant within each subset, so we will get two models
$$
\text{logit}(p^0) = \beta_0^0 + \beta_1^0 x_1
$$
and
$$
\text{logit}(p^1) = \beta_0^1 + \beta_1^1 x_1
$$
where the superscript (0 or 1) indicates the subset used ($x_2=0$ or $x_2=1$).
Comparing the models, we see (first using the subset with $x_2=0$) that
$$
\text{logit}(p^0) = \beta_0 + \beta_1 x_1 \equiv \beta_0^0 + \beta_1^0 x_1
$$
and then, using the subset with $x_2=1$:
$$
\text{logit}(p^1) = \beta_0 + \beta_1 x_1 + \beta_2 + \beta_{12} x_1 \equiv \beta_0^1 + \beta_1^1 x_1
$$
From this we see (assuming the full model is the true model) that, from $x_2=0$, $\beta_0 = \beta_0^0$ and $\beta_1 = \beta_1^0$, and from $x_2=1$, $\beta_0+\beta_2 = \beta_0^1$ and $\beta_1 + \beta_{12}= \beta_1^1$.
So from the two separate subsetted models you can estimate the interaction as $\beta_{12}=\beta_1^1 - \beta_1^0$ and the effect of $x_2$ (at $x_1=0$) as $\beta_2 = \beta_0^1 - \beta_0^0$. In more practical terms, this says that the interaction can be recovered as the difference in slope between the two subsetted models, and the effect of $x_2$ as the difference in intercept between the two subsetted models. You could develop similar relations for your more complex model by similar arguments.
So you can recover all the information from the two subsetted models; whether this can be done in a statistically efficient manner is another question!
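If you want to check these relations numerically, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the "true" coefficients, the sample size, and the helper names `solve`, `fit_logit` and `subset_fit` are all made up, and the fit is a bare-bones Newton-Raphson rather than what a statistics package would do. It simulates data from the full model above, fits the full model and the two subsetted models, and compares them.

```python
import math
import random


def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_logit(X, y, iters=25):
    """Maximum likelihood logistic regression via Newton-Raphson."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k                    # score vector X'(y - p)
        info = [[0.0] * k for _ in range(k)]  # information matrix X'WX
        for row, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, row))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * row[i]
                for j in range(k):
                    info[i][j] += w * row[i] * row[j]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Simulate from the full model; these "true" values are made up for illustration.
random.seed(0)
b0, b1, b2, b12 = -0.5, 1.0, 0.8, -0.6
rows = []
for _ in range(2000):
    x1 = random.gauss(0.0, 1.0)
    x2 = random.randint(0, 1)
    eta = b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2
    y = 1 if random.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
    rows.append((x1, x2, y))


def subset_fit(group):
    """Fit logit(p) = intercept + slope * x1 using only rows with x2 == group."""
    sel = [(x1, y) for x1, x2, y in rows if x2 == group]
    return fit_logit([[1.0, x1] for x1, _ in sel], [y for _, y in sel])


# Full model with columns 1, x1, x2, x1*x2, then the two subsetted models.
full = fit_logit([[1.0, x1, x2, x1 * x2] for x1, x2, _ in rows],
                 [y for _, _, y in rows])
sub0 = subset_fit(0)
sub1 = subset_fit(1)

print("full model (b0, b1, b2, b12):", full)
print("subset x2=0 (b0, b1)        :", sub0)
print("beta_2  from subsets        :", sub1[0] - sub0[0])
print("beta_12 from subsets        :", sub1[1] - sub0[1])
```

In this particular, fully interacted model the likelihood splits into one piece per group, so the differences computed from the subsets should agree with the full-model estimates up to numerical tolerance. That need not carry over to a more complex model in which some coefficients are shared between the groups.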