I am building a logistic regression model to predict whether a product meets the standard. The data looks like the sample below.
Each production batch contains several different products.
There are always some production batches in which no product meets the standard, and others in which every product does.
For the batches in which every product failed to meet the standard (like 113144), and the batches in which every product met the standard (like 345118), would it be better to exclude them when building the model?
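
To make the question concrete, here is a minimal sketch of how such batches could be flagged and set aside. The rows, the 0/1 coding of the outcome, and batch 200001 are made up for illustration. Only the batch IDs 113144 and 345118 are taken from the description above.

```python
from collections import defaultdict

# Hypothetical rows of (batch_id, meets_standard), with 1 = meets, 0 = fails.
# Batch 200001 is an invented mixed batch added for contrast.
rows = [
    (113144, 0), (113144, 0),
    (345118, 1), (345118, 1),
    (200001, 1), (200001, 0), (200001, 1),
]

# Collect the set of distinct outcomes observed in each batch.
outcomes = defaultdict(set)
for batch_id, meets in rows:
    outcomes[batch_id].add(meets)

# A batch is uniform when only one outcome value ever occurs in it.
all_fail = sorted(b for b, seen in outcomes.items() if seen == {0})
all_pass = sorted(b for b, seen in outcomes.items() if seen == {1})

# Rows that would remain if the uniform batches were excluded.
uniform = set(all_fail) | set(all_pass)
rows_mixed = [r for r in rows if r[0] not in uniform]

print("all fail:", all_fail)
print("all pass:", all_pass)
print("rows kept:", len(rows_mixed))
```

This only shows which rows an exclusion would drop. Whether dropping them is a good idea is exactly what I am asking.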
Thank you.