About 18% of the patients in a study are female. Is that enough for us to use Gender as a factor?

More generally, how does one decide whether there are enough observations such that the factor should be included?

What if it were only 10%? 5%? 2%?

shaucehi

1 Answer

More generally, how does one decide whether there are enough observations such that the factor should be included?

What if it were only 10%? 5%? 2%?

It depends on several things: the sample size, how important you expect the feature to be a priori, how many other features are in the model, how variable the dependent variable is, and the purpose of the model.
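To make the dependence on sample size and outcome variability concrete, here is a rough sketch of how the standard error of a two-group difference in means grows as the minority group shrinks. It assumes equal outcome variance in both groups and no other covariates; the study size of 200 patients and `sigma = 1` are made-up illustration values, not figures from the question.

```python
import math

def se_of_group_difference(n, minority_fraction, sigma=1.0):
    """Standard error of the difference in means between two groups,
    assuming the same outcome standard deviation sigma in both groups."""
    n_minority = n * minority_fraction
    n_majority = n - n_minority
    return sigma * math.sqrt(1.0 / n_minority + 1.0 / n_majority)

# Hypothetical study of 200 patients; sigma = 1 puts effects in SD units.
for fraction in (0.50, 0.18, 0.10, 0.05, 0.02):
    se = se_of_group_difference(200, fraction)
    print(f"{fraction:>4.0%} minority: n = {200 * fraction:.0f}, SE = {se:.2f}")
```

The smaller the minority group, the larger the effect has to be before the data can distinguish it from noise, which is why the same percentage can be plenty in a large study and useless in a small one.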

In general, if a feature is almost always constant, you should remove it. If you're in doubt (and it's not so close to being constant that it threatens identifiability), use your favorite model-selection method to decide.
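As one possible instance of "your favorite model-selection method", the sketch below compares AIC for a model with and without a single binary factor. AIC is just my pick for illustration, and the data are simulated under assumptions I made up (200 patients, 36 of them female, a true effect of 0.5 SD). With only one binary factor, the fitted values are simply the group means, so no regression library is needed.

```python
import math
import random

def compare_models(y, in_minority):
    """Return RSS and AIC for a mean-only model and a model with one binary factor."""
    n = len(y)
    minority = [v for v, m in zip(y, in_minority) if m]
    majority = [v for v, m in zip(y, in_minority) if not m]
    if not minority or not majority:
        # A constant factor cannot be separated from the intercept.
        raise ValueError("factor is constant, so its effect is not identifiable")

    def rss(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    rss_reduced = rss(y)                      # one common mean
    rss_full = rss(minority) + rss(majority)  # one mean per group

    def aic(rss_value, n_params):
        # Gaussian AIC, dropping constants shared by both models
        return n * math.log(rss_value / n) + 2 * n_params

    return {
        "rss_reduced": rss_reduced,
        "rss_full": rss_full,
        "aic_reduced": aic(rss_reduced, 2),  # mean + error variance
        "aic_full": aic(rss_full, 3),        # two means + error variance
    }

# Simulated data under the assumptions stated above (not real patient data).
random.seed(0)
n, n_female = 200, 36
is_female = [i < n_female for i in range(n)]
y = [random.gauss(0.5 if f else 0.0, 1.0) for f in is_female]

result = compare_models(y, is_female)
print("AIC without Gender:", round(result["aic_reduced"], 2))
print("AIC with Gender:   ", round(result["aic_full"], 2))
```

The model with the lower AIC is preferred. In a real analysis with other covariates you would fit both models with a regression package and compare them the same way, or use cross-validation or a likelihood-ratio test instead.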

Kodiologist