I want to do a univariate analysis on a set of variables to see which predict a binary outcome. I want to discard some of them before performing logistic regression.

I am trying to understand whether I can rely on the F-test output (as provided by f_classif in sklearn) when my variables are non-normal and the outcome is binary.

I understand that in an OLS regression problem this F-test compares the residual variance of an intercept-only model with the residual variance of a model that includes the variable. So I would think the original distribution of the predictor variables is not a problem. I would expect the same to hold in logistic regression, but I can't find any background on f_classif for binary outcomes, and I don't understand which residuals are being compared.
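To make the question concrete, here is a minimal sketch on synthetic data (the data, seed, and variable names are all made up for illustration). As far as I can tell from the scikit-learn documentation, f_classif computes a one-way ANOVA F-value per feature, with the outcome classes as the groups. If that reading is right, the feature plays the role of the response and the "residuals" are deviations of the feature from the overall mean versus from the class means. The sketch checks this numerically against scipy.stats.f_oneway and against a hand-computed nested-model F statistic.

```python
import numpy as np
from scipy.stats import f_oneway
from sklearn.feature_selection import f_classif

# Synthetic example (assumption): a skewed, non-normal feature and a binary outcome.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)                     # binary outcome
x = rng.exponential(scale=1.0, size=n) + 0.5 * y   # feature shifted by class
X = x.reshape(-1, 1)                               # f_classif expects a 2D array

# F statistic and p-value as reported by scikit-learn
F_sk, p_sk = f_classif(X, y)

# One-way ANOVA on the feature, grouped by outcome class
F_sp, p_sp = f_oneway(x[y == 0], x[y == 1])

# Same statistic by hand: the feature is the response, the class is the predictor.
# RSS of the "intercept-only" model: deviations from the overall mean.
rss0 = np.sum((x - x.mean()) ** 2)
# RSS of the model with the class indicator: deviations from the class means.
rss1 = sum(np.sum((x[y == k] - x[y == k].mean()) ** 2) for k in (0, 1))
F_manual = ((rss0 - rss1) / 1) / (rss1 / (n - 2))

print(F_sk[0], F_sp, F_manual)
```

On this reading the three numbers should agree up to floating-point error. Note that the sketch only shows what is computed; it does not settle whether the p-value is trustworthy for non-normal features.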

My apologies in advance if this question is basic.

kjetil b halvorsen
Sapiens

0 Answers