I am using a logistic regression estimator with scikit-learn. The estimator I trained predicts the same class all the time. (This is a two-class classification problem.)
The data set consists of 2 classes that are somewhat imbalanced (70% output 1 and 30% output 0). It has about 2000 samples with 8 features each. When I fit a logistic regression, I get only 1's in the output, so the confusion matrix looks like this:
Confusion Matrix:
[[ 0 53]
[ 0 155]]
- I tried varying the regularization parameter (from 1e-6 to 1e8) and nothing changes.
- The data set does not look linearly separable.
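For reference, the sweep over the regularization parameter looks roughly like the sketch below. The data here is a synthetic stand-in generated with `make_classification` (not my real data), and the test split size, the `random_state` values, `max_iter`, and the exact grid of `C` values are assumptions for illustration. Note that `C` in scikit-learn is the inverse of the regularization strength.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data (assumption): ~2000 samples,
# 8 features, roughly 30% class 0 and 70% class 1.
X, y = make_classification(
    n_samples=2000,
    n_features=8,
    n_informative=4,
    n_redundant=0,
    weights=[0.3, 0.7],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=0
)

# Sweep C (inverse regularization strength) from 1e-6 to 1e8.
results = {}
for C in np.logspace(-6, 8, num=8):
    clf = LogisticRegression(C=C, max_iter=1000)
    clf.fit(X_train, y_train)
    # Rows are true classes, columns are predicted classes.
    results[C] = confusion_matrix(y_test, clf.predict(X_test), labels=[0, 1])
    print(f"C={C:g}")
    print(results[C])
```

On my real data, every value of `C` gives the same all-1's confusion matrix shown above; the synthetic data will not necessarily reproduce that.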
I would expect that even the worst logistic regression estimator would at least yield P(y=1)=0.7 and P(y=0)=0.3. Note that when using an SVM with an RBF kernel I get much better results. Below is the confusion matrix of the SVM:
Confusion Matrix:
[[ 48 5]
[ 15 140]]
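To put numbers on the comparison, the sketch below computes accuracy and per-class recall from the two confusion matrices above. It assumes scikit-learn's convention that rows are true classes and columns are predicted classes; the `summarize` helper is just a hypothetical name for illustration.

```python
# Confusion matrices copied from the post.
# Assumed layout: rows = true class (0, 1), columns = predicted class (0, 1).
logreg_cm = [[0, 53], [0, 155]]
svm_cm = [[48, 5], [15, 140]]


def summarize(cm):
    """Return (accuracy, recall for class 0, recall for class 1) of a 2x2 matrix."""
    total = sum(sum(row) for row in cm)
    accuracy = (cm[0][0] + cm[1][1]) / total
    recall_0 = cm[0][0] / sum(cm[0])
    recall_1 = cm[1][1] / sum(cm[1])
    return accuracy, recall_0, recall_1


for name, cm in [("logistic regression", logreg_cm), ("SVM (rbf)", svm_cm)]:
    acc, r0, r1 = summarize(cm)
    print(f"{name}: accuracy={acc:.3f}, recall(0)={r0:.3f}, recall(1)={r1:.3f}")
```

Under that assumption, the logistic regression never identifies a class 0 sample, while the SVM recovers most samples of both classes.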
Any idea why my logistic regression estimator is always predicting the same result?