On my 4-variable logistic regression model, an exhaustive grid search finds a much better solution than sklearn's LogisticRegression.
Grid search delivers:
accuracy: 55.11%
log reg coefficients: 0.58 -0.19 -0.03 0.20
Sklearn LogisticRegression delivers:
accuracy: 53.16%
log reg coefficients: 0.0015 -0.0010 0.0016 0.0002
Am I using sklearn the wrong way, or does sklearn really suck? :O
Below is my code:
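import pandas
import sklearn.linear_model

# First column is the label, the remaining columns are the 4 features.
df = pandas.read_csv("samples.tsv", sep="\t")
# as_matrix() was removed in pandas 1.0; to_numpy() is the replacement.
y = df.iloc[:, 0].to_numpy()
x = df.iloc[:, 1:].to_numpy()
logreg = sklearn.linear_model.LogisticRegression()
print("logreg:", logreg)
logreg.fit(x, y)
print("coefficients:", "\t".join(map(str, logreg.coef_[0])))
# score() reports mean accuracy on the data passed in (here, the training data).
print("accuracy:", logreg.score(x, y))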
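
A minimal sketch of two things that might be worth checking before blaming the library. With default arguments, LogisticRegression applies an L2 penalty (C=1.0) and fits an intercept, and it minimizes a penalized log loss rather than maximizing accuracy, so its coefficients need not match those of a grid search that optimizes accuracy directly. The data below is synthetic (it is not samples.tsv), and the helper name fit_and_report, the sample size, and the C values are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# Synthetic stand-in for samples.tsv (assumption: 4 features, binary label).
rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 4))
# Illustrative weights only, borrowed from the grid-search numbers in the post.
true_w = np.array([0.58, -0.19, -0.03, 0.20])
p = 1.0 / (1.0 + np.exp(-(x @ true_w)))
y = (rng.random(2000) < p).astype(int)


def fit_and_report(C):
    """Fit with inverse regularization strength C and collect basic metrics."""
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(x, y)
    return {
        "coef": model.coef_[0],
        "intercept": model.intercept_[0],
        "accuracy": model.score(x, y),
        "log_loss": log_loss(y, model.predict_proba(x)),
    }


# Smaller C means a stronger penalty; a very large C approximates no penalty.
results = {C: fit_and_report(C) for C in (1e-4, 1.0, 1e6)}

for C, r in results.items():
    print("C =", C)
    print("  coefficients:", "\t".join(map(str, r["coef"])))
    print("  intercept:", r["intercept"])
    print("  accuracy:", r["accuracy"], "log loss:", r["log_loss"])
```

If the coefficients change a lot as C grows, regularization is part of the gap. If they barely move, the difference is more likely down to the objective (log loss versus accuracy) or to whether the grid search included an intercept.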