I am doing the cats-vs-dogs competition, so it's a binary classification problem.

I randomly pick pictures (e.g. N=7000), resize them to 100x100 and convert them to greyscale. Then I use SGDClassifier with loss='log' (logistic regression). But the test accuracy is always approximately 0.5 (±0.03).

It seems like this method does not work. I could always predict [1, ..., 1] and get the same accuracy of 0.5.
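To make the all-ones comparison concrete, here is a minimal sketch in plain Python. It assumes a balanced test set; the 50/50 split and the names `accuracy`, `y_test` and `all_ones` are illustrative and not taken from the actual data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical balanced test set: 50 cats (label 0) and 50 dogs (label 1).
y_test = [0] * 50 + [1] * 50

# Constant prediction: every picture is labelled 1.
all_ones = [1] * len(y_test)

baseline = accuracy(y_test, all_ones)
print(baseline)  # 0.5
```

Any classifier has to beat this constant baseline before its accuracy means anything. On an unbalanced test set the baseline would be the share of the majority class rather than 0.5.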

I tried the LogisticRegression class, which is similar to SGDClassifier with loss='log'. I also tried tuning eta0 and max_iter, scaling the features, shuffling, and changing N, but nothing helped.

I think maybe logistic regression cannot solve this type of problem. Or maybe logistic regression is ineffective with something in my setup. But I can't find the reason.

  • [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) may be helpful. Or not, it's hard to say. – Stephan Kolassa Feb 29 '20 at 12:58
  • Image classification is hard to do without a lot of feature extraction work (HOG/SIFT, etc); alternatively, CNNs do this automatically, but then you'll need to train a neural network. – Sycorax Mar 03 '20 at 03:06
  • @SycoraxsaysReinstateMonica, I still think it's just not possible for linear regression to handle this problem. I didn't get accuracy higher than 0.55 (even this can be considered random) – George Zorikov Mar 03 '20 at 18:45
  • Probably not. That's why researchers invented alternative models. – Sycorax Mar 03 '20 at 22:21
