I have 1-D data: the cosine similarity (COS) between two features. Each value is bounded by [-1, 1].

When I simply search for the optimum threshold on this value to assign class labels, I get an accuracy of about 75% on the testing data.
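For concreteness, a threshold search of this kind can be sketched as follows. This is a minimal illustration, not the code actually used in the question: the function name `best_threshold`, the 0/1 label encoding, and the rule "predict class 1 when the similarity exceeds the threshold" are all assumptions.

```python
import numpy as np

def best_threshold(scores, labels):
    """Return (threshold, accuracy) for the rule `score > threshold -> class 1`
    that maximizes accuracy on the given data (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    s = np.unique(scores)  # sorted distinct scores
    # Candidate cuts: one below all scores, midpoints between neighbours, one above all.
    candidates = np.concatenate(([s[0] - 1.0], (s[:-1] + s[1:]) / 2.0, [s[-1] + 1.0]))
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        acc = np.mean((scores > t).astype(int) == labels)
        if acc > best_acc:  # keep the first threshold that reaches the best accuracy
            best_t, best_acc = t, acc
    return best_t, best_acc
```

The threshold should be chosen on training data only and then applied unchanged to the testing data; tuning it on the testing data itself would inflate the reported accuracy.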

Then I thought I might train a binary classifier instead: a single fully connected linear layer (wx + b), passed through a sigmoid and then to a binary loss function. When I do this, I get only 71% accuracy.
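A classifier of this form on a single input is a 1-D logistic regression. The sketch below is one way it could be written, assuming the binary loss is cross-entropy, labels are 0/1, and plain gradient descent is used; the names `fit_logistic_1d` and `accuracy` and the learning-rate and step-count values are illustrative, not taken from the question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_1d(x, y, lr=0.5, steps=5000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by gradient descent on the mean
    binary cross-entropy (assumed loss; hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(w * x + b)
        # Gradients of the mean cross-entropy with respect to w and b.
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b

def accuracy(x, y, w, b):
    """Accuracy of the rule `sigmoid(w*x + b) > 0.5 -> class 1`."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=int)
    pred = (sigmoid(w * x + b) > 0.5).astype(int)
    return np.mean(pred == y)
```

Cutting the sigmoid output at 0.5 is the same as thresholding the input at x = -b/w (for w > 0), so this model also ends up with a single threshold on the similarity. The difference is that the threshold is placed by minimizing cross-entropy rather than by counting errors directly.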

What do you think is the difference between the two approaches? Why does the threshold method get higher accuracy? What am I doing wrong?

AdamO
eran
  • Very closely related, possibly an outright duplicate: [Classification probability threshold](https://stats.stackexchange.com/q/312119/1352) – Stephan Kolassa Feb 14 '19 at 17:48
  • I think it is quite different, and I will try to emphasise the difference. Here I am wondering what the difference is between just searching for an optimal margin and applying a linear transformation and a loss function – eran Feb 14 '19 at 17:54
  • Searching for an "optimal margin" is nothing else than using a particular (implicit) loss function. – Stephan Kolassa Feb 14 '19 at 17:54
  • So what do you see wrong in the process? The results are very different – eran Feb 14 '19 at 18:10
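
To make the point about implicit loss functions concrete, the two approaches can be written as two optimization problems over the same 1-D input. The notation is an assumption for illustration: $x_i \in [-1, 1]$ are the similarities, $y_i \in \{0, 1\}$ the labels, $\sigma$ the sigmoid, and the positive class is taken to have the larger similarities.

$$
\hat{t} = \arg\min_{t} \; \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\!\left[\, y_i \neq \mathbf{1}[x_i > t] \,\right]
$$

$$
(\hat{w}, \hat{b}) = \arg\min_{w,\, b} \; -\frac{1}{n} \sum_{i=1}^{n} \Big[ y_i \log \sigma(w x_i + b) + (1 - y_i) \log\big(1 - \sigma(w x_i + b)\big) \Big]
$$

The first problem minimizes the 0-1 loss, that is, the error rate itself, on whatever data the threshold is tuned on. The second minimizes cross-entropy (assuming that is the binary loss in use), and its implied decision threshold at $\sigma(\hat{w} x + \hat{b}) = 0.5$ is $x = -\hat{b}/\hat{w}$ for $\hat{w} > 0$. Nothing forces $-\hat{b}/\hat{w}$ to coincide with $\hat{t}$, so the two accuracies can differ even though both rules are a single cut on $x$.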
