I have 1-D data: the cosine similarity (COS) between two features. The values are bounded in [-1, 1].
When I simply search for the optimal threshold to separate the class labels, I get about 75% accuracy on the test data.
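To make the threshold approach concrete, here is a minimal sketch of the kind of search I mean. The function name `best_threshold`, the rule "predict class 1 when the score is above the threshold", and the toy data are assumptions for illustration, not my actual code or data.

```python
import numpy as np

def best_threshold(scores, labels):
    """Return (threshold, accuracy) for the rule `score > threshold -> class 1`
    that maximizes accuracy on the given data (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    uniq = np.unique(scores)  # sorted unique scores
    # Candidate thresholds: midpoints between neighbouring scores,
    # plus one below the minimum and one above the maximum.
    candidates = np.concatenate(
        ([uniq[0] - 1.0], (uniq[:-1] + uniq[1:]) / 2.0, [uniq[-1] + 1.0])
    )
    accs = np.array(
        [np.mean((scores > t).astype(int) == labels) for t in candidates]
    )
    best = int(np.argmax(accs))  # first candidate with the highest accuracy
    return float(candidates[best]), float(accs[best])

# Toy data (assumed): the threshold is picked on the training split
# and would then be applied unchanged to the test split.
train_scores = [-0.8, -0.2, 0.1, 0.6, 0.9]
train_labels = [0, 0, 1, 1, 1]
threshold, train_acc = best_threshold(train_scores, train_labels)
print(threshold, train_acc)
```

Note that this search optimizes accuracy directly, one candidate threshold at a time.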
Then I tried training a binary classifier instead: a single linear fully connected layer (wx + b) whose output is passed through a sigmoid and then into a binary loss function. With this approach I get only 71% accuracy.
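For comparison, the classifier setup can be sketched as below. This is a plain NumPy gradient-descent version rather than the framework code I actually ran, and it assumes the binary loss is cross-entropy; the names `train_logistic_1d` and `implied_threshold`, the learning rate, the epoch count, and the toy data are all assumptions. With a 0.5 cutoff on the sigmoid output, a 1-D model of this form also reduces to a single threshold at -b/w (for w != 0), but that threshold is chosen by minimizing the loss, not by maximizing accuracy.

```python
import numpy as np

def train_logistic_1d(x, y, lr=0.5, epochs=5000):
    """Fit sigmoid(w*x + b) by full-batch gradient descent on the mean
    binary cross-entropy (assumed loss). Returns (w, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # predicted P(class 1)
        grad_w = np.mean((p - y) * x)           # d(loss)/dw
        grad_b = np.mean(p - y)                 # d(loss)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def implied_threshold(w, b):
    """Input value where sigmoid(w*x + b) == 0.5, i.e. w*x + b == 0."""
    return -b / w

# Toy data (assumed), same shape as the cosine scores in [-1, 1].
x_train = [-0.8, -0.2, 0.1, 0.6, 0.9]
y_train = [0, 0, 1, 1, 1]
w, b = train_logistic_1d(x_train, y_train)
print(w, b, implied_threshold(w, b))
```

Comparing `implied_threshold(w, b)` with the threshold found by direct search shows whether the two methods are placing the cut at different points.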
What do you think explains the difference between the two approaches? Why does the threshold method get higher accuracy? What am I doing wrong?