A Siamese network consists of two identical networks that share the same weights. The general workflow is as follows (taken from here):
Suppose that I have 10 images of apples, 10 images of bananas, 10 images of strawberries, and 3 images of oranges. I want to build a system that decides whether a given image is an orange or not, i.e. a binary classification problem. However, I do not have enough orange images to train a classifier directly. Therefore I use a Siamese network, which learns a feature vector that maps similar images (e.g. two oranges, label 1) close together and different images (label 0) far apart.
To train my Siamese network, I feed it one pair of images at a time. For example:
apple1 and orange1
apple1 and orange2
apple1 and orange3
orange1 and orange2
orange2 and orange3
orange1 and orange3
...
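To make the pairing concrete, here is a minimal sketch of how I generate the labeled pairs. The image names and the `make_pairs` helper are placeholders I made up for illustration, and I am assuming that any two images of the same class count as similar (label 1), not only oranges:

```python
from itertools import combinations

# Placeholder names standing in for the real image files (hypothetical).
images = {
    "apple": [f"apple{i}" for i in range(1, 11)],
    "banana": [f"banana{i}" for i in range(1, 11)],
    "strawberry": [f"strawberry{i}" for i in range(1, 11)],
    "orange": [f"orange{i}" for i in range(1, 4)],
}


def make_pairs(images_by_class):
    """Return (image_a, image_b, label) triples.

    label is 1 when both images belong to the same class, 0 otherwise.
    """
    labeled = [
        (name, cls)
        for cls, names in images_by_class.items()
        for name in names
    ]
    pairs = []
    # Every unordered pair of distinct images appears exactly once.
    for (a, cls_a), (b, cls_b) in combinations(labeled, 2):
        pairs.append((a, b, 1 if cls_a == cls_b else 0))
    return pairs


pairs = make_pairs(images)
positives = [p for p in pairs if p[2] == 1]
negatives = [p for p in pairs if p[2] == 0]
```

With only 3 oranges there are just 3 orange-orange pairs, so in practice the positive orange pairs would probably need to be oversampled or augmented.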
Now suppose that the Siamese network is trained and performs well. How can I use it at inference time? What should I do to obtain my binary "orange or not" output? Do I need to use this network as a feature extractor? I am confused at this point.