In Siamese networks, the aim is to pull data from the same class closer together and push data from different classes farther apart.
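As I understand it, this is usually done with something like the contrastive loss on the distance between the two embeddings (this is just my assumed formulation, with $D$ the embedding distance and $m$ a margin):

$$\mathcal{L}(x_1, x_2, y) = y\,D(x_1, x_2)^2 + (1 - y)\,\max\big(0,\; m - D(x_1, x_2)\big)^2,$$

where $y = 1$ if the two inputs come from the same class and $y = 0$ otherwise.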
Suppose we want a face identification system for 5 people (P1, P2, ..., P5), with 3 face images per person, and we want to train a Siamese network for this task. During training, we first shuffle the 5x3 images (Px_y: image y of person x). My question is:
A) Do we give all combinations of pairs, i.e.
Px_y and Px'_y' for every combination of x, x' ∈ {1,2,3,4,5} and y, y' ∈ {1,2,3} (see the pair-generation sketch after these options)?
B) Or do we train a separate Siamese network for each person, such that:
to train the network for P1, we give
P1_y and Px'_y' for every combination of x' ∈ {1,2,3,4,5} and y, y' ∈ {1,2,3};
to train the network for P2, we give
P2_y and Px'_y' for every combination of x' ∈ {1,2,3,4,5} and y, y' ∈ {1,2,3}; and so on.
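To make option A concrete, here is a minimal sketch of how I imagine the pair generation would look (the names images and generate_pairs are placeholders I made up, with file names standing in for the actual image data):

```python
from itertools import combinations

# images[x] is assumed to hold the 3 images of person x (placeholder file names here)
images = {x: [f"P{x}_{y}.jpg" for y in range(1, 4)] for x in range(1, 6)}

def generate_pairs(images):
    """Option A: every unordered pair of distinct images,
    labelled 1 if both come from the same person, 0 otherwise."""
    items = [(x, img) for x, imgs in images.items() for img in imgs]
    pairs = []
    for (x1, img1), (x2, img2) in combinations(items, 2):
        pairs.append((img1, img2, 1 if x1 == x2 else 0))
    return pairs

pairs = generate_pairs(images)
print(len(pairs))                            # C(15, 2) = 105 candidate pairs
print(sum(label for _, _, label in pairs))   # 5 * C(3, 2) = 15 positive pairs
```

(I guess in practice one would also balance the same/different pairs, since most pairs are negatives, but that is beside my main question.)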
Which of these is meant when people say they train a Siamese network? For C classes, can a single network compare any two instances and tell whether they come from the same class, or do we need to train a model for each person, as in one-vs-all? Why do we use Siamese networks: is the network aware of class information, or does it only say whether two inputs are "same" or "not same"? If the latter is the case, I suppose that for face identification we would have to train one Siamese network for each person individually.