
I'm working on a face-comparison project using a Siamese neural network architecture. For research purposes I'm using the LFW, CFPLFW (Front Profile) and CALFW (Cross Age) datasets.

By the way, is there any way to work with these datasets while utilizing the ImageDataGenerator? I have tried writing my own generator, but I'm afraid it doesn't work very well.
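For context, the kind of pair sampling I have in mind looks roughly like this (a simplified sketch, not my actual generator; it assumes a dict mapping each identity to a list of image arrays, and a `same_ratio` parameter controlling the fraction of "same" pairs per batch):

```python
import random
import numpy as np

def pair_generator(images_by_person, batch_size=32, same_ratio=0.5, rng=None):
    """Yield ([left, right], labels) batches; label 0 = same person, 1 = different.
    Requires at least two identities, and at least one identity with >= 2 images."""
    rng = rng or random.Random()
    people = list(images_by_person)
    multi = [p for p in people if len(images_by_person[p]) >= 2]
    while True:
        left, right, labels = [], [], []
        for _ in range(batch_size):
            if rng.random() < same_ratio:
                # "same" pair: two distinct images of one person
                p = rng.choice(multi)
                a, b = rng.sample(images_by_person[p], 2)
                labels.append(0)
            else:
                # "different" pair: one image each from two distinct people
                p, q = rng.sample(people, 2)
                a = rng.choice(images_by_person[p])
                b = rng.choice(images_by_person[q])
                labels.append(1)
            left.append(a)
            right.append(b)
        yield [np.stack(left), np.stack(right)], np.array(labels, dtype="float32")
```

A generator like this can be passed straight to `model.fit`, since the Keras fit loop accepts generators yielding `(inputs, targets)` tuples.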

My network should take two images as input and tell whether they show the same person (label 0) or different people (label 1).

The network code is as follows:

# assumed imports (TensorFlow 2.x / Keras)
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, Dropout,
                                     Flatten, Dense, Lambda)
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras import backend as K

def build_siamese_model(input_shape):
    """
    Builds the Siamese model which will learn similarity function.
    :param input_shape: the input shape to the network
    :return: siamese model
    """
    # The two input images
    first_face_input = Input(input_shape)
    second_face_input = Input(input_shape)

    # Convolutional Neural Network
    image_net_inner_cnn = Sequential()

    image_net_inner_cnn.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape, padding="same"))
    image_net_inner_cnn.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
    # image_net_inner_cnn.add(BatchNormalization(momentum=0.8))
    image_net_inner_cnn.add(MaxPooling2D(pool_size=(2, 2), strides=2, padding='same'))
    image_net_inner_cnn.add(Dropout(0.25))

    image_net_inner_cnn.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
    image_net_inner_cnn.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
    # image_net_inner_cnn.add(BatchNormalization(momentum=0.8))
    image_net_inner_cnn.add(MaxPooling2D(pool_size=(2, 2), strides=2, padding='same'))
    image_net_inner_cnn.add(Dropout(0.25))

    image_net_inner_cnn.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
    image_net_inner_cnn.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
    # image_net_inner_cnn.add(BatchNormalization(momentum=0.8))
    image_net_inner_cnn.add(MaxPooling2D(pool_size=(2, 2), strides=2, padding='same'))
    image_net_inner_cnn.add(Dropout(0.25))
    image_net_inner_cnn.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
    image_net_inner_cnn.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
    image_net_inner_cnn.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
    image_net_inner_cnn.add(MaxPooling2D(pool_size=(2, 2), strides=2, padding='same'))
    image_net_inner_cnn.add(Dropout(0.2))

    image_net_inner_cnn.add(Conv2D(16, (3, 3), activation='relu', padding="same"))
    image_net_inner_cnn.add(Conv2D(16, (3, 3), activation='relu', padding="same"))
    image_net_inner_cnn.add(MaxPooling2D(pool_size=(2, 2), strides=2, padding='same'))
    image_net_inner_cnn.add(Dropout(0.2))

    image_net_inner_cnn.add(Flatten())

    image_net_inner_cnn.add(Dense(256, activation='relu'))
    image_net_inner_cnn.add(Dropout(0.1))
    image_net_inner_cnn.add(Dense(80, activation='relu'))

    image_net_inner_cnn.summary()

    # Generate the feature vectors for the two images -> their "encoding"
    feature_vector_left_img = image_net_inner_cnn(first_face_input)
    feature_vector_right_img = image_net_inner_cnn(second_face_input)

    # Lambda layer -> a custom layer computing the absolute difference between the encodings
    # Note: I could use the Subtract layer, but I want the ABSOLUTE difference.
    abs_diff_layer = Lambda(lambda vect: K.abs(vect[0] - vect[1]))
    distance = abs_diff_layer([feature_vector_left_img, feature_vector_right_img])

    # Fully connected layer with a sigmoid activation to produce the similarity score (0 = same, 1 = different)
    prediction = Dense(1, activation='sigmoid')(distance)  # prediction must be >= 0 -> no tanh

    # connect all parts together
    siamese_net = Model(inputs=[first_face_input, second_face_input], outputs=prediction)

    # siamese_net.summary()

    # return the model
    return siamese_net

I encountered two problems in this project:

  1. Feeding images to the network during training. How do I use the ImageDataGenerator? I think it might be risky to let it run freely, since its sampling is too random and might produce too many "different" samples (randomly drawing two images of the same person has a low probability).

  2. Because of problem (1), I tried writing my own data generator. IMPORTANT NOTE: I added an option to set the percentage of "same" samples in a batch (say 0.8 -> 80% of the samples will be "SAME", the other 20% will be "DIFF").

THE PROBLEM: when I trained my model, I saw that the accuracy changes along with the probability (which my generator takes as an input). If the probability is 0.5, the accuracy is around 0.5; if the probability is 0.9, the accuracy is around 0.7-0.8, which is very weird.
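One thing I checked (a sketch, separate from my generator code): a trivial baseline that always predicts "same" scores an accuracy equal to the fraction of "same" pairs in the batch, so accuracy tracking the mixing probability might just mean the model is collapsing toward one class rather than learning a similarity function.

```python
import numpy as np

def constant_predictor_accuracy(same_ratio, n=10_000, seed=0):
    """Accuracy of a model that always outputs 'same' (label 0) on data
    where a fraction same_ratio of the pairs really are 'same'."""
    rng = np.random.default_rng(seed)
    # label 0 ("same") with probability same_ratio, label 1 ("different") otherwise
    labels = (rng.random(n) >= same_ratio).astype(int)
    preds = np.zeros(n, dtype=int)  # always predict "same"
    return (preds == labels).mean()

for r in (0.5, 0.8, 0.9):
    print(r, constant_predictor_accuracy(r))
```

The printed accuracies come out close to 0.5, 0.8 and 0.9 respectively, which matches the pattern I'm seeing.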

  • A closely related variant of a Siamese neural network is a triplet network. An important part of training triplet nets is how you choose the triplets for computing the loss function; it turns out that this makes a big difference for how the network learns; see: https://stats.stackexchange.com/questions/475655/in-training-a-triplet-network-i-first-have-a-solid-drop-in-loss-but-eventually/475778#475778 I think what you've found is that random pairs don't provide much information, just as for triplet networks. I'd look for a similar paper for Siamese nets – Sycorax Jun 15 '21 at 16:30
