
I have dabbled with Tesseract's CNN-based OCR on handwriting records before and was surprised by its accuracy. I am wondering: is it possible to use it, or something else, to determine whether two samples of handwriting were written by the same person?

I have searched for solutions, but it is difficult to find results for anything other than OCR on handwriting. Has anyone attempted this before, or can anyone point me in the right direction? Thank you.

Sycorax
  • 76,417
  • 20
  • 189
  • 313

1 Answer


This paper seems to do exactly what you want: recognize authorship of handwriting samples, even when the texts don't match.

"DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification" Linjie Xing, Yu Qiao. 2016.

Text-independent writer identification is challenging due to the huge variation of written contents and the ambiguous written styles of different writers. This paper proposes DeepWriter, a deep multi-stream CNN to learn deep powerful representation for recognizing writers. DeepWriter takes local handwritten patches as input and is trained with softmax classification loss. The main contributions are: 1) we design and optimize multi-stream structure for writer identification task; 2) we introduce data augmentation learning to enhance the performance of DeepWriter; 3) we introduce a patch scanning strategy to handle text image with different lengths. In addition, we find that different languages such as English and Chinese may share common features for writer identification, and joint training can yield better performance. Experimental results on IAM and HWDB datasets show that our models achieve high identification accuracy: 99.01% on 301 writers and 97.03% on 657 writers with one English sentence input, 93.85% on 300 writers with one Chinese character input, which outperform previous methods with a large margin. Moreover, our models obtain accuracy of 98.01% on 301 writers with only 4 English alphabets as input.
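The patch-scanning idea mentioned in the abstract, turning a text image of arbitrary length into a batch of fixed-size CNN inputs, can be illustrated in a few lines of NumPy. The patch size and stride below are assumptions for illustration, not the paper's actual settings:

```python
import numpy as np

def scan_patches(image, patch_size=64, stride=32):
    """Slide a square window over a grayscale image (2-D array) and
    collect fixed-size patches, so a text line of any length maps to
    a variable number of identically shaped network inputs."""
    h, w = image.shape
    patches = []
    for top in range(0, max(h - patch_size, 0) + 1, stride):
        for left in range(0, max(w - patch_size, 0) + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)

# A fake 64x256 "text line" yields 7 overlapping 64x64 patches at stride 32.
line = np.random.rand(64, 256)
patches = scan_patches(line)
print(patches.shape)  # (7, 64, 64)
```

Each patch is then scored by the network, and the per-patch predictions can be aggregated (e.g., averaged) into one writer prediction for the whole line.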

Siamese networks are used to compare things like signatures; it seems reasonable to try to extend this method to handwriting analysis. One challenge is that whereas signatures are somewhat like "stamps," in the sense that the writer wants to reproduce the same symbol over and over, two handwriting samples might not contain the same words and phrases. So the success or failure of the project hinges on whether the neural network can recognize the writing style as distinct from the words.

Jane Bromley, Isabelle Guyon, Yann LeCun, Eduard Säckinger and Roopak Shah. "Signature Verification using a 'Siamese' Time Delay Neural Network." AT&T Bell Labs. 1994.

This paper describes an algorithm for verification of signatures written on a pen-input tablet. The algorithm is based on a novel, artificial neural network, called a "Siamese" neural network. This network consists of two identical sub-networks joined at their outputs. During training the two sub-networks extract features from two signatures, while the joining neuron measures the distance between the two feature vectors. Verification consists of comparing an extracted feature vector with a stored feature vector for the signer. Signatures closer to this stored representation than a chosen threshold are accepted, all other signatures are rejected as forgeries.
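As a rough sketch of the verification scheme described above: the two identical sub-networks are just one set of shared weights applied to both inputs, and the joining layer is a distance compared against a threshold. Here an untrained random projection stands in for the trained time-delay network, and the threshold is an arbitrary illustrative value:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared weights: the SAME projection processes every input, which is
# what makes the two sub-networks "Siamese". A real system would use a
# trained time-delay network or CNN here.
W = rng.normal(size=(16, 256))

def features(x):
    return np.tanh(W @ x)  # feature vector from one sub-network

def verify(sample, stored_features, threshold=2.0):
    """Accept the sample if its features lie within `threshold` of the
    signer's stored feature vector; otherwise reject it as a forgery."""
    return float(np.linalg.norm(features(sample) - stored_features)) < threshold

genuine = rng.normal(size=256)   # enrolled genuine signature (as a raw input)
stored = features(genuine)       # stored feature vector for the signer

print(verify(genuine + 0.01 * rng.normal(size=256), stored))  # slight variation: True
print(verify(-genuine, stored))                               # very different input: False
```

For handwriting rather than signatures, the same architecture would be trained on same-writer/different-writer pairs so that the distance reflects writing style rather than content.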

Another approach is to use an embedding strategy such as the one used in FaceNet. You then compare the embeddings by some distance measure to decide whether two images have the same or different authors. Its success on faces taken from different angles and under different lighting conditions is promising, and that setting is perhaps a better fit for matching handwriting samples.

Florian Schroff, Dmitry Kalenichenko, James Philbin. "FaceNet: A Unified Embedding for Face Recognition and Clustering." 2015.

Despite significant recent advances in the field of face recognition, implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors. Our method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. To train, we use triplets of roughly aligned matching / non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128-bytes per face. On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%. On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate in comparison to the best published result by 30% on both datasets. We also introduce the concept of harmonic embeddings, and a harmonic triplet loss, which describe different versions of face embeddings (produced by different networks) that are compatible to each other and allow for direct comparison between each other.
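The triplet loss that FaceNet optimizes is simple to state: pull the anchor toward a positive example (same identity) and push it at least a margin farther from a negative example (different identity). Below is a minimal NumPy version; the toy 3-dimensional embeddings and margin value are illustrative assumptions (the paper uses 128-dimensional embeddings):

```python
import numpy as np

def normalize(v):
    # FaceNet constrains embeddings to the unit hypersphere.
    return v / np.linalg.norm(v)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero when the anchor is already `margin` closer (in squared L2
    distance) to the positive than to the negative; positive otherwise."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(d_pos - d_neg + margin, 0.0)

a = normalize(np.array([1.0, 0.0, 0.0]))   # anchor sample
p = normalize(np.array([0.9, 0.1, 0.0]))   # same identity (same writer)
n = normalize(np.array([0.0, 1.0, 0.0]))   # different identity (different writer)

print(triplet_loss(a, p, n))  # satisfied triplet: loss is 0.0
```

For handwriting, "identity" would be the writer: triplets are two patches from the same writer plus one from a different writer, and verification then reduces to thresholding the distance between two embeddings.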
