Questions tagged [optical-character-recognition]

Optical character recognition (also optical character reader, OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example from a television broadcast).

23 questions
15
votes
3 answers

State-of-the-art ensemble learning algorithm in pattern recognition tasks?

The structure of this question is as follows: first, I introduce the concept of ensemble learning; next, I provide a list of pattern recognition tasks; then I give examples of ensemble learning algorithms; and finally, I pose my question.…
9
votes
1 answer

Can deep learning determine if two samples of handwriting are by the same person?

I have dabbled using Tesseract CNN OCR on handwriting records before and was surprised by the accuracy. I am wondering, is it possible to use it, or something else, to determine if a sample of handwriting is written by the same person? I have…
4
votes
1 answer

Is CTC Loss function right for License Plate Recognition?

I trained a CNN model for license plate recognition using stacked LSTM and convolutional layers, but I got stuck at 88% accuracy. (This accuracy is for the whole license plate, not a single character.) For training my model I used categorical cross…
4
votes
0 answers

Train Neural Network For Handwritten Chinese Characters

The article here: http://novanoid.github.io/2014/09/26/training-a-neural-network-to-recognize-handwritten-digits/ discusses and implements a way to recognize handwritten digits. For images with a quality of 256 square pixels and an output vector of…
4
votes
1 answer

How many samples do I need for OCR problems?

I am thinking about collecting samples of handwritten digits (0 to 9) from people. I'll try to test different algorithms for optical character recognition, perhaps some form of neural network and random forest. I have planned to collect 20 entries…
3
votes
1 answer

Can a CNN detect text at an arbitrary position in an image?

My task is this: there is some text at some position (left, right, top, bottom center, etc.) in an image. The style (including size, orientation, font, etc.) of the text is arbitrary, and the content length is arbitrary too. I want to train a classifier to…
3
votes
0 answers

How is prior knowledge of letter/word patterns incorporated into handwriting (or speech) recognition?

Using handwriting recognition as an example, we can train various models to recognise individual characters but to actually be useful we must incorporate prior knowledge of common character sequences, words and word sequences. How is this generally…
3
votes
0 answers

How can you use HMMs and ANNs for on-line handwriting recognition?

I've asked this question on cs.stackexchange before. It has a bounty there with 20 hours remaining. On-line handwriting recognition is the task of converting a series of $(x(t),y(t))$ coordinates to symbols and words. In contrast to off-line…
2
votes
0 answers

Avoiding OCR performance coupling to upstream Bounding Box model

I have a model pipeline where I first use an object detection deep learning model to locate text regions in images of natural scenery (i.e. outdoor images), and then send the cropped region to a deep learning OCR model to read the text. Both models…
2
votes
2 answers

KNN outperforms CNN

Disclaimer: I am a programmer by trade, not a statistician, so please cater to my ignorance when explaining things, and I apologize now if I make any incorrect assumptions. Please consider the following problem: I am currently attempting to build an…
2
votes
0 answers

Simple OCR over individual words from a fixed dictionary

I have a series of images, each containing a single word from a known dictionary of 2048 words. The size, font, and position of the word are known ahead of time, and I simply need to tell which word from the dictionary is in the image. Standard OCR…
2
votes
2 answers

How to remove the horizontal bar in a Hindi word in MATLAB

I wish to remove the horizontal bar (Shirorekha) from the word in the following image, to get the individual characters for character recognition. Any ideas as to how I can do that? I tried using the Hough transform to detect lines and then looked for the longest line. Blue…
1
vote
0 answers

If I know a specific pair of characters that the model confuses in an OCR task, how can I fix it?

I am training an OCR model to recognize Cyrillic handwritten text. I know, for example, that it very often confuses 'Б' with '6'. How can I use this information to fine-tune the model? Just in case, my architecture is ResNet50 + transformer. UPD: train data…
1
vote
1 answer

Incorrect predictions on extracted images from text

I trained a model in PyTorch on the EMNIST data set - and got about 85% accuracy on the test set. Now, I have an image of handwritten text from which I have extracted individual letters, but I'm getting very poor accuracy on the images that I have…
1
vote
1 answer

What NN architecture to use for document OCR?

I recently got interested in document OCR and would like to gather some opinions on what NN to use. I wonder if there are any proven examples that I can exploit? I have heard that CNN+LSTM+CTC is good as an end-to-end model, but it's not easy to…