OCR works best on black-on-white (binary) images in which the letters have sharp contours, no noise, and a solid fill, i.e. nothing like this:

[image: part of a document number]

For the curious, it is part of an embossed document number. Applying an unsharp mask, thresholding, morphological closing, and median filtering already improves it a bit, but the result still cannot be processed properly by OCR.
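
For reference, that chain of operations can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code actually used on the image: the 3x3 window, the box blur standing in for the Gaussian blur normally used in an unsharp mask, the threshold level of 128, the choice of bright pixels as foreground, and all function names are assumptions.

```python
from statistics import median

def neighbourhood(img, y, x):
    """Return the 3x3 neighbourhood of (y, x), clamping at the image borders."""
    h, w = len(img), len(img[0])
    return [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def filter3x3(img, reduce):
    """Apply `reduce` to every 3x3 neighbourhood and return a new image."""
    return [[reduce(neighbourhood(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]

def unsharp_mask(img, amount=1.0):
    """Sharpen: original + amount * (original - box blur), clipped to 0..255."""
    blurred = filter3x3(img, lambda n: sum(n) / len(n))
    return [[min(255, max(0, round(p + amount * (p - b))))
             for p, b in zip(row, blurred_row)]
            for row, blurred_row in zip(img, blurred)]

def threshold(img, level=128):
    """Binarise: pixels at or above `level` become 255, the rest 0."""
    return [[255 if p >= level else 0 for p in row] for row in img]

def closing(img):
    """Morphological closing: dilation (max) followed by erosion (min)."""
    return filter3x3(filter3x3(img, max), min)

def median_filter(img):
    """Replace each pixel by the median of its 3x3 neighbourhood."""
    return filter3x3(img, median)

def preprocess(img, level=128):
    """Unsharp mask, then threshold, then closing, then median filter."""
    return median_filter(closing(threshold(unsharp_mask(img), level)))
```

Here `img` is a greyscale image given as a list of rows of integers in the range 0..255. In practice an image library would do this far faster; the sketch only spells out the order of the steps.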

[image: result after the operations above]

What other operations could bring the image to a preferable state, like this manually edited one?

[image: manually edited version]

Rajish
  • aren't you answering your own question? *What did **you** do to achieve that third image?* – Marcus Müller May 25 '18 at 11:31
  • Like I said, I manually edited the image with GIMP – Rajish May 25 '18 at 11:32
  • but *what* did you do in GIMP? – Marcus Müller May 25 '18 at 11:33
  • bucket fill and erase tool on the image in the middle. The operations to get the middle image are mentioned in the question. – Rajish May 25 '18 at 11:34
  • So what about implementing "bucket fill" to fill the outer areas with black, then bucket-filling all areas "touching" the black with green? Then just select "green". It's what you've done already, but automated. You already work like an algorithm! You just need to learn to write that algorithm down. – Marcus Müller May 25 '18 at 12:21
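
The bucket-fill idea from the last comment could be sketched along these lines. This is only one possible reading of the suggestion, assuming a binary image (0 = background, 1 = foreground) and 4-connectivity. The names `bucket_fill` and `select_letters` and the label strings are made up for the illustration.

```python
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (an assumption)

def bucket_fill(img, labels, start, label):
    """Label the connected region of equal pixel value around `start`.

    Returns the list of (y, x) pixels in the region.
    """
    h, w = len(img), len(img[0])
    sy, sx = start
    value = img[sy][sx]
    labels[sy][sx] = label
    region, queue = [start], deque([start])
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w
                    and labels[ny][nx] is None and img[ny][nx] == value):
                labels[ny][nx] = label
                region.append((ny, nx))
                queue.append((ny, nx))
    return region

def select_letters(img, background=0):
    """Return a 0/1 mask of the regions that touch the outer background."""
    h, w = len(img), len(img[0])
    labels = [[None] * w for _ in range(h)]
    # Step 1: fill the background reachable from the image border ("black").
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and labels[y][x] is None and img[y][x] == background:
                bucket_fill(img, labels, (y, x), "outer")
    # Step 2: fill every remaining region; mark it "green" only if it
    # touches the outer area, otherwise mark it "other".
    for y in range(h):
        for x in range(w):
            if labels[y][x] is None:
                region = bucket_fill(img, labels, (y, x), "tmp")
                touches = any(
                    0 <= py + dy < h and 0 <= px + dx < w
                    and labels[py + dy][px + dx] == "outer"
                    for py, px in region for dy, dx in NEIGHBOURS)
                for py, px in region:
                    labels[py][px] = "green" if touches else "other"
    # Step 3: select "green".
    return [[1 if labels[y][x] == "green" else 0 for x in range(w)]
            for y in range(h)]
```

With this reading, specks enclosed inside a letter are dropped because they never touch the outer area, while specks lying in the outer background would still be selected and would need a separate size filter (the equivalent of the erase tool).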
