#StackBounty: #dataset #ocr #tesseract Resources containing OCR benchmark test-sets for free

Bounty: 200

I want to run an OCR benchmark on scanned text (typical page scans, e.g. A4). I found some NEOCR datasets here, but NEOCR is not really what I want.

I would appreciate links to free datasets that contain suitable scanned images together with the reference (ground-truth) text of each image.
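To make clear why the reference text matters: a benchmark of this kind usually compares the OCR output with the ground-truth text of each image. Below is a minimal sketch of one such comparison, assuming character error rate (edit distance divided by reference length) is the metric of interest. The function names are hypothetical and chosen only for illustration. The OCR output string would come from whatever engine is being tested, e.g. Tesseract.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ground_truth: str, ocr_output: str) -> float:
    """Edit distance normalised by the length of the ground-truth text.
    (Assumed metric; a word-level rate could be computed the same way.)"""
    if not ground_truth:
        raise ValueError("ground truth text must not be empty")
    return edit_distance(ground_truth, ocr_output) / len(ground_truth)
```

With a dataset of image and text pairs, this rate could be averaged over all pages to compare engines or settings.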

I hope this thread will also be useful to other people searching for OCR datasets, since I did not find any good reference to such sources.

