Scan as text

agantuk

I'm looking for good OCR software. I have lots of pages to scan and need to save them as text instead of images. The pages contain only standard fonts.

Can someone suggest a good FREE OCR, if one is available? If not, good commercial ones, but I will have to get the *ahem* versions for those :)
 
Thanks for the list. I haven't tried these yet, only SimpleOCR, and the results aren't that great.

Which of the above would you recommend?
 
TopOCR has been working fine for me. SimpleOCR, FreeOCR, TopOCR ...

All of these work with similar technology. Perhaps you need ABBYY FineReader :)
 
Just tried out FreeOCR and TopOCR.

FreeOCR is average at best. Even standard pages seem to be a tough job for it.

However, TopOCR is awesome. Text-only pages come out fantastic. Even on pages filled with technical material containing non-standard English words, it does a very good job.

Thanks a lot :)

I have started downloading the *ahem* version of ABBYY. Let me see how it does. Will keep you posted!
 
Most of the time, ABBYY FineReader is better than Acrobat Pro's built-in OCR ...

It should suit your requirements :) There should be no need for the other OCRs!
 
^ I will, once I am done with my scans :)

Just finished with some of the pages. Used ABBYY this time round and am pretty impressed.

Some of the material I was scanning has Java code in it, and the rest is plain English. The plain English pages came out great. The ones with code weren't bad either. I didn't have to edit much in the final document. Attached are the original and scanned versions.

Set 1: Text + code

Original: 4shared.com - document sharing - download 01.pdf

Scanned text: 4shared.com - document sharing - download Part 01.pdf

Set 2: Text + tables

Original: http://www.4shared.com/document/S4BjgVDT/04_online.html

Scanned text: 4shared.com - document sharing - download Part 04.pdf
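
If anyone wants to put a rough number on "didn't have to edit much", here is a minimal sketch in Python. It assumes you have a hand-corrected copy of a page and the raw OCR output as plain strings; the function name `ocr_similarity` is made up for illustration and is not part of any of the OCR tools mentioned in this thread.

```python
import difflib


def ocr_similarity(reference: str, ocr_text: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 (hypothetical helper)."""
    # Collapse all whitespace runs so that line-wrapping differences
    # between the two texts are not counted as OCR errors.
    ref = " ".join(reference.split())
    ocr = " ".join(ocr_text.split())
    # autojunk=False keeps the heuristic for long inputs from skipping
    # frequently repeated characters.
    matcher = difflib.SequenceMatcher(None, ref, ocr, autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    corrected = "public static void main(String[] args)"
    raw_ocr = "pub1ic static vo1d main(String[] args)"
    print(f"similarity: {ocr_similarity(corrected, raw_ocr):.3f}")
```

A ratio close to 1.0 means little manual cleanup was needed. This is only a character-level comparison, so treat it as a rough guide rather than a proper accuracy benchmark.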
 
^ As stated in my previous post, I would go with ABBYY. It is professional software and does an extremely good job. Recommended if you have composite documents: pages with text and tables, XML, or other non-standard content.

For plain English documents, TopOCR would do the job well.
 
I installed the 'trojan' version. The trojan thing is BS; all sorts of cracks cause the AVs to go berserk with alarms. I haven't been impacted so far, though.
 