OCR Software Reviews
Paper hasn't gone away. You've probably noticed that even in the digital era you still have stacks of hard-copy printouts, books, magazines, newspaper clippings, invoices, bills, and other paper that you have to search through by hand, one page at a time. Or you need to get an old essay that you typed or printed years ago into digital format, and you're dreading retyping it. This is where OCR (Optical Character Recognition) software becomes more of a necessity than a luxury. OCR creates searchable, editable text from printed documents, and also from photos of printed documents or PDFs made from scanning old books and papers. The more paper documents you have, the more you need OCR.
When to OCR
You use OCR for two basic functions: archiving documents or repurposing documents. For archiving, you'll typically feed your documents (receipts, business cards, handouts, or anything else) into your scanner and let your OCR software create searchable PDF files that show a scanned image of the original document but also contain—hidden underneath the scanned image—text that you can copy from the PDF and paste into other applications, or that you can search for when you need to find the original.
For repurposing, OCR typically converts a printed table into an Excel spreadsheet, or an old book either into a PDF with searchable text hidden under the page images or into a word-processing document that you can edit and reuse. High-powered OCR software can also convert printed text into HTML files that anyone can view in a browser.
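The "hidden text" idea can be sketched with hOCR, one of the output formats listed for several engines in the comparison table further down. hOCR stores each recognized word together with its bounding box on the page image, which is what lets a search hit be mapped back to a spot on the scan. The sketch below is an illustration only: the sample markup is hand-written rather than real engine output, and the names `parse_bbox`, `WordLayer`, `extract_words`, and `find_word` are hypothetical helpers, not part of any product reviewed here.

```python
from html.parser import HTMLParser

# Hand-written sample in the hOCR style (an assumption, not real engine output):
# each word is a span of class "ocrx_word" whose title carries its bounding box.
SAMPLE_HOCR = """
<div class='ocr_page' title='bbox 0 0 600 800'>
 <span class='ocr_line' title='bbox 36 92 300 116'>
  <span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Invoice</span>
  <span class='ocrx_word' title='bbox 110 92 180 116; x_wconf 91'>total</span>
 </span>
</div>
"""


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from a title like 'bbox 36 92 96 116; x_wconf 95'."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class WordLayer(HTMLParser):
    """Collect (text, bbox) pairs for every word span in an hOCR document."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._open = False   # True while inside a word span
        self._bbox = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._open = True
            self._bbox = parse_bbox(attr.get("title") or "")
            self._chunks = []

    def handle_data(self, data):
        if self._open:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            self.words.append(("".join(self._chunks).strip(), self._bbox))
            self._open = False


def extract_words(hocr):
    """Parse hOCR markup and return a list of (text, bbox) tuples."""
    parser = WordLayer()
    parser.feed(hocr)
    parser.close()
    return parser.words


def find_word(words, query):
    """Case-insensitive search; returns the page position of every match."""
    q = query.casefold()
    return [bbox for text, bbox in words if text.casefold() == q]


if __name__ == "__main__":
    layer = extract_words(SAMPLE_HOCR)
    print(find_word(layer, "total"))
```

A searchable PDF works on the same principle: the page image is what you see, and a word list with positions like this one is what the search box consults.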
Choosing OCR Software
When you choose an OCR app, you'll want to decide whether you want it to run automatically, interactively, or a combination of both. When an OCR app runs automatically, all you do is click a button, walk away, and come back to find your output files already created. When it runs interactively, you typically use image-enhancement tools to straighten or sharpen an image, layout tools to block out parts of a page that you don't want in the output, and then a proofreading tool to correct any misreadings by the software. Most apps let you choose between automation and interaction by giving you a set of interactive tools and letting you decide which ones to use. But read our reviews to see how much freedom of choice you get with each individual app.
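One way to picture the combination of automatic and interactive operation is a confidence threshold: words the engine is sure about pass straight through, and doubtful ones are queued for the proofreading step. The sketch below assumes the engine reports a per-word confidence score from 0 to 100. The `Word` type, the `triage` function, and the default threshold of 80 are hypothetical choices for illustration, not the behavior of any app reviewed here.

```python
from typing import List, NamedTuple, Tuple


class Word(NamedTuple):
    text: str
    confidence: float  # assumed scale: 0 (no idea) to 100 (certain)


def triage(words: List[Word], threshold: float = 80.0) -> Tuple[List[Word], List[Word]]:
    """Split OCR output into words accepted automatically and words flagged for review."""
    accepted = [w for w in words if w.confidence >= threshold]
    flagged = [w for w in words if w.confidence < threshold]
    return accepted, flagged


if __name__ == "__main__":
    page = [Word("Invoice", 96.0), Word("t0tal", 41.5), Word("due", 88.0)]
    accepted, flagged = triage(page)
    print("accepted:", [w.text for w in accepted])
    print("needs review:", [w.text for w in flagged])
```

With a threshold of 0 nothing is flagged, which corresponds to fully hands-off operation; raising the threshold sends more words to the proofreader.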
Behind the Scenes
Every OCR app is built on a character-recognition engine that does the grunt work of converting images into text. The fanciest interface can't make up for the limits of a recognition engine that isn't consistently accurate, and it's no accident that our Editors' Choice products have the strongest available recognition engines.
Featured OCR Software Reviews:
ABBYY FineReader 11 Review
MSRP: $280.00
Pros: Powerful, flexible OCR software, smoothly automated for high-volume and hands-off operations, with precision correction tools for difficult tasks. The superb Verification tool makes it easy to correct doubtful readings by comparing OCR text to the original.
Cons: Some advanced options menus could use better explanations.
Bottom Line: The highest-power OCR software on the market, indispensable for anyone who needs fast, accurate text-recognition.
ABBYY FineReader Express Edition for Mac Review
MSRP: $99.99
Pros: The most accurate OCR engine available, in the simplest possible OCR interface. One-click conversion of scanned images or image files into text, worksheet, HTML, or searchable PDF output.
Cons: No editor inside the app for correcting OCR errors or adjusting images. No support for scanners connected through a wireless network.
Bottom Line: Despite the lack of a built-in editor or image-correction tools, still the best OCR available on the Mac.
ABBYY FineReader Touch (for iPhone) Review
MSRP: $2.99
Pros: Lets you image documents and save them to searchable, editable form. Converts saved documents as well. Good overall OCR quality.
Cons: Only for recent iPhones, iPads, and iPods touch. Good OCR quality requires good lighting and document positioning.
Bottom Line: ABBYY FineReader Touch (for iPhone) lets you image documents with an iPhone and save them through the cloud to searchable, editable text.
OmniPage Ultimate Review
MSRP: $499.99
Pros: Powerful OCR software with fine-tuned automation for high-volume corporate OCR tasks. Interface includes direct input from Dropbox, SharePoint, and other cloud services. Excellent text-to-speech module.
Cons: Confusing and inconsistent interface.
Bottom Line: Exceptionally high-powered OCR, with a seemingly unlimited range of features, but with a flawed interface.
Prizmo (for Mac) Review
MSRP: $49.95
Pros: Flexible, up-to-date app. OCR for photos or scanned images. Captures photos taken from an iPhone or iPod connected to a Mac. Many options for image adjustments. Can extract text from images in any OS X app.
Cons: Comparatively weak OCR engine. Slightly overcomplex and underdocumented workflow.
Bottom Line: Prizmo is a terrific app for performing OCR on iPhone photos, but it has a far less effective OCR engine than ABBYY FineReader Express.
This comparison of optical character recognition software includes:
- OCR engines that do the actual character identification
- Layout analysis software that divides scanned documents into zones suitable for OCR
- Graphical interfaces to one or more OCR engines
- Software development kits that are used to add OCR capabilities to other software (e.g. forms processing applications, document imaging management systems, e-discovery systems, records management solutions)
Name | Founded year | Latest stable version | Release year | License | Online | Windows | Mac OS X | Linux | BSD | Programming language | SDK? | Languages | Fonts | Output Formats | Notes |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Google Drive OCR or Google Cloud Vision | | | 2015 | Free | Yes | Browser | Browser | Browser | Unknown | Unknown | Yes | 200+ | All fonts | text | Google blog post [1][2] |
Tesseract | 1985 | 4.0.0 | 2018 | Apache | No | Yes | Yes | Yes | Yes | C++, C | Yes | 100+[3] | Any printed font | Text, hOCR,[4] PDF, others with different user interfaces[5] or the API | Created by Hewlett-Packard; under further development by Google[6] |
Readiris | 1986 | 16 | ? | Proprietary | ? | Yes | Yes | ? | ? | ? | Yes | 100+[7] | ? | ? | Owned by Canon |
CIB OCR[8] | 2011 | 2.08.00 | 2018 | Freeware | Yes[9] | Yes | Yes | Yes | Yes | C++, Java, Python, Objective-C | Yes | German, English, Spanish, Russian, Chinese, Japanese, Italian, French | Any printed font | Text, hOCR, PDF | CIB OCR supports more than 160 input formats |
Screenworm | 2013 | 1.0 | 2014 | Proprietary | No | No | Yes | No | No | Objective-C++ | No | 57 | ? | TXT | Product of Funchip. Uses the Tesseract OCR-engine. |
ExperVision[10] TypeReader & RTK | 1987 | 7.1.170.1125 | 2010 | Proprietary | Yes | Yes | Yes | Yes | Yes | C/C++ | Yes | 21 | 2618 | | Has a Mobile and Embedded System version for iOS/Android/etc. |
AliusDoc AD-SCI[11] | 2005 | 2.1 | 2015 | Proprietary | No | Yes | No | No | No | VB.Net | For Extensions | All ASCII-compatible languages | ? | XML, PlainText, any other thru SDK extensions | Minimal need for post-sale Professional Services. Works with structured, semi-structured, and unstructured documents. |
ABBYY FineReader | 1989 | 14 | 2017-01-25 | Proprietary | Yes | Yes | Yes | Yes | Yes | C/C++ | Yes | 192[12] | ? | DOC, DOCX, XLS, XLSX, PPTX, RTF, PDF, HTML, CSV, TXT, ODT, DjVu, EPUB, FB2[13] | ABBYY also supplies SDKs for embedded and mobile devices. Professional, Corporate and Site License Editions for Windows, Express Edition for Mac.[14] |
E-aksharayan | 2010 | Yes | No | Yes | No | 14 | RTF, TXT, BRL | ||||||||
Asprise OCR SDK | 1998 | 15 | 2015 | Proprietary | Yes | Yes | Yes | Yes | Yes | Java, C#,VB.NET, C/C++/Delphi | Yes | 20+[15] | ? | Plain text, searchable PDF, XML[16] | Java, C#, VB.NET, C/C++/Delphi SDKs for OCR and Barcode recognition on Windows, Linux, Mac OS X and Unix.[17] |
Nicomsoft OCR SDK | 1999 | 5.5 | 2015 | Proprietary | No | Yes | No | Yes | No | C#, VB.NET, C++, Delphi, Java | Yes | 25+[18] | ? | Searchable PDF, Text, RTF | C#, VB.NET, C++, Delphi, Java OCR tool for Windows and Linux.[19] |
AnyDoc Software | 1989 | ? | ? | Proprietary | No | Yes | No | No | No | VBScript | ? | ? | ? | | Works with structured, semi-structured, and unstructured documents. |
LEADTOOLS[20] | 1990[21] | 19.0 | 2014 | Proprietary | Yes | Yes | Yes | Yes | No | C/C++, .NET, Objective-C, Java, JavaScript | Yes | 56[22] | Any printed font | PDF, PDF/A, DOC, DOCX, XLS, XPS, RTF, HTML, ANSI Text, Unicode Text, CSV[23] | Supports Latin, Asian, Arabic, and MICR character sets.[20] For full page, zonal, and form image processing. Includes OCR, barcode, OMR and forms recognition.[24] ICR (handwritten text recognition) is supported.[25] |
CuneiForm | 1996 | 1.1 | 2011-04-19 | BSD variant | No | Yes | Yes | Yes | Yes | C/C++ | Yes | 28 | Any printed font | HTML, hOCR, native, RTF, TeX, TXT[26] | Enterprise-class system, can save text formatting and recognizes complicated tables of any structure |
OCR.space | 2015 | 3.02 | 2017 | GPL | Yes | Yes | No | No | No | C# | Yes | 23 | Any printed font | TXT | Windows desktop software, Windows Store application and online web app - converts scanned documents to editable text documents using OCR. |
SimpleOCR | 2002 | 3.5 | 2008 | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | ||
Dynamsoft OCR SDK | 2003 | 8.2 | 2012 | Proprietary | Yes | Yes | No | No | No | C/C++ | Yes | 40+[27] | ? | PDF, TXT | |
OmniPage | 1970s | 19.2 | 2015 | Proprietary | Yes | Yes | Yes | Yes | No | C/C++, C#[28] | Yes | 125[29] | Machine and handprinted fonts | DOC/DOCX XLS/XLSX PPTX RTF PDF PDF/A Searchable PDF HTML Text XML ePUB MP3 | Product of Nuance Communications |
Microsoft Office OneNote 2007 | 2011 | ? | 2007 | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | ||
FreeOCR | ? | 4.2 | August 2012 | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | | [30] |
gImageReader[31] | 2009 | 3.2.99 | 2017-07 | GPL | No | Yes | Yes | Yes | No | C++ | ? | 100+ | Any printed font | TXT, PDF, hOCR | uses Tesseract OCR engine |
GOCR | 2000 | 0.52[32] | 2018-10-15 | GPL | Yes[33] | Yes | Yes | Yes | Yes | C | ? | 20+ | ? | ||
Ocrad | ? | 0.26[34] | 2017-03-31 | GPL | Yes | Yes | Yes | Yes | Yes | C++ | Yes | Latin alphabet | ? | Command line | |
SmartScore | 1991 | 10.5.8 | 2015-07 | Proprietary | No | Yes | Yes | No | No | ? | ? | ? | ? | | For musical scores |
Microsoft Office Document Imaging | ? | Office 2007 | 2007 | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | | Uses OmniPage[citation needed] |
OCR.net | 2016 | ? | 2016 | Proprietary | Yes | No | No | No | No | Java, C++, PHP, Objective-C | No | 100+ | ? | TXT, Searchable PDF | Online service powered by PDF OCR X for conversions. |
PDF OCR X | 2008 | 3.0.11 | 2018 | Proprietary | No | Yes | Yes | No | No | Java, C++, Objective-C | No | 100+ | ? | TXT, Searchable PDF | Drag and drop UI. |
Puma.NET | ? | ? | 2009-10-29 | BSD | No | Yes | No | No | No | C# | Yes | 28 | Any printed font | | .NET OCR SDK based on Cognitive Technologies' CuneiForm recognition engine. Wraps Puma COM server and provides simplified API for .NET applications. |
ReadSoft | ? | ? | ? | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | | Scan, capture and classify business documents such as invoices, forms and purchase orders integrated with business processes. |
Scantron | ? | ? | ? | Proprietary | No | Yes | No | No | No | ? | ? | ? | ? | | For working with localized interfaces, corresponding language support is required. |
OCRFeeder | 2009-03 | 0.8.1 | 2014-12-22 | GPL | No | No | No | Yes | No | Python | ? | ? | ? | | Features a full user interface and has a command-line tool for automatic operations. Has its own segmentation algorithm but uses system-wide OCR engines like Tesseract or Ocrad. |
OCRopus | 2007 | 1.3.3 | 2017-12-16 | Apache | No | No | Yes | Yes | Yes | Python | ? | All languages using Latin script (other languages can be trained) | Normal Latin script and Fraktur (other scripts can be trained) | TXT, hOCR[35], PDF[36] | Pluggable framework under active development, used for Google Books |
MathOCR | 2014 | 0.0.3 | 2015 | GPL | No | Yes | Yes | Yes | Yes | Java | ? | ? | ? | HTML, LaTeX | Features mathematical formula recognition and logical layout analysis, can use OCR engines like Tesseract or Ocrad as back-end. |
MeOCR | 2012 | 1.0.0 | 2012 | Freeware | No | Yes | No | No | No | C/C++/C# | Yes | 28 | Any printed font | HTML, hOCR, native, RTF, TeX, TXT | Windows application. Converts scanned documents to editable text documents using OCR and exports them to Microsoft Word with one click. Features a full user interface and also has a .NET Interface library[37] for developers. |
Yunmai OCR SDK | 2002 | 1.0 | 2013 | Proprietary | Yes | Yes | Yes | Yes | Yes | Java, C++, C, object pascal, objective-C | Yes | 14 | Any printed font | TXT, PDF | Has the advantage of Chinese characters recognition.[38] |
Anyline SDK | 2013[39] | 3.5.1[40] | 2016[40] | Free non-commercial use[41] | No | No* | No* | No* | No* | Java (Android), Objective-C & Swift (iOS), C# (Windows Phone, Xamarin), JavaScript (Cordova)[42] | Yes[41] | 2 (German, English) | Any printed trainable font[43] | Plain text, verification image | *Customizable mobile OCR SDK for Android, iOS, Windows Phone, Smart glasses (Google Glass, Epson Moverio,...) |
Evaluation
An analysis of the accuracy and reliability of four OCR packages (Google Docs OCR, Tesseract, ABBYY FineReader, and Transym), using a dataset of 1227 images from 15 different categories, concluded that Google Docs OCR and ABBYY FineReader performed better than the others.[44]
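To make "accuracy" concrete, a common way to score OCR output is character accuracy: one minus the edit distance between the recognized text and a hand-checked reference, divided by the reference length, then averaged per document category. The sketch below shows that calculation under stated assumptions. It is not the metric or code of the cited study, whose method is not described here, and the function names and sample strings are invented for illustration.

```python
from collections import defaultdict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def char_accuracy(truth: str, ocr: str) -> float:
    """1.0 means a perfect match; clamped at 0.0 for very noisy output."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1.0 - edit_distance(truth, ocr) / len(truth))


def accuracy_by_category(samples):
    """samples: iterable of (category, ground_truth, ocr_output) tuples."""
    scores = defaultdict(list)
    for category, truth, ocr in samples:
        scores[category].append(char_accuracy(truth, ocr))
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    samples = [
        ("receipt", "total", "total"),
        ("receipt", "abcd", "abxd"),
        ("license plate", "AB12", "A812"),
    ]
    for category, score in accuracy_by_category(samples).items():
        print(f"{category}: {score:.3f}")
```

Grouping by category matters because an engine that reads clean book pages well may still do poorly on receipts or photos, and a single overall average would hide that.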
References
1. Dmitriy Genzel; Ashok Popat (May 6, 2015). 'Paper to Digital in 200+ languages'.
2. Ashok Popat (Sep 4, 2015). 'IEEE SPS: Optical Character Recognition for Most of the World's Languages'.
3. Based on count of language training files for version 3.04. Available at the download page.
4. Usage explained in the Tesseract Readme and FAQ.
5. Such as ODF with OCRFeeder.
6. 'GitHub - tesseract-ocr/tesseract: Tesseract Open Source OCR Engine (main repository)'. Retrieved 2018-11-05.
7. http://www.irislink.com/EN-GB/c1462/Readiris-16-for-Windows---OCR-Software.aspx
8. 'CIB ocr'. cib.de. 2018-10-01. Retrieved 2018-10-01.
9. 'CIB doXiview'. cib.de. 2018-10-01. Retrieved 2018-10-01.
10. 'OpenRTK – ExperVision OCR SDK OCR Software, OCR SDK & Toolkit, OCR Service – ExperVision OCR'. Expervision.com. Retrieved 2013-09-12.
11. 'AliusDoc AD-SCI'. AliusDoc.com. Retrieved 2015-10-16.
12. 'ABBYY FineReader 14: Technical Specifications'. Finereader.abbyy.com. Retrieved 2017-02-23.
13. 'ABBYY FineReader 11: Technical Specifications'. Finereader.abbyy.com. Retrieved 2013-09-12.
14. 'Top OCR Software'. Ocrworld.com. 2010-03-30. Retrieved 2013-09-12.
15. 'Asprise OCR SDK Features'. asprise.com. Retrieved 2014-06-21.
16. 'Asprise Java OCR Library Features'. asprise.com. Retrieved 2014-06-21.
17. 'Asprise Java, C#/VB.NET OCR API'. asprise.com. 2015-11-19. Retrieved 2015-11-19.
18. 'Nicomsoft OCR SDK Features'. nicomsoft.com. Retrieved 2015-01-08.
19. 'Nicomsoft OCR, C#/VB.NET OCR API'. nicomsoft.com. 2015-01-08. Retrieved 2015-01-08.
20. 'Ocr Sdk'. Leadtools. Retrieved 2013-09-12.
21. 'LEAD Technologies, Inc. Corporate Information'. Leadtools.com. Retrieved 2013-09-12.
22. 'Ocr Sdk'. Leadtools. Retrieved 2013-09-12.
23. 'OCR SDK Output Formats'. Leadtools. Retrieved 2013-09-12.
24. 'LEADTOOLS Recognition Imaging Developer Toolkit'. Leadtools.com. Retrieved 2013-09-12.
25. 'Icr Sdk'. Leadtools. Retrieved 2013-09-12.
26. Debian manual page for Cuneiform for Linux version 1.1.0.
27. 'OCR SDK Language Packages Download'. Dynamsoft.com. Retrieved 2013-09-12.
28. 'OmniPage CSDK - OCR Document Capture Toolkit Document Imaging & OCR'. Nuance. Retrieved 2013-09-12.
29. 'OmniPage Standard Document Conversion'. Nuance. Retrieved 2014-02-25.
30. 'Free OCR Software - Optical Character Recognition Software for Windows import from PDF and Twain Scanners'. Paperfile.net. Retrieved 2013-09-12.
31. 'gImageReader'. github.com. Retrieved 2018-03-25.
32. 'GOCR Homepage'. wasd.urz.uni-magdeburg.de. Retrieved 2018-10-17.
33. 'GOCR'. Jocr.sourceforge.net. Retrieved 2013-09-12.
34. Diaz, Antonio (2015-04-16). 'GNU Ocrad 0.26 released' (Mailing list). info-gnu.
35. OCRopus includes the ocropus-hocr tool which produces hOCR from the recognition results.
36. In combination with the hocr-tools.
37. 'MeOCR .NET Library'.
38. 'List of Yunmai OCR SDKs'. yunmai.com. Retrieved 2015-07-12.
39. 'Company Anyline'. Anyline. 2016-06-30. Retrieved 2016-06-30.
40. 'Release Notes Archives - ANYLINE'. ANYLINE. Retrieved 2016-06-30.
41. 'anyline'. npm. Retrieved 2016-06-30.
42. 'API Reference'. documentation.anyline.io. Retrieved 2016-06-30.
43. 'Fonts Anyline'. Anyline. 2016-06-30. Retrieved 2016-06-30.
44. Assefi, Mehdi (2016-12-01). 'OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym'. Research gate. Retrieved 2019-01-31.