Wednesday, December 30, 2009

Optical Character Recognition (OCR) and Capture

So what is document capture software, and what does it have to do with OCR applications?  First, I think we need to differentiate between scanning software and capture software.  Here is a good blog post that goes over the differences with regard to SharePoint scanning.  Scanning software just gives you the ability to convert paper to digital form and then run OCR on it.  Capture software takes this a step further, and is really a catalyst for enhanced processing with your recognition engine.  Typical capture software will allow you to perform zone OCR, scan multiple documents in a single stack through separation, perform OCR-based separation, or even analyze the OCR text for expressions and then automatically extract the data.  Document Capture software, as an example, provides enhanced data extraction, as do products from other vendors like Kofax, AnyDoc, and Captiva.
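To make the last two ideas a bit more concrete, here is a minimal sketch in Python of what "OCR-based separation" and "analyzing the OCR text for expressions" might look like once the recognition engine has already handed back plain text for each page.  This is not how any of the products named above work internally; the field names, the regular expressions, and the `extract_fields` and `split_documents` helpers are all made up for illustration, and it assumes invoice-style pages.

```python
import re

# Hypothetical patterns for an invoice-style page; real capture products
# let you configure these per document type.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:Number|No\.?|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE
    ),
    "total": re.compile(
        r"Total\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}

# Hypothetical marker: text that only shows up on the first page of a document.
FIRST_PAGE_MARKER = re.compile(r"Invoice\s*(?:Number|No\.?|#)", re.IGNORECASE)


def extract_fields(ocr_text):
    """Search the OCR text for each expression and return the captured values."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields


def split_documents(pages, first_page_marker=FIRST_PAGE_MARKER):
    """Group a scanned stack of page texts into documents.

    A new document starts whenever a page contains the first-page marker.
    """
    documents = []
    for page_text in pages:
        if first_page_marker.search(page_text) or not documents:
            documents.append([])
        documents[-1].append(page_text)
    return documents


if __name__ == "__main__":
    stack = [
        "Invoice No: A-100\nDate: 12/30/2009\nTotal: $1,234.50",
        "continued line items",
        "Invoice # B-200\nTotal: 10.00",
    ]
    for doc in split_documents(stack):
        print(extract_fields("\n".join(doc)))
```

A plain scanning tool would stop at producing the page text; the separation and extraction steps on top of that text are the part a capture application adds.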

So, I guess the whole point here is that OCR software, in most cases, just provides a basic framework for the conversion process.  You really need a capture application to harness the true power of any OCR or recognition engine.

