
Study of OCR Quality


Roger T. Hartley and Kathleen Crumpton have published a paper, “Quality of OCR for Degraded Text Images” (available at https://arxiv.org/ftp/cs/papers/9902/9902009.pdf), which analyzes how well a noise model can predict the number of OCR errors in a scanned document. The paper notes that Adobe’s Capture OCR tool produces more false negatives than false positives. The authors conclude that “the noise model is not appropriate for word-level recognition engines like Capture.”


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.



© 2015 by Sean O'Shea.
