Study of OCR Quality

Roger T. Hartley and Kathleen Crumpton have published a paper, Quality of OCR for Degraded Text Images, which analyzes how well a noise model can predict the number of OCR errors in a scanned document. The paper notes that Adobe’s Capture OCR tool produces more false negatives than false positives. The authors conclude that “the noise model is not appropriate for word-level recognition engines like Capture.”