Where are my unsearchable PDFs?

May 19, 2015

 

Recently, I finished running OCR on more than a thousand PDFs in Adobe Acrobat. This took a few days because many of the PDFs were well over 100 pages long, and I had to stop and restart the process a few times. Now I'm wondering whether there are any PDFs that I neglected to OCR. My first thought was to find an Acrobat action that would check this. There is an Adobe .sequ file on this site which purports to review PDFs and move the non-searchable ones into a separate subfolder:

http://www.byteryte.nl/en/downloads/document-programming/124-ocr-check.html

 

However, I could not get this to work in my version of Adobe Acrobat XI. I also tried the method described on this Adobe blog:

http://blogs.adobe.com/acrolaw/2007/02/is_that_pdf_sea/

 

But this only generates an HTML file for each PDF file and does not provide a single summary across multiple PDF files.

 

I finally succeeded using 'Count Anything', available at: http://ginstrom.com/CountAnything/

 

This application is free and easy to use. You simply drag your PDF files into the application's window and click 'Count'. Very quickly you'll get a report listing each file along with its word and character counts.

You can easily copy and paste the report into Excel and filter on the Words or Chars column to detect your non-OCRed files: a PDF with no text layer has nothing to count, so it should show zero words and characters. This application also works with Excel, Word, PowerPoint, text, and other files, and it can count Asian characters as well. Download it today!
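If you'd rather not filter by hand in Excel, the same check could be scripted. Below is a minimal Python sketch, assuming the report has been pasted into tab-separated text with a header row. The Words and Chars column names come from the report itself; the File header, the sample file names and counts, and the find_unsearchable function are my own assumptions for illustration, so adjust them to match what your copy of the report actually contains.

```python
import csv
import io


def find_unsearchable(report_text, count_column="Words", name_column="File"):
    """Return the names of files whose count in `count_column` is zero or blank.

    `report_text` is assumed to be tab-separated text with a header row,
    such as a report pasted out of a counting tool. The column names are
    assumptions; pass different ones if your report labels them differently.
    """
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    flagged = []
    for row in reader:
        # Tolerate blanks and thousands separators such as "1,234".
        raw = (row.get(count_column) or "").replace(",", "").strip()
        count = int(raw) if raw.isdigit() else 0
        if count == 0:
            flagged.append(row[name_column])
    return flagged


# Hypothetical report contents, invented purely for this example.
sample_report = (
    "File\tWords\tChars\n"
    "deposition_smith.pdf\t48210\t262114\n"
    "exhibit_014.pdf\t0\t0\n"
    "contract_scan.pdf\t0\t0\n"
    "memo_2014-03-02.pdf\t1,530\t8977\n"
)

if __name__ == "__main__":
    for name in find_unsearchable(sample_report):
        print(name)
```

Any file this flags is a candidate for another OCR pass. A zero count only tells you that no text was extracted, so it is worth opening a couple of the flagged PDFs to confirm before re-running OCR on the whole batch.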

 


Contact Me With Your Litigation Support Questions:

seankevinoshea@hotmail.com


© 2015 by Sean O'Shea. Proudly created with Wix.com