
I am currently using an older version (v.11, Professional Edition) of the Abbyy FineReader OCR software to render text for several thousand PDFs. The software does something that I have not found to be possible with Adobe Acrobat Pro or version 12 of the FoxIt PDF Editor software: it will successfully pick up a check mark or 'X' in a check box.


The software often gets the mark perfectly as:

[X]


. . . but will sometimes substitute a 'K' or, if there's a check in the box, it will enter a minuscule character like this: ø


An empty checkbox may get converted to an 'n' or just an empty box: □


But it interprets the marks consistently, so if you're analyzing thousands of documents you can still track which checkboxes were ticked off.




 
 

You can use a simple PowerShell script to extract the contents of multiple zip files. The Expand-Archive command, followed by the path to a zip file, will extract the contents to a location specified in the script.


Create a text file in which each line begins with Expand-Archive -LiteralPath, followed by the path to a zip file in quotes, then -DestinationPath, followed by the location that you want the files to be extracted to, also in quotes:


Expand-Archive -LiteralPath "C:\foofolder\Test.zip" -DestinationPath "C:\foofolder\extracthere"

Expand-Archive -LiteralPath "C:\foofolder\Litigation Support.7z" -DestinationPath "C:\foofolder\extracthere"


Open Windows PowerShell ISE (x86), paste this script into the script pane, and run it.



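Note that this method will not work for 7zip files. Expand-Archive only handles .zip archives, so a line that points to a .7z file, like the second one above, will fail.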



I tested this tonight and successfully extracted more than 50,000 files from almost 50 zip files.
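If all of the zip files sit in one folder, a loop can take the place of the hand-built list of commands. This is only a sketch under some assumptions: the function name Expand-ZipFolder is made up for illustration, the folder paths in the usage line are the same placeholder paths used above, and it assumes a version of PowerShell that includes Expand-Archive. Unlike the commands above, it extracts each archive into its own subfolder, because Expand-Archive reports an error when a file with the same name already exists in the destination.

```powershell
# Hypothetical helper: extract every .zip file in $Source into its own subfolder of $Destination.
function Expand-ZipFolder {
    param(
        [Parameter(Mandatory)] [string] $Source,
        [Parameter(Mandatory)] [string] $Destination
    )

    # Expand-Archive only reads .zip files, so anything else (such as .7z) is skipped here.
    Get-ChildItem -LiteralPath $Source -File |
        Where-Object { $_.Extension -eq '.zip' } |
        ForEach-Object {
            # Name the subfolder after the archive, for example Test.zip goes to ...\Test
            $target = Join-Path -Path $Destination -ChildPath $_.BaseName
            Expand-Archive -LiteralPath $_.FullName -DestinationPath $target
        }
}
```

To use it, run the function definition and then a line such as: Expand-ZipFolder -Source "C:\foofolder" -Destination "C:\foofolder\extracthere"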



 
 

If you need to generate a list of random numbers in Excel, you may want to avoid using the RANDBETWEEN formula. It can return duplicate numbers.



This is not ideal if your aim is to select numbered entries on a spreadsheet to review at random. You can generate a random list of numbers without duplicates by using the RAND formula instead. RAND returns a decimal between 0 and 1, so repeats are extremely unlikely. Enter the RAND formula in a column adjacent to the data set you are QCing.



The RAND formula will generate new numbers each time the spreadsheet is edited. To get a static list of randomly generated numbers, copy the results and use the paste values option. Then you can sort the data by the formula results to randomly select entries to review.
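The same idea, a random key next to each entry followed by a sort on that key, can be sketched outside of Excel as well. The PowerShell below is only an illustration and not part of the Excel method: the entry numbers 1 through 20 and the sample size of 5 are made-up values.

```powershell
# Hypothetical entry numbers standing in for the numbered rows on the spreadsheet.
$entries = 1..20

# Pair each entry with a random decimal between 0 and 1 (the RAND column), then sort by it.
$shuffled = $entries |
    ForEach-Object { [pscustomobject]@{ Entry = $_; Key = Get-Random -Minimum 0.0 -Maximum 1.0 } } |
    Sort-Object -Property Key

# The first five entries of the shuffled list form the review sample, with no duplicates.
$sample = $shuffled | Select-Object -First 5 -ExpandProperty Entry
$sample
```

Because every entry appears exactly once in the sorted list, the sample cannot contain the same entry twice, which is the advantage the RAND approach has over RANDBETWEEN.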



 
 

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.



© 2015 by Sean O'Shea . Proudly created with Wix.com
