
Attorneys are often forced to work with transcripts of deposition and trial testimony, taken in cases to which they are not a party, which are only available in condensed format. It may be helpful to get an accurate text file of a condensed transcript to load into a deposition transcript review program like LiveNote, or to analyze in another fashion. If we have a PDF of a transcript like this:

. . . simply saving the text of the transcript to a text file will leave us with a file in which the page and line numbers are jumbled together:

This problem can be overcome with the use of Abbyy FineReader. Import the PDF of the condensed transcript into FineReader.

In the Area menu, select Draw Area . . . Draw Text Area. Draw boxes around each of the condensed pages on the full page. Be sure to draw the boxes in the same order as the pages are numbered, and do not let them overlap.

Select all of these boxes and then go to Area . . . Save Area Template. The template will be saved as a file with a .blk extension.

Select all of the thumbnails of the PDF pages and then go to Area . . . Load Area Template and select the .blk file. Text Areas for each of the four sections of the condensed transcript will be added to all of the pages. With all of the thumbnail pages still selected, right click and click 'Read Selected Pages'.

You will then be able to save a text file with the pages and line numbers correctly aligned.
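A quick way to sanity-check the exported text file is to confirm that the line numbers count up in order and start over at each new page. The Python sketch below is only an illustration. It assumes every transcript line begins with its line number and that each page has 25 lines, which is common but not universal; the function name `find_sequence_breaks` is made up for this example.

```python
import re

# Assumption: each transcript line starts with a 1- or 2-digit line number.
LINE_NUMBER = re.compile(r"^\s*(\d{1,2})\b")

def find_sequence_breaks(text, lines_per_page=25):
    """Return (file_line, expected, found) for every out-of-order line number."""
    breaks = []
    expected = 1
    for file_line, raw in enumerate(text.splitlines(), start=1):
        match = LINE_NUMBER.match(raw)
        if not match:
            continue  # headers, blank lines, anything without a leading number
        found = int(match.group(1))
        if found != expected:
            breaks.append((file_line, expected, found))
        # Resynchronize on whatever number was actually found.
        expected = found % lines_per_page + 1
    return breaks

if __name__ == "__main__":
    jumbled = "1 Q. State your name.\n1 A. Yes, I did.\n2 A. John Doe.\n2 Q. And when?"
    for file_line, expected, found in find_sequence_breaks(jumbled):
        print(f"file line {file_line}: expected {expected}, found {found}")
```

An empty result suggests the areas were read in the right order; a long list of breaks usually means the text of neighboring condensed pages is still interleaved.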



Be careful if you are exchanging PDFs that you have cropped with other parties. If you crop a PDF in Adobe Acrobat, even after you save it as a new file and email it, another user can still open it using different software and recover the cropped-out content.

As the screen grabs below show, it's possible to crop a PDF in Adobe Acrobat and then undo the cropping action using Nuance Power PDF Advanced.
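Cropping can be undone because a crop typically just narrows the page's visible region (the CropBox in PDF terms) while the full page (the MediaBox) and everything drawn on it stay in the file. The Python sketch below illustrates the idea with plain coordinates. The box values and the function names are hypothetical, and reading the real boxes out of a PDF would take a PDF library, which is not shown here.

```python
def box_area(box):
    """Area of a (left, bottom, right, top) box in PDF points."""
    left, bottom, right, top = box
    return max(0.0, right - left) * max(0.0, top - bottom)

def hidden_fraction(media_box, crop_box):
    """Share of the full page that is hidden by the crop (0.0 = nothing hidden)."""
    total = box_area(media_box)
    if total == 0:
        return 0.0
    # Clip the crop box to the media box before measuring the visible region.
    visible = box_area((
        max(media_box[0], crop_box[0]),
        max(media_box[1], crop_box[1]),
        min(media_box[2], crop_box[2]),
        min(media_box[3], crop_box[3]),
    ))
    return 1.0 - visible / total

if __name__ == "__main__":
    # Hypothetical letter-size page cropped to its top half.
    media = (0, 0, 612, 792)
    crop = (0, 396, 612, 792)
    print(f"{hidden_fraction(media, crop):.0%} of the page is hidden but still in the file")
```

If the two boxes differ, treat the hidden region as still present in the file unless the hidden content was explicitly removed before the PDF was sent.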



It's very common to come across references in production protocols to images in the CCITT Group 4 TIFF format. However, not everyone who complies with these protocols necessarily understands what this format is.

CCITT stands for the Comité Consultatif International Téléphonique et Télégraphique (in English, the International Telegraph and Telephone Consultative Committee), the predecessor of today's ITU-T. This organization developed the CCITT Group 4 format in the 1980s for use with fax machines. The same format may also be referred to as Recommendation T.6. Images embedded in PDFs may also use CCITT Group 4 compression. CCITT Group 4 compression is lossless - the data integrity of the original is preserved - and typical text pages can be compressed at a ratio of about 15:1.

You can determine if an image is in the CCITT Group 4 format in Windows by right clicking on the file, choosing Properties, and clicking on the Details tab.
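When there are too many files to check by hand, the same information can be read straight from the TIFF header. In the TIFF specification the Compression tag is number 259, and a value of 4 means CCITT Group 4 (T.6). The Python sketch below is a minimal illustration that only looks at the first image in a classic (non-BigTIFF) file; the function name `tiff_compression` and the file name in the usage note are made up for this example.

```python
import struct

COMPRESSION_TAG = 259
COMPRESSION_NAMES = {
    1: "uncompressed",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 (T.4)",
    4: "CCITT Group 4 (T.6)",
    5: "LZW",
}

def tiff_compression(data):
    """Return the Compression tag value from the first image directory of a TIFF."""
    if data[:2] == b"II":
        endian = "<"   # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i   # each directory entry is 12 bytes
        tag, _field_type, _count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag == COMPRESSION_TAG:
            # Compression is a single SHORT stored at the start of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1  # the TIFF default when the tag is absent is no compression
```

To use it on a file, read the bytes first, for example `tiff_compression(open("IMG0001.tif", "rb").read())`, and look the result up in `COMPRESSION_NAMES`.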

CCITT Group 4 compression is designed for bitonal images - black and white text images. The algorithm will not compress halftone images nearly as well. A continuous tone image would use a limitless number of shades of grey to depict an image. Commonly used grayscale methods use either 16 or 256 shades. Halftone replicates continuous tone through the use of tiny dots. See this example of how different an image can look using these three methods:

Lempel-Ziv-Welch or LZW compression is preferred for grayscale TIFF images. This method allows grayscale or color images to be compressed at a ratio of about 4:1. A halftone image compressed with CCITT Group 4 may actually end up larger than the original file.

When dealing with document productions of many thousands of pages, keeping file sizes to a minimum is a priority. This is why the CCITT Group 4 format for TIFF images is so often used in the specs of document protocols.
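To put rough numbers on this, the Python sketch below estimates the size of a production using the 15:1 and 4:1 ratios mentioned above. The letter-size page, the 300 dpi resolution, and the 10,000 page count are assumptions chosen for illustration, and real ratios vary with page content.

```python
def page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of one scanned page in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel / 8

def production_bytes(pages, raw_page_bytes, ratio):
    """Estimated size of a production at a given compression ratio."""
    return pages * raw_page_bytes / ratio

if __name__ == "__main__":
    pages = 10_000                           # assumed production size
    bitonal = page_bytes(8.5, 11, 300, 1)    # 1 bit per pixel, black and white
    gray = page_bytes(8.5, 11, 300, 8)       # 8 bits per pixel, 256 shades

    group4 = production_bytes(pages, bitonal, 15)   # CCITT Group 4 at about 15:1
    lzw = production_bytes(pages, gray, 4)          # LZW grayscale at about 4:1

    print(f"Bitonal, Group 4: {group4 / 1e9:.2f} GB")
    print(f"Grayscale, LZW:   {lzw / 1e9:.2f} GB")
```

Under these assumptions the grayscale set comes out about thirty times larger, since each grayscale page starts with eight times the data and compresses less well.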


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.



© 2015 by Sean O'Shea
