
Library of Congress - Index of File Formats.

The Library of Congress maintains a site on digital preservation. On this site there is a very helpful index that provides detailed information on common file formats used in data collections.

Following the link for PDF, PDF (Portable Document Format) Family, you find this in the Notes section:

"'Architecturally there is only one limit in the PDF standard: the overall file size must be below ~10GB as the cross-reference tables which define the PDF structure use 10 bits.' The preceding paragraph offers a generous view of the potential size for a PDF. Many commentators argue that the limit for practicality is lower than those stated above. What matters is whether you can open a given PDF in any reasonable application, including Acrobat and Adobe Reader, mentioned above. Online forums also include reports like these examples: 'It seems that the iPad has a limit of 30MB for displaying PDF files,' and 'users of GoodReader have reported flawless performance with files over 1 gig in size.'"
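The ~10GB figure can be checked with a little arithmetic. The sketch below assumes the quoted "10 bits" refers to the 10-digit decimal byte offset stored in each entry of a classic PDF cross-reference table (10 binary bits would only cover 1,024 values). The constant name is ours, chosen for illustration.

```python
# Assumption: each classic xref entry stores a byte offset as a
# 10-digit decimal number, so no object can start beyond that offset.
OFFSET_DIGITS = 10

max_offset = 10 ** OFFSET_DIGITS - 1        # largest 10-digit offset
limit_gb = 10 ** OFFSET_DIGITS / 10 ** 9    # decimal gigabytes
limit_gib = 10 ** OFFSET_DIGITS / 2 ** 30   # binary gibibytes

print(f"largest offset: {max_offset:,} bytes")
print(f"about {limit_gb:.0f} GB, or {limit_gib:.2f} GiB")
```

Under that assumption the ceiling is about 10 GB in decimal units, which matches the figure in the quotation. As the quotation itself notes, the practical limit depends on the application opening the file and is often much lower.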

The Transparency section for 'Microsoft Outlook PST 97-2002 (ANSI)' states that:

"Joachim Metz, in Personal Folder File (PFF) Forensics: Analyzing the Horrible Reference File Format, says 'the actual data of an item within a PFF is scattered over different data structures...the bad news for forensic analysis is that PFF obfuscates the information in the data structures which makes a basic text string search impossible.'"
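To see why obfuscation defeats a basic string search, here is a toy sketch. It uses a made-up byte-for-byte substitution table; it is not the scheme or table used by any PST or PFF implementation, and the function names are hypothetical.

```python
# Toy substitution table: a made-up permutation of the 256 byte values.
# This is NOT the table used by PST files; it only illustrates the idea.
ENCODE = bytes((7 * b + 13) % 256 for b in range(256))
DECODE = bytes.maketrans(ENCODE, bytes(range(256)))


def obfuscate(data: bytes) -> bytes:
    """Replace every byte with its entry in the toy table."""
    return data.translate(ENCODE)


def deobfuscate(data: bytes) -> bytes:
    """Reverse the toy substitution."""
    return data.translate(DECODE)


message = b"Subject: quarterly invoice attached"
stored = obfuscate(message)

print(b"invoice" in message)              # search on the plain text
print(b"invoice" in stored)               # search on the stored bytes
print(b"invoice" in deobfuscate(stored))  # search after decoding
```

The search on the stored bytes fails even though the word is there, because none of the original byte values survive on disk. A tool has to understand the format and undo the obfuscation before searching.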

For each format there is a detailed chart providing information on local use, sustainability, functionality, and file type signifiers, along with a general description and detailed notes. See, for example, the entry for ZIP_6_2_0, ZIP file format, Version 6.2.0 (PKWARE).
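File type signifiers include the "magic number" bytes at the start of a file. As a rough illustration, the sketch below checks a file's first bytes against three well-known signatures for the formats mentioned on this page. The function names and the small lookup table are ours, not something taken from the Library of Congress site, and real identification tools check far more than this.

```python
# Leading bytes ("magic numbers") for three formats discussed above.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"PK\x03\x04": "ZIP (local file header)",
    b"!BDN": "Outlook PST/OST",
}


def sniff(header: bytes) -> str:
    """Guess a format from the first few bytes of a file."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return sniff(handle.read(8))


print(sniff(b"%PDF-1.7\n"))
print(sniff(b"PK\x03\x04\x14\x00"))
print(sniff(b"hello world"))
```

A match on the leading bytes is only a hint. Many other formats (for example, office documents packaged as ZIP archives) share the ZIP signature, and an empty ZIP archive starts with a different record, so the format descriptions list several signifiers rather than relying on one.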
