SHA-1 Hash Function Has Been Broken

The SHA-1 hash function, like the MD5 hash function, is used in electronic discovery to determine whether two electronic files are duplicates. Now, like MD5, SHA-1 has been compromised. A cryptology group at the CWI Institute in the Netherlands worked with Google to generate PDF files with different content but identical SHA-1 hash values. They announced the results of their project on February 23, 2017. Hash values are used not only in electronic discovery to detect duplicates, but also in the Transport Layer Security (TLS) and Secure Sockets Layer (SSL) protocols that internet browsers use to transmit data online, and in digital signatures on legal documents.
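To make the duplicate check concrete, here is a minimal sketch of hash-based de-duplication in Python. It is not how any particular review platform does it; the function names `file_digest` and `group_duplicates` and the folder layout are assumptions made for illustration. Files that produce the same digest are grouped together as presumed duplicates.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def group_duplicates(folder, algorithm="sha1"):
    """Map each digest to the files sharing it, keeping only groups of two or more."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            groups[file_digest(path, algorithm)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

The `algorithm` parameter accepts any name that Python's `hashlib.new` supports, so the same sketch runs with `"md5"` or `"sha256"` in place of `"sha1"`. The grouping is only as trustworthy as the hash function behind it, which is why a collision matters.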

The CWI/Google team has posted a notice of its success in breaking SHA-1 to the site, . From this site you can download two PDFs, one blue and one red, that have identical SHA-1 hash values. I downloaded and tested them, and as you can see in this screen, they do indeed have identical SHA-1 hash values.
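If you want to repeat the test without a dedicated hashing tool, a short Python script will do it. This is a sketch, not the tool used for the test described above; the helper names `digest` and `compare` and the file names in the note that follows are assumptions. It hashes both files with SHA-1 and, for contrast, with SHA-256.

```python
import hashlib


def digest(path, algorithm):
    """Hex digest of the file at `path` under the named hashlib algorithm."""
    with open(path, "rb") as handle:
        return hashlib.new(algorithm, handle.read()).hexdigest()


def compare(path_a, path_b, algorithms=("sha1", "sha256")):
    """Report, for each algorithm, whether the two files hash to the same value."""
    return {name: digest(path_a, name) == digest(path_b, name) for name in algorithms}
```

Assuming the two downloads were saved as `blue.pdf` and `red.pdf` (hypothetical names), `compare("blue.pdf", "red.pdf")` should report a match under `sha1`. The collision was built specifically against SHA-1, so the `sha256` entry would be expected to show that the files differ.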

The team has also posted its research paper to the site; the paper goes into great technical detail. I'll admit that I don't understand much of the math in the paper, but here are a few key points from it that will be helpful to the electronic discovery professional:

1. Their method requires immense computational power, but is 100,000 times faster than a brute force attack.

2. Theoretical attacks against SHA-1 have been proposed since 2005, but it continues to be widely used. MD5 was broken in 2004. SHA-1 was published by NIST in 1995.

3. The team's attack is estimated to take 150 days on a single quad-core CPU.

4. Using computational power rented from Amazon, the collision attack would cost $110,000 to implement.
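For readers who want to see where the "100,000 times faster" figure in point 1 comes from, the arithmetic can be sketched in a few lines of Python. Two assumptions go into it. First, a brute force collision search on a 160-bit hash such as SHA-1 takes on the order of 2^80 hash computations (the birthday bound). Second, the team's attack costs roughly 2^63.1 SHA-1 computations, which is the figure the team reported for it.

```python
HASH_BITS = 160                     # SHA-1 produces a 160-bit hash value
BRUTE_FORCE_LOG2 = HASH_BITS / 2    # birthday bound: about 2^80 computations
ATTACK_LOG2 = 63.1                  # assumed attack cost: about 2^63.1 computations

# Ratio of brute force work to the work required by the attack
speedup = 2 ** (BRUTE_FORCE_LOG2 - ATTACK_LOG2)
print(f"The attack is roughly {speedup:,.0f} times faster than brute force")
```

Under those assumptions the ratio comes out a bit over 100,000, which is consistent with the figure quoted above.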