
Here's a continuation of my postings about the Electronic Discovery Institute's online e-discovery certification program, which you can subscribe to for just $1. I last blogged about this program on August 19, 2017. Go to https://www.lawinstitute.org/ to sign up for it.

Tonight I took the course on data remediation. It's taught by Anthony Diana, a partner at Reed Smith specializing in electronic discovery; David Castro, counsel for the Hess Corporation, who is responsible for electronic discovery; and Jamie Brown, an attorney with broad experience in data privacy and information technology law.

  • Maintaining Legacy Data

Cybersecurity and data privacy concerns have discouraged the practice of keeping vulnerable data indefinitely. The EU's data privacy regime makes this concern particularly important. Keeping data has its own costs.

Data on older databases and file shares may not have proper metadata, making it difficult to determine what should be disposed of.

One of the costs of electronic discovery is liability. C-level executives are putting increased pressure on e-discovery staffs to discard email archives that they are not obligated to retain. Preservation itself can be expensive, particularly where encryption is in place. The goal is to meet regulatory requirements and follow best practices, and only take steps that are necessary for individual legal matters.

The explosive increase in data means that data storage is not cheap. There is often an incredible amount of redundancy in the data retained by a company. Searching through the data will not necessarily be inexpensive. Useless data that has no business value and is not subject to a legal hold should be remediated.

  • Deleting Legacy Data

A company should have a defensible process for data remediation. Large organizations often rely on preservation in place when responding to legal holds, according to Brown. Different steps should be taken to preserve data from different applications.

Different regulatory agencies impose different burdens. The SEC, FTC, and other agencies have very specific requirements that businesses must meet. Castro said that at Hess, their downstream organizations had very heavy regulatory obligations. Data had to be retained for a long time.

Diana said that in the US the risk calculus is different than in other countries, because the focus is on the risk from legal holds. In Europe, preservation is less of a concern. Data privacy rights are considered more important.

Brown said that there was no set retention requirement for any given country, but rather retention periods driven by individual regulatory agency requirements.

Castro said one's prime paradigm should be following a company's own retention policy, which the courts and agencies would hold a company responsible for following. The United States advocates wide open discovery at the expense of privacy.

Brown noted that many employees don't understand the difference between retention and preservation. Data subject to a legal hold should be preserved; records are kept subject to a retention policy.
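The distinction Brown draws between preservation and retention can be put as a simple decision rule. The Python sketch below is only an illustration: the Record fields, the retention periods, and the three labels are my own assumptions, not something presented in the course.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Record:
    name: str
    created: date
    retention_days: int   # hypothetical period taken from the company's retention policy
    on_legal_hold: bool


def disposition(record, today):
    """Classify a record: a legal hold always wins over the retention schedule."""
    if record.on_legal_hold:
        return "preserve"
    if today <= record.created + timedelta(days=record.retention_days):
        return "retain"
    return "eligible for remediation"


today = date(2017, 8, 27)
held = Record("old contract", date(2005, 1, 1), 365, on_legal_hold=True)
current = Record("recent invoice", date(2017, 6, 1), 365, on_legal_hold=False)
expired = Record("stale memo", date(2010, 1, 1), 365, on_legal_hold=False)

for r in (held, current, expired):
    print(r.name, "->", disposition(r, today))
```

The point of checking the legal hold first is that an expired retention period never overrides a preservation obligation.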

  • Creating a Data Remediation Program

Senior management must agree to the disposal of data. This is the first step before one can obtain funding for an effective program. There should be a return on investment (ROI), shown through a reduction in cybersecurity risk or through compliance with privacy laws or regulatory obligations.

An analytics tool can provide some context for data, and show relationships between individuals.

IT should be consulted, since they often have the budget that will allow the remediation program to be implemented. A law firm or consultant may help drive the process.

  • Hard Copy Documents (Boxes)

There are a few players that dominate the area of hard copy storage, and they have made the process more expensive than it used to be, according to Castro. Physical records under a company's custody and control may reside in many different locations and be managed by different companies which follow different practices. Physical records usually aren't organized by custodian.

Diana noted that undetailed indices often make the remediation of boxes in off-site storage very difficult. However, it may be useful to simply know when a box was sent off-site. Boxes that have been stored off-site for more than 20 years may be assumed to be full of stuff that is unneeded. Sampling may be performed on some boxes to draw some conclusions about the importance of the records. One should not sample only to confirm that an index is correct. It's necessary to really judge the importance of the contents of boxes.
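As a rough illustration of the sampling idea, here is a small Python sketch that picks a random set of older boxes for someone to open and assess. The Box fields, the cutoff date, and the sample size are all hypothetical; the course did not prescribe a sampling method.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Box:
    box_id: str
    sent_offsite: date


def pick_sample(boxes, cutoff, sample_size, seed=0):
    """Return a random sample of boxes sent off-site on or before the cutoff date."""
    old_boxes = [b for b in boxes if b.sent_offsite <= cutoff]
    # A fixed seed lets the selection be reproduced and documented later.
    rng = random.Random(seed)
    return rng.sample(old_boxes, min(sample_size, len(old_boxes)))


# Hypothetical inventory: one box sent off-site each January from 1990 to 2019.
boxes = [Box(f"BX-{i:03d}", date(1990 + i, 1, 15)) for i in range(30)]
cutoff = date(1997, 8, 27)  # assumed: 20 years before the review date
sample = pick_sample(boxes, cutoff, sample_size=3)

for box in sample:
    print(box.box_id, box.sent_offsite)
```

The sample only says which boxes to open. Judging the importance of what is inside them still has to be done by a person.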

  • Hard Drives

For the implementation of legal holds, it's necessary to keep track of who used hard drives. There should be a policy in place for IT to re-purpose hard drives after checking to see if anything needs to be preserved. Drives should be authenticated with unique ID numbers.

Castro uses a third party vendor to dispose of hard drives. Data must be truly destroyed.

  • Back-Up Tapes

Back-up tapes hold a lot of dispersed data - not data from just one custodian. It's hard to get rid of data stored on tapes. Data on back-up tapes may not be unique. Brown observed that back-ups are not always used just for disaster recovery. They may, for instance, be accessed to respond to legal holds. Back-up tapes will have snapshots from a particular point in time.

Castro said that Hess only uses back-up tapes for emergency purposes. It has not been required in a legal case to provide data from tapes - in part because it follows a consistent policy as to how they are used.

  • Email Archives

Financial institutions in particular rely on email archives as a general back-up. According to Diana it's very hard to remove data from email archives. The risk associated with getting rid of email archives is very high. It's common for a user to store their own records in email archives. Diana noted that he is working with many clients these days to remove data from email archives.

Email archives may have to be WORM compliant so that the data can't be altered. Journaling on the server level may lead to emails being retained even after users have deleted them.

Brown noted that there is concern that data migration in Office 365 may not meet regulatory requirements for the preservation of metadata.

Castro noted that Hess follows a policy of deleting emails automatically after a certain time period, and also has a policy against saving inappropriate emails in its systems.

  • The Cloud

Some cloud providers don't allow data to be searched and exported very well. Before sending data to the cloud one should have a plan as to how data will be purged on a regular basis. The courts will not look favorably upon parties that can't extract discoverable data from the cloud.

  • 2015 Amendments to FRCP

The amendments make it easier for companies to dispose of data. Under Rule 37, companies will still be subject to sanctions if they can't find data. Brown noted that most companies will rely on preservation in place, because doing otherwise is cost prohibitive.


 
 

Here's a continuation of my postings about the Electronic Discovery Institute's online e-discovery certification program, which you can subscribe to for just $1. I last blogged about this program on August 19, 2017. Go to https://www.lawinstitute.org/ to sign up for it.

Tonight I took the course on data processing and management. The course is taught by Amy DeCesare, a VP for litigation management at Allied World; Veronica Gromada, counsel for Walmart's litigation support group; and Ashish Prasad, counsel for eTERA Consulting.

Introduction

Data processing refers to the export of potentially relevant files from a data set that has been collected and pre-processed. Pre-processing refers to the removal of unwanted files from a data set. Processing is a precondition for the hosting of active files in a format that can be reviewed. The processing of data turns it into something that attorneys can use for a well managed review. Processing must be done in a legally defensible manner that is cost effective for clients.

Prasad said that these questions should be asked prior to beginning processing:

1. How should metadata be handled?

2. Can the chain of custody be proven on an individual file level?

3. How will exclusions be handled, as for example with system files?

4. Will de-duplication be performed? Will it be done on a custodian level or a global level?

5. What type of data culling will be done?

A large set of data must be brought down to a manageable level. On an organization's systems, there are often multiple, duplicative data sources. Before processing, one should consider if one can cull the data by date range. One should only work with data pertinent to a matter. De-NISTing should be performed to remove unreviewable system files.
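To make the de-duplication question and the de-NISTing step a bit more concrete, here is a minimal Python sketch. It assumes files are compared by a SHA-256 hash of their contents and that one has a reference set of known system file hashes (such as the list NIST publishes). The CollectedFile class, the sample files, and the function names are all hypothetical; commercial processing tools handle this in their own ways.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectedFile:
    custodian: str
    path: str
    content: bytes


def content_hash(f):
    # Hash of the raw bytes only; real tools may also hash selected metadata fields.
    return hashlib.sha256(f.content).hexdigest()


def de_nist(files, known_system_hashes):
    """Drop files whose hash appears in a reference list of known system files."""
    return [f for f in files if content_hash(f) not in known_system_hashes]


def deduplicate(files, scope="global"):
    """Keep the first copy of each file, across all custodians ("global")
    or separately within each custodian ("custodian")."""
    if scope not in ("global", "custodian"):
        raise ValueError("scope must be 'global' or 'custodian'")
    seen = set()
    kept = []
    for f in files:
        digest = content_hash(f)
        key = digest if scope == "global" else (f.custodian, digest)
        if key not in seen:
            seen.add(key)
            kept.append(f)
    return kept


files = [
    CollectedFile("alice", "memo.docx", b"quarterly memo"),
    CollectedFile("alice", "copy of memo.docx", b"quarterly memo"),
    CollectedFile("bob", "memo.docx", b"quarterly memo"),
    CollectedFile("bob", "kernel32.dll", b"system library bytes"),
]
# Stand-in for a real reference hash list.
system_hashes = {hashlib.sha256(b"system library bytes").hexdigest()}

reviewable = de_nist(files, system_hashes)
global_set = deduplicate(reviewable, scope="global")
custodian_set = deduplicate(reviewable, scope="custodian")
print(len(reviewable), len(global_set), len(custodian_set))
```

Global de-duplication leaves the smallest review set, while custodian level de-duplication keeps one copy per custodian, which preserves the record of who held a given document.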

DeCesare said to perform Early Data Analytics before processing to see what one has. Are there encrypted files? Are there files in foreign languages? Data should be handled strategically because processing can be expensive.

Data Conversion

The point of processing is to index and format the data for attorney review. One has to consider how native files are to be produced. A careful record should be maintained showing how data was processed.

Prasad said there were five steps in data conversion:

1. Data arrives.

2. Data is pre-processed.

3. Data is imaged, or remains as native files.

4. Load files are prepared and data is loaded to Relativity or another review platform.

5. Production data is delivered.

Processing Funnel

The processing funnel is a graphic representation commonly used for processing.

During processing, data is extracted, filtered by target users, and filtered by date ranges; keyword searching is performed; and then the data is output into a review tool. Privilege review then takes place, before a production is sent to the opposing party.

The funnel begins with a vast set of possibly useful data that must be distilled for what is needed in discovery. One may begin with a custodian interview. Key data sources should be outlined, and one should prioritize their data sources. A top tier custodian may reveal smoking gun documents that will let one glean key terms or concepts that will appear repeatedly. A well organized custodian may have retained documentation related to a legal dispute in an Outlook .pst file or a SharePoint site.
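The funnel stages described above can be pictured as a series of filters, each one passing fewer documents on to the next. This Python sketch is only a toy model under my own assumptions: the Document fields, the sample data, and the simple substring keyword match are invented for illustration and are not how any particular review platform works.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    custodian: str
    sent: date
    text: str


def run_funnel(documents, custodians, start, end, keywords):
    """Apply each funnel stage in order and record how many documents survive."""
    counts = [("extracted", len(documents))]

    by_custodian = [d for d in documents if d.custodian in custodians]
    counts.append(("custodian filter", len(by_custodian)))

    by_date = [d for d in by_custodian if start <= d.sent <= end]
    counts.append(("date filter", len(by_date)))

    # Case-insensitive substring match; real keyword searching is more sophisticated.
    lowered = [k.lower() for k in keywords]
    by_keyword = [d for d in by_date if any(k in d.text.lower() for k in lowered)]
    counts.append(("keyword search", len(by_keyword)))

    return by_keyword, counts


docs = [
    Document("alice", date(2016, 3, 1), "Re: pricing agreement draft"),
    Document("alice", date(2014, 1, 5), "Pricing agreement from last year"),
    Document("bob", date(2016, 4, 2), "Lunch on Friday?"),
    Document("carol", date(2016, 5, 9), "Pricing agreement signed"),
]
for_review, counts = run_funnel(
    docs,
    custodians={"alice", "bob"},
    start=date(2016, 1, 1),
    end=date(2016, 12, 31),
    keywords=["pricing agreement"],
)
for stage, n in counts:
    print(f"{stage}: {n}")
```

Keeping the count at each stage gives the kind of record an attorney would need in order to explain how the data set was narrowed.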

Culling

Culling is defensibly reducing the data that needs to move on for review. It may not be necessary to collect from every custodian who has been put on legal hold. An attorney must be able to defend the actions of the electronic discovery technician, and explain them in court. The level of culling will depend on what has been agreed upon by the parties. At the Rule 26(f) meet and confer the parties may agree on discovery being limited to certain subjects, date ranges, and custodians from which data may be collected. Information from custodian interviews should be used for the culling process. Data should be clustered together, and divided into workstreams in which one can prioritize the documents that are likely to be the most relevant hits. Attorneys can review the documents that seem to be most relevant and identify hot documents that can inform the manner of subsequent searches.

A 100 GB hard drive may hold 5 million pages of documents. Culling is necessary to reduce the number of documents to be reviewed. 80-90% of documents may be culled out prior to review in a typical scenario.
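Taking those two figures at face value, a quick calculation shows what is left for review. This is just arithmetic on the numbers quoted above; the function name is my own, and treating the cull rate as applying evenly to pages is a simplifying assumption.

```python
PAGES_PER_100_GB = 5_000_000  # figure quoted in the course


def pages_remaining(collected_gb, cull_percent):
    """Estimate pages left for review after culling a given percentage."""
    total_pages = collected_gb * PAGES_PER_100_GB // 100
    return total_pages * (100 - cull_percent) // 100


print(pages_remaining(100, 80))  # 1000000
print(pages_remaining(100, 90))  # 500000
```

So even after culling 80-90% of a single 100 GB drive, somewhere between half a million and a million pages may still go to review.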

Selecting & Managing Processing Vendors

Consider what one's in-house capabilities are. Can an IT team assist with the collection of data? Subject matter experts may not be familiar with electronic discovery, and a special e-discovery expert may need to be able to testify about the process. A firm or company should develop a relationship with an e-discovery vendor that has adequate cyber security measures in place. There are different models of pricing.

Prasad said to consider these factors when selecting a processing vendor:

1. Its experience and qualifications

2. Insurance capacities

3. Reputation and references

4. Reporting capabilities with respect to the processing workflow

5. Whether the vendor has staff that can testify about the best practices that were followed in processing

6. Staff with industry standard certifications

7. Appropriate facilities and technology

The counsel is the overall project manager. Despite their lack of technical knowledge, they should know concepts such as the de-NISTing of files.

Model Rule 5.3 has been amended to provide that a lawyer must make reasonable efforts to ensure that non-lawyers perform services in a manner that is compatible with the attorney's legal obligations. The attorney must confirm that the vendor is performing their tasks competently.

Master Services Agreement & Service Level Agreement

A master services agreement (MSA) may be accompanied by a project specific template, or statement of work. The MSA is a document that governs the professional relationship between the firm and its vendor. It discusses the purpose of the engagement, its time period, rights, duties & obligations, and the services the parties expect to receive.

In-Sourced Processing Technology

Different technologies will be needed for collection and processing. It's important that the tools are easy to use. Gromada said that bots may be used as automated data dictionaries to scope out what is available in an organization's system.

Processing Pricing Models

Prasad identified three pricing models:

1. Traditional - per unit or tiered pricing

2. Project based pricing

3. Subscription model, which involves billing on a yearly basis

Per unit processing does not allow for cost certainty, as the subscription model does. Gromada noted that many companies will make flat fee agreements but there are many different arrangements. Volume is very important and a key driver of costs. A volume discount pricing model may make sense for some companies. Data may not need to be hosted over long periods of time.

The traditional model for processing is an a la carte model - where one can pay for OCRing, image conversion and other services per GB. Some arrangements may have a fixed fee for data ingestion.
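To see why per unit pricing gives less cost certainty than a subscription, it helps to run some numbers. Every figure in this Python sketch is made up for illustration (the per GB rate, the tier boundaries, and the annual fee), and the tiered function assumes each tier's rate applies only to the gigabytes that fall within that tier, which is just one way such an agreement might be written.

```python
def per_unit_cost(volume_gb, rate_per_gb):
    """Flat per-GB pricing: cost grows in step with volume."""
    return volume_gb * rate_per_gb


def tiered_cost(volume_gb, tiers):
    """Tiered pricing: tiers is a list of (size_in_gb, rate) pairs applied in
    order; a size of None means the tier covers all remaining volume."""
    remaining = volume_gb
    total = 0
    for size, rate in tiers:
        if remaining <= 0:
            break
        portion = remaining if size is None else min(remaining, size)
        total += portion * rate
        remaining -= portion
    return total


def subscription_cost(annual_fee):
    """Subscription pricing: the same yearly bill whatever the volume."""
    return annual_fee


RATE = 50                                      # hypothetical dollars per GB
TIERS = [(100, 50), (400, 40), (None, 30)]     # hypothetical tier schedule
ANNUAL_FEE = 20_000                            # hypothetical yearly subscription

for volume in (250, 600):
    print(
        volume,
        per_unit_cost(volume, RATE),
        tiered_cost(volume, TIERS),
        subscription_cost(ANNUAL_FEE),
    )
```

With these invented numbers the subscription is the most expensive option at 250 GB and the cheapest at 600 GB, which is why volume ends up driving the choice of model.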

The Goals of Processing

The essential goal is to index and format data for review. Legal professionals should confer and come up with a sound approach that allows all parties to confirm that their legal obligations have been fulfilled. Any processing project should have a robust validation and QC phase.


 
 
  • Aug 27, 2017

The EDRM (the organization behind the Electronic Discovery Reference Model), now part of Duke Law School, has published an ESI checklist.

The checklist is a simple one page form divided into seven sections:

1. File System Locations

2. User Removable Storage and Portable Devices

3. Server Software

4. Public/Semi-Shared Sources

5. Security / Access Components

6. Backups

7. Retired Workstations, Mail Servers, Devices

The list is simply a way for someone collecting data to confirm that they are inquiring about all likely data sources. It's useful to have a checklist from an authoritative source, to confirm that one has done due diligence.

Note that the only brand names mentioned on the list (other than the ubiquitous Google Docs and Dropbox) are Golden Eye and WebTrends. These are given as resources for tracking and monitoring activity on computers and web sites. Golden Eye is a program aimed at the general user that allows them to spy on someone using a computer - detecting which files and folders are used, which websites are visited, and which user names & passwords are entered, and letting an individual (or a company) take a screen grab at fixed intervals. Are companies using this software on employee PCs? I wonder.

WebTrends is a large data analytics company that helps companies track how their web sites are used. It uses software that analyzes captured web data and generates reports that are often in PDF or .csv format.


 
 

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.


© 2015 by Sean O'Shea . Proudly created with Wix.com
