
Today I was at the Legal Tech show at the Hilton in Manhattan. I attended a panel with Magistrate Judge Andrew Peck of the U.S. District Court for the Southern District of New York; Judge Elizabeth LaPorte of the U.S. District Court for the Northern District of California; and Magistrate Judge James Francis of the U.S. District Court for the Southern District of New York. One year ago, I attended a panel at Legal Tech involving the same judges. Tara Emory of Driven hosted the panel this year.

Judge Peck placed great emphasis on the importance of a 502(d) order to guard against the waiver of privilege when producing documents. He described it as a 'get out of jail free card'. The proposed order available on his site provides the broadest possible protection under 502(d) in paragraph 1, and paragraph 2 provides sufficient time to review for privilege. He said it was akin to malpractice not to have a 502(d) order in place.

Judge LaPorte discussed how the fear of making a mistake can lead parties to over-preserve data. If parties put certain topics off limits at a case management conference, they won't be questioned later on the failure to preserve such data.

Judge Peck stressed the importance of showing the relationship between search terms and discovery requests. There may be no way to produce a company's database, but it may be possible to export aspects of the database's content. Judge Francis suggested that a 30(b)(6) deposition might be necessary to determine how to produce a database, but Judge LaPorte preferred using a sample report to determine the scope of production rather than resorting to a deposition.

Judge Francis discussed how it might be possible for parties to cooperate on the use of search tools - even to the extent of letting the other side train a TAR system. Judge Peck noted that while there may be disclosure of collection, search, and review strategies, parties can exchange this information without being required to disclose it in future litigation.

Judge LaPorte talked about how tiered discovery involves prioritizing review of certain documents, not necessarily limiting the review. ESI agreements can specify particular custodians for review. Judge Peck said that parties should always let the court know beforehand what is happening in tiered discovery; the approach will commonly be approved and will save time, but not if the parties have to justify their steps to the court after the fact. Judge Francis explained that the tiered approach should narrow what needs to be reviewed at each successive step.

Judge Peck noted that for cross-border discovery under EU data regulations, the implementation of a hold constitutes processing. The General Data Protection Regulation provides for fines of up to 4% of gross global revenue for certain violations, and will enter into effect in 2018. Judge Peck felt that in the past, courts took the attitude that penalties assessed under prior EU regulations would be minimal or unlikely to be imposed at all. He said that parties should not use EU data privacy laws as an excuse not to make productions at all.

Judge Peck discouraged attorneys from filing 25-page motions on why discovery requests are overbroad.

In forming agreements, Judge Francis said parties should go for the low-hanging fruit first and settle the issues that are easy to agree on, in order to build momentum. Judge Peck criticized attorneys for being reluctant to involve their clients in discovery issues. Judge LaPorte recommended the discovery checklists she has for her jurisdiction. I believe she had this in mind: https://www.cand.uscourts.gov/filelibrary/466/Initial%20Discovery%20and%203rd%20Party%20Data%20Assessment%20Checklist.pdf

Judge Francis referred to Progressive Casualty Ins. Co. v. Delaney (D. Nev. May 20, 2014) as an example of how not to handle agreements on document searching. In this case, from a set of 1.8M documents, the parties' original agreement left them having to review 565K documents manually. The producing party panicked and made a unilateral decision to use TAR. Judge Francis suggested that protocols should provide for a meet and confer when a party needs to depart from the protocol itself. He thought that the court was incorrect to treat the protocol as ironclad, and believes such agreements should by their nature be flexible. Judge LaPorte found fault with the producing party in Progressive for not following its vendor's best practices.


 
 

Here's another installment of my outline of Electronic Discovery and Digital Evidence in a Nutshell, the second edition of the West Academic guide to electronic discovery law in the United States, authored by Judge Shira Scheindlin (the judge in Zubulake v. UBS Warburg) and members of the Sedona Conference. An outline of the previous chapter was posted on December 20, 2016.

CHAPTER V – SEARCH AND REVIEW OF ESI

A. SEARCH METHODS

a. Filtering on select criteria – search terms, search terms with Boolean operators, date ranges, file sizes, file types, etc. (see the filtering sketch after this outline).

b. Victor Stanley v. Creative Pipe (D. Md. 2008) – a party did not take reasonable steps to identify privileged documents where it could not identify the keywords used, the qualifications of the person who designed the search, or whether the results were analyzed to check their reliability.

c. William A. Gross Constr. v. American Mfrs. Mut. Ins. Co. (S.D.N.Y. 2009) – the court ordered the parties to meet and confer when they could not agree on keywords to search party data on a non-party's server, rather than designating keywords on its own.
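A minimal sketch in Python of the criteria-based filtering described in item a, assuming a collection already copied to a local directory. The keywords, date range, file types, and size cap are hypothetical placeholders, and a real review platform would also handle container files and encodings that this sketch ignores.

```python
import os
import re
from datetime import datetime, timezone

# Hypothetical selection criteria: keywords with a simple OR, a date
# range, allowed file types, and a maximum file size.
KEYWORDS = re.compile(r"\b(contract|invoice|settlement)\b", re.IGNORECASE)
DATE_RANGE = (datetime(2014, 1, 1, tzinfo=timezone.utc),
              datetime(2015, 12, 31, tzinfo=timezone.utc))
ALLOWED_EXTENSIONS = {".txt", ".csv", ".eml"}
MAX_BYTES = 10 * 1024 * 1024  # skip files over 10 MB

def matches_criteria(path: str) -> bool:
    """Apply file-type, size, date, and keyword filters to one file."""
    if os.path.splitext(path)[1].lower() not in ALLOWED_EXTENSIONS:
        return False
    stat = os.stat(path)
    if stat.st_size > MAX_BYTES:
        return False
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    if not (DATE_RANGE[0] <= modified <= DATE_RANGE[1]):
        return False
    with open(path, errors="ignore") as f:
        return bool(KEYWORDS.search(f.read()))

def filter_collection(root: str) -> list[str]:
    """Walk a collection directory and return files meeting all criteria."""
    hits = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if matches_criteria(path):
                hits.append(path)
    return hits
```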

B. USE OF TECHNOLOGY FOR SEARCH AND REVIEW

a. Reasonable process for searching large volumes of data:
i. Collect data from the client using a filtering process.
ii. Sample data to determine the proper scope of data to be processed vis-à-vis the likelihood of responsive documents and the corresponding costs and burden.
iii. Load into an electronic review platform for analysis.
iv. Establish selection criteria.
v. Filter data using the selection criteria.
vi. Perform review for responsiveness, privilege, and production determinations.

b. Craig Ball – Step by Step of Smart Search
i. Start with the Request for Production.
ii. Seek input from key players – what event triggered the conversation?
iii. Look at what you’ve got and the tools you’ll use. The search tool should be able to review the content of container files and identify exceptional files that can’t be searched.
iv. Communicate and collaborate – tell the other side the tools and terms you’re using. Do two rounds of keyword search, giving the other party the chance to review the first production before proposing additional searches.
v. Incorporate misspellings, variants, and synonyms.
1. Fuzzy searching.
vi. Filter and deduplicate first.
vii. Test, test, test – false hits may turn up in system files. Test keywords against data that is clearly irrelevant.
viii. Review the hits – preview search hits in context in a spreadsheet with 20-30 words on each side of the hit (see the sketch after this outline).
ix. Tweak queries and retest – achieve better precision without affecting recall.
x. Check the discards – examine closely when refining queries; more random sampling later.

c. Search Tips
i. Tier ESI and have keywords for each tier.
ii. When searching email for recipients, search by the email address, not the name.
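A minimal sketch of the hit-preview step in item b.viii: pulling each keyword hit with roughly 20-30 words of surrounding context into a spreadsheet-friendly CSV. The term list and CSV layout are hypothetical, and the sketch assumes single-word terms.

```python
import csv
import re

CONTEXT_WORDS = 25  # words kept on each side of a hit

def keyword_hits_in_context(text: str, term: str) -> list[str]:
    """Return each hit on `term` with CONTEXT_WORDS words on either side."""
    words = text.split()
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    previews = []
    for i, word in enumerate(words):
        if pattern.search(word):
            start = max(0, i - CONTEXT_WORDS)
            end = min(len(words), i + CONTEXT_WORDS + 1)
            previews.append(" ".join(words[start:end]))
    return previews

def write_preview_sheet(documents: dict[str, str], terms: list[str],
                        out_path: str) -> None:
    """Write one row per hit: document name, term, and context window."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["document", "term", "context"])
        for name, text in documents.items():
            for term in terms:
                for preview in keyword_hits_in_context(text, term):
                    writer.writerow([name, term, preview])
```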

C. TECHNOLOGY ASSISTED REVIEW

a. How does it work?
i. A computerized system harnesses the judgments of subject matter experts on a smaller set of documents, then extrapolates those judgments to the remaining document collection.
ii. Knowledge engineering – a rule-based or linguistic approach. Captures the expertise of a SME. Rules discriminate between responsive and non-responsive documents.
iii. Machine learning – predictive coding – needs a training set of responsive and non-responsive documents. Can be passive or active.
1. Passive learning – an expert selects training documents using judgmental or random selection.
2. Active learning – after the SME selects a seed set, the computer selects documents for the SME to review and add to the training set. It finds the documents whose relevance is least certain.
iv. Grossman-Cormack TREC study. Used F1 scores – the harmonic mean of recall and precision – an average weighted toward the lower of the two measures (see the worked example after this outline). The two best TAR teams did twice as well as the human judges.

b. Have Courts Accepted TAR as a Search Methodology?
i. Da Silva Moore v. Publicis Groupe (S.D.N.Y. 2012) – use of predictive coding appropriate given:
1. The parties' agreement.
2. A large amount of ESI – more than 3M documents.
3. Superiority to manual review or keyword searches.
4. The need for cost effectiveness and proportionality under Rule 26(b).
5. A transparent process.
Staging discovery by the most relevant sources is a good way to control costs, and most judges are willing to grant a discovery extension to allow for staging. Parties should consider 'strategic proactive disclosure of information' – tell the other side who your key custodians are. It is helpful to have ediscovery vendors at hearings.
ii. FHFA v. HSBC (S.D.N.Y. 2014) – a party was permitted to use predictive coding even though the opposing party showed that a document produced in a parallel case had not been produced. No one should expect perfection from the document review process.

c. Can Responding Parties Be Required to Use TAR?
i. Kleen Prods. v. Packaging Corp. (N.D. Ill. 2012) – under Sedona Principle 6, parties are best situated to evaluate procedures to preserve and produce their own ESI. The court urged the parties to come up with a way to refine their Boolean search rather than ordering a TAR search.
ii. EORHB v. HOA Holdings (Del. Ch. 2012) – the court ordered both parties to use TAR sua sponte, but later withdrew the order.

d. How Involved Should the Court Get in the TAR Process?
i. Most of the courts that have addressed TAR issues have either entered orders stipulated to by the parties, advised the parties to negotiate a protocol, or allowed a party to proceed with its own protocol, without prejudice to the opposing party challenging the adequacy of the results.
ii. Independent Living Ctr. of S. Cal. v. City of Los Angeles (C.D. Cal. 2014) – should a predictive coding protocol include a quality assurance phase? The plaintiff could insist it be performed if it paid one half of the costs.
iii. 'Judicial Modesty: Not an Oxymoron' – Judge Francis's case for restraint in the electronic age: parties are entitled to use TAR, but whether the tool will produce reliable results is something judges are ill-equipped to determine. Judges should still try to resolve e-discovery issues at pretrial conferences, phase discovery, and order sampling.
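A worked example of the F1 measure described in item a.iv, using made-up review counts. Because F1 is a harmonic mean, it is pulled toward the lower of recall and precision, which is the weighting the outline refers to.

```python
def f1_score(true_positives: int, false_positives: int,
             false_negatives: int) -> float:
    """F1 = harmonic mean of recall and precision."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

# Hypothetical review: 800 responsive documents found, 200 false hits,
# 400 responsive documents missed.
# precision = 800 / (800 + 200) = 0.80
# recall    = 800 / (800 + 400) ~ 0.667
print(f1_score(800, 200, 400))  # ~0.727, pulled toward the lower measure
```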

D. NEGOTIATION AND AGREEMENT ON SEARCH PROTOCOLS

a. Paul/Baron Information Inflation Process:
i. Meet and confer to identify methods to narrow the scope of searches.
ii. Conduct initial searches on the meet and confer parameters.
iii. Share initial search results and adjust the parameters.
iv. Repeat the process in an iterative fashion until a mutually agreed time or a mutually agreed cap on responsive documents is reached.

b. In re Biomet Hip Implant (N.D. Ind. 2013) – Biomet, through keyword culling and de-duping, narrowed the documents from 19M to 2.5M, then used predictive coding on the set of 2.5M. The plaintiffs moved for predictive coding on all 19M documents. The court ruled that the burden and expense to Biomet outweighed the likely benefits. The court also refused to order disclosure of the seed set.

c. I-Med Pharma v. Biomatrix (D.N.J. 2011) – a party was relieved of the obligation to produce documents recovered from unallocated space after it had searched all data on a system without limiting the search to certain custodians or time periods.

d. Progressive Cas. v. Delaney (D. Nev. 2014) – after a search per the parties' agreement, with search terms finding 565K documents in a 1.8M document set, the party was not allowed to use TAR to reduce the set to 55K documents, and had to produce all of the original hits.

e. Edwards v. National Milk (N.D. Cal. 2013) – Model Joint Stipulation and Order
i. Definitions
1. Document Review Corpus – the documents remaining after exclusion of certain document types, dupes, system files, and documents outside of the date range.
2. Confidence Level – the likelihood that the sampling is accurate.
3. Estimation Interval – the statistical error rate of a measured confidence level.
ii. Protocol
1. Document Collection.
2. Control Set – a random, statistically valid sampling of documents to estimate the number of responsive documents in the corpus. The control set sample shall be determined using a 95% confidence level and a 2% estimation interval (see the sample-size sketch after this outline).
3. Seed Set and Initial Training – the seed set includes responsive documents identified by the party and found with Boolean search terms.
4. Iterative Review and Further Training.
5. Review – review and iterative training will proceed until all identified documents have been reviewed and the system indicates that the remainder of the document review corpus is not likely to be responsive.
6. Validation – perform a validation test by reviewing a statistically valid and random sampling of unreviewed documents to confirm that the number of potentially responsive documents in the unreviewed corpus is statistically insignificant. The review process will continue until the validation test achieves a 1% or less responsiveness rate.
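A minimal sketch of the standard sample-size calculation behind a control set like the one in item e.ii.2, using the normal approximation and the conservative worst-case assumption of a 50% responsiveness rate. The function name is mine, not from the model order.

```python
def control_set_size(z: float = 1.96,        # z-score for 95% confidence
                     interval: float = 0.02,  # +/- 2% estimation interval
                     p: float = 0.5) -> int:
    """Sample size for estimating a proportion to within `interval`
    at the confidence level implied by `z` (normal approximation)."""
    return round(z ** 2 * p * (1 - p) / interval ** 2)

# The worst-case responsiveness rate (p = 0.5) at 95% confidence and a
# 2% interval gives a control set of about 2,401 documents, regardless
# of the overall size of the review corpus.
print(control_set_size())  # 2401
```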

E. VALIDATION OF SEARCH EFFORTS

a. A party must conduct a reasonable search, the cost and effectiveness of which meet the proportionality requirement of Rule 26(g).

b. Recall is more important than precision. Precision is only a factor if it is very low – as in the case of a data dump.

c. Recall is difficult to measure. For many matters, it is neither feasible nor necessary to expend the effort to estimate recall with a margin of error of ± 5% at a 95% confidence level – the standard required for scientific publication (see the sketch after this outline).

d. Per Grossman-Cormack, even a perfect review by one expert, with a recall of 100%, is unlikely to achieve a measured recall above 70% when the measurement is performed by a second expert.
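A minimal sketch of what the ±5% margin at a 95% confidence level in item c means in practice: a normal-approximation confidence interval for recall, estimated from a random sample of known responsive documents checked against the production. The sample counts are hypothetical.

```python
import math

def recall_interval(found: int, sampled: int,
                    z: float = 1.96) -> tuple[float, float, float]:
    """Point estimate and 95% CI for recall, from a random sample of
    responsive documents checked against the production."""
    recall = found / sampled
    margin = z * math.sqrt(recall * (1 - recall) / sampled)
    return recall, recall - margin, recall + margin

# Hypothetical validation: of 385 sampled responsive documents, 300
# were located in the production.
r, lo, hi = recall_interval(300, 385)
print(f"recall ~= {r:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")  # ~0.78 +/- 0.04
```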


 
 

Here's another installment of my outline of Electronic Discovery and Digital Evidence in a Nutshell, the second edition of the West Academic guide to electronic discovery law in the United States, authored by Judge Shira Scheindlin (the judge in Zubulake v. UBS Warburg) and members of the Sedona Conference. Tonight's outline covers Chapter IV on the Collection of ESI. An outline of the previous chapter was posted on December 6, 2016.

IV. Collection of ESI

A. Searching All Appropriate Sources

- Relevant ESI may be preserved in place, but not necessarily physically collected.

- Peskoff v. Faber (D.D.C. 2006) - an adequate search is not performed by collecting emails from a hard drive alone. A party must also:

1. collect emails from the email account.

2. collect emails from the inboxes and sent items of other employees.

3. search for keywords in all files on a hard drive.

4. consider that emails may be recoverable from slack space.

- The obligation to search all emails does not mean that they all have to be reviewed. i-Med Pharma v. Biomatrix (D.N.J. 2011) - broad search terms found 95M pages of data in unallocated file space. The court relieved the party of the burden to review because:

1. the burden outweighed the value of the data.

2. there was no evidence that the data was relevant.

3. the sheer number of results.

B. Custodian Based Collection

- A custodian may not be aware of shared drives, cloud-based applications, or informal back-up media. IT staff may be more adept at searching and extracting ESI, but have no knowledge of its content.

- BreathableBaby v. Crown Crafts (D. Minn. 2013) - the court ordered discovery reopened after a party refused to search by the agreed-upon keywords and instead instructed custodians to search 'everything'.

- Nat'l Day Laborer Org. v. U.S. ICE (S.D.N.Y. 2012) - a re-do of the search was ordered where government agencies responding to a FOIA request didn't record the search terms used, how they were combined, or whether the full text of documents was searched.

- Procaps v. Patheon (S.D. Fla. 2014) - the court granted Patheon’s motion for a forensic analysis of Procaps’ electronic media because Procaps permitted its personnel to self-search for ESI without ever seeing Patheon’s discovery requests or receiving a list of search terms from counsel.

C. Role of Outside Counsel

- case law does not reject custodian-based collection, but instead emphasizes the importance of counsel's supervision of the process.

- Phoenix Four v. Strategic Res. Corp. (S.D.N.Y. 2006) - counsel sanctioned when unproduced relevant ESI was discovered on a partitioned section of a server.

- Qualcomm v. Broadcom (S.D. Cal. 2008) - witness testimony at trial revealed that highly relevant emails were not produced. Lawyers were sanctioned when 46K emails and documents were not produced with no substantial justification for the failure. Qualcomm did not establish that it searched computers and databases; after the trial, Qualcomm did not conduct an internal investigation; an organization has an obligation to confirm that the person testifying as the most knowledgeable on a subject actually has that knowledge. The court found it likely that the lawyers chose not to look in the correct locations and did not press Qualcomm employees for the truth. The sanctions against the attorneys were later vacated, but not those against Qualcomm.

D. Forensic Collection

- Unsupported suspicion that responsive ESI may be missing or had been tampered with is not sufficient justification to require forensic collection.

- “mere skepticism that an opposing party has not produced all relevant information is not sufficient to warrant” such measures. John B. v. Goetz (6th Cir. 2008).

- Forensic collection warranted when:

1. A file has been deleted but recovery is possible.

2. ESI may have been tampered with.

3. It is important to show ESI usage activity and patterns.

4. It is necessary to authenticate a file to show that the represented time of creation is accurate.

- Ameriwood v. Liberman (E.D. Mo. 2006) - a mirror image was ordered of a computer alleged to have been used to commit the wrong.

E. Collecting Data from Nonparty Hosts

1. Websites and Social Media

- individuals may erroneously believe that the privacy settings on their social media accounts mean that the information is immune from discovery.

- EEOC v. Simply Storage (S.D. Ind. 2010) - scope of production of SNS (social networking sites):

a. SNS content is not immune because it is locked or private.

b. SNS content must be produced when relevant to a claim or defense, but not everything must be disclosed.

c. “allegations of depression, stress disorders, and like injuries do not automatically render all SNS communications relevant.”

- Romano v. Steelcase (Sup. Ct. Suffolk Co. 2010) - no reasonable expectation of privacy in SNS.

- Facebook Procedures for Parties to Produce FB Accounts:

a. The Stored Communications Act prevents private parties from obtaining account contents with subpoenas. Parties can satisfy discovery requests by using the 'Download Your Information' tool. FB may attempt to restore access to deactivated accounts, but it can't recover deleted content.

b. FB may provide basic subscriber information when it is indispensable to a case and not within a party's possession, upon service of a valid federal or California subpoena.

2. Mobile Devices

- Technicians must temporarily take possession of devices to image the data. Problems stemming from this can be avoided with Company Owned, Personally Enabled ('COPE') policies that allow data to be accessed remotely by the employer for preservation and collection.

3. Cloud Computing

- online remote data storage - reducing the need for massive servers.

- online remote backup - reducing need for large collections of backup media.

- cloud based applications - reducing the need for technical support staff.

F. Data Collection Checklist

1. Initial Steps

- identify types of records likely to be relevant to claims & defenses.

- identify custodians likely to possess or have knowledge of these records.

- identify information services personnel who can locate custodial and non-custodial sources.

- identify email admins; hardware support, system maintenance, and back-up tape personnel; and application & data admins.

- identify 3rd party service providers with data in custody & control.

2. Investigating Custodians

- Ask about hardware, mobile devices, email, productivity software, office management software, instant messaging, text messaging, collaborative software, voicemail, databases, server/mainframe applications, internet browsers, publicly available third party platforms (e.g. Facebook, Linkedin), organization-supported platforms (e.g. Yammer or internally created with SharePoint), and blogs.

- retention of final and draft documents on drives or servers.

- retention of downloaded files

- retention of files on mobile device

- use of removable media

- use of third party storage sites (e.g. Dropbox).

3. Investigating the Hardware Environment

- determine on the enterprise level where records are: use of desktops, laptops, tablets; networks with servers; use of mainframes; use of mobile devices; use of removable media; and use of voicemail systems.

4. Investigating Backup Systems and Archives

- existence of backup media

- frequency of data backups

- schedules for overwriting backup media

- locations of backup media

- process to restore backup media

- backup media solely for disaster recovery.

- existence of archived historical data.

- retention periods for archived historical data.

5. Investigating Applications and Databases

- CAD applications replacing blueprints

- quality assurance applications; financial records; supplier bidding and purchasing applications; product distribution and sales databases; litigation-related databases; government relations databases; etc.

6. Investigating Electronic Communication Systems

- overview of systems structure and capabilities

- volume of traffic

- maintenance and retention of message logs.

- retention period for unread messages.

- autodelete settings.

- frequency of overwriting deleted items.

- shared systems with service providers, suppliers or corporate family.

- policies regarding storage of email


 
 

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.
