LITIGATION SUPPORT TIP OF THE NIGHT
The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer. All content provided on this blog is for informational purposes only. The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information. This policy is subject to change at any time. The owner is not an attorney, and nothing posted on this site should be construed as legal advice. Litigation Support Tip of the Night does not provide confirmation that any e-discovery technique or conduct is compliant with legal, regulatory, contractual or ethical requirements.
New tips for paralegals and litigation support professionals are posted to this site each week. Click on the blog headings for better detail.
- Feb 8, 2019
Last month the EDRM published its Technology Assisted Review (TAR) Guidelines which it developed with Duke Law School.
Here's a brief outline of the 50-page guidelines:
I. Defining Technology Assisted Review
A. The TAR Process
1. Assemble a team: Service provider; Software provider; Workflow expert; Case manager; Lead attorney; Human reviewers
2. Collection and Analysis
a. Algorithm analyzes the relationship between words and characters.
3. Train the Computer to Predict Relevancy
a. Synthetic documents may be used.
b. TAR software may find documents to be used to classify others.
4. Quality Control and Testing
a. Identify relevant documents and then see if TAR finds them.
5. Training Completion and Validation
a. Rank documents according to relevancy.
b. TAR 1.0 - software trained on a random subset of relevant and non-relevant documents selected at the beginning and then used on the remaining unreviewed documents.
c. TAR 2.0 - From the outset, the software continuously analyzes the entire document collection and ranks the population based on relevancy. Human coding decisions are submitted to the software, the software re-ranks the documents, and then presents back to the human reviewer additional documents for review that it predicts as most likely relevant.
d. Review stops when:
i. a certain recall rate is reached.
ii. the software returns only non-relevant documents.
iii. a certain number of relevant documents has been found.
e. Recall - percentage of relevant documents found.
f. Precision - percentage of actually relevant documents in set determined by TAR to be relevant.
II. TAR Workflow
A. Foundational Concepts & Understandings
1. Algorithms
a. Feature extraction algorithms - document content identified and related to other documents.
b. Supervised machine learning algorithms - human reviewer trains software to recognize relevance.
B. TAR Workflow
1. Select the Service and Software Provider
a. Have they provided affidavits in support of their workflow for past cases?
b. Do they have an expert that can discuss TAR with the opposing parties or the Court?
c. Will rolling productions affect the workflow?
2. Identify, Analyze, and Prepare the TAR Set
a. Culling criteria based on file types; custodians; date ranges; and search terms.
c. Index is based not on native files but on extracted text.
3. Human Reviewer Prepares for Engaging in TAR
a. A team of 15 human reviewers may produce more accurate results than two lead attorneys.
b. Software may allow the use of more than one relevance topic tag.
4. Human Reviewer Trains Computer to Detect Relevancy, and the Computer Classifies the TAR Set
a. Training sets created by random sampling may have to be larger than those formed by other methods.
5. Implement Review Quality Control Measures
a. Decision Log - record of relevancy decisions.
b. Sampling - human reviewers' decisions checked by lead attorney.
c. Reports - where coders disagree on relevancy
6. Determine When Computer Training is Complete and Validate
a. Training Completion based on:
i. Sample-based Effectiveness Estimates
ii. Observing sparseness of relevant documents returned during active learning.
iii. Compare different predictive model behaviors.
iv. Compare TAR 1.0 and TAR 2.0 processes.
b. Validation
i. Consider Rule 26(b) proportionality considerations when setting target recall level.
7. Final Identification, Review and Production of the Predicted Relevant Set
a. Separate review of documents with only numbers or illegible text.
b. Address privilege, need for redaction, and other issues.
8. Workflow Issue Spotting
a. Extremely low or high richness may indicate TAR is not appropriate.
III. Alternate Tasks for Applying TAR
A. Early Case Assessment
1. Find ESI that is needed for closer review.
B. Prioritization for Review
C. Categorization by Issues
D. Privilege Review
E. Review of Incoming Productions
1. Especially data dumps.
F. Deposition and Trial Preparation
G. Information Governance and Data Disposition
1. Find data subject to retention policy.
2. Find data for defensible deletion.
3. Segregate PII.
IV. Factors to Consider When Deciding Whether or Not to Use TAR
A. Should the Legal Team Use TAR?
1. Are the documents themselves appropriate? TAR does not work well with:
a. Exports from structured databases.
b. Audio/video/image files
c. Hard copies with poor OCR.
2. Is the cost and use reasonable?
B. Cost of TAR vs. Linear Review
1. Document review usually accounts for 60-70% of discovery costs.
2. QC and privilege review may still be expensive when TAR is used.
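The recall and precision definitions in I.A.5.e and I.A.5.f can be made concrete with a little arithmetic. The following is only a minimal sketch: the function names and the document counts are hypothetical and do not come from the EDRM guidelines.

```python
def recall(found_relevant: int, total_relevant: int) -> float:
    """Percentage (as a fraction) of all relevant documents that were found."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return found_relevant / total_relevant


def precision(found_relevant: int, predicted_relevant: int) -> float:
    """Fraction of the documents TAR called relevant that actually are relevant."""
    if predicted_relevant <= 0:
        raise ValueError("predicted_relevant must be positive")
    return found_relevant / predicted_relevant


# Hypothetical numbers for illustration only.
total_relevant = 1000       # relevant documents in the whole collection
predicted_relevant = 1200   # documents TAR determined to be relevant
found_relevant = 800        # of those 1200, the ones that really are relevant

r = recall(found_relevant, total_relevant)
p = precision(found_relevant, predicted_relevant)
print(f"recall = {r:.1%}, precision = {p:.1%}")
```

In practice the total number of relevant documents is not known in advance, so recall is usually estimated from a sample rather than computed directly as it is here.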
Later this year the EDRM will publish its best practices for technology assisted review which will explain in what situations it is best to use TAR.
Today I participated in a webinar hosted by ACEDS and conducted by Thomas Gricks and Jeremy Pickens of Catalyst, entitled, Just Say No to Family Batching in Technology-Assisted Review. Gricks and Pickens are the authors, along with Andrew Bye, of a paper entitled, Break up the Family: Protocols for Efficient Recall-Oriented Retrieval Under Legally-Necessitated Dual Constraints. Gricks, et al. challenge the standard notion that since families of documents are produced together they should also be reviewed together. The authors advocate a 'broken family' review protocol, and 'dual phase workflow' with an initial expedited review for relevancy.
TAR algorithms will be more effective when trained with individual documents rather than complete families. Catalyst employed a continuous active learning protocol. A Full Family continuous active learning protocol will pull all documents in a family into the review queue irrespective of whether or not they are highly ranked. This is the approach favored by most attorneys.
In a Positive Family protocol, any time one document from a family is found to be relevant, any documents from the same family found to be non-relevant are not re-reviewed.
In an Individual Padded continuous active learning protocol, once a relevancy level is determined for any one document in a family, the rest of the documents in the family are added to the review queue.
Catalyst drew on results from eight different e-discovery projects. Its study shows how much additional review is needed to achieve recall rates of 75% or 90%. Positive Family and Individual Padded continuous active learning are shown to be clearly more efficient than Full Family review.
Phased continuous active learning involves reviewing documents first only for relevance, and removing the entire family from the queue when any one document has been determined to be relevant. In the second phase every document family with a relevant document is reviewed for both relevancy and privilege. This approach is superior to both Full Family and Individual Padded continuous active learning.
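To see why pulling whole families into the queue can add review effort, here is a toy sketch that compares a document-by-document queue with a Full Family queue. It is a simplification under stated assumptions: the ranking is fixed rather than continuously re-ranked, and the document IDs, family assignments, relevance labels and function names are all hypothetical. It does not reproduce Catalyst's study or its figures.

```python
import math
from collections import defaultdict


def full_family_order(ranked_ids, family_of):
    """When a document comes up in the ranking, queue its whole family with it."""
    members = defaultdict(list)
    for doc_id in ranked_ids:
        members[family_of[doc_id]].append(doc_id)
    queue, queued_families = [], set()
    for doc_id in ranked_ids:
        family = family_of[doc_id]
        if family not in queued_families:
            queued_families.add(family)
            queue.extend(members[family])
    return queue


def docs_reviewed_to_reach(queue, relevant, target_recall):
    """Count how many documents are reviewed before the recall target is met."""
    needed = math.ceil(target_recall * len(relevant))
    found = 0
    for reviewed, doc_id in enumerate(queue, start=1):
        if doc_id in relevant:
            found += 1
        if found >= needed:
            return reviewed
    return None  # target not reachable with this queue


# Hypothetical toy collection: emails (e*) and attachments (a*), ranked by score.
ranked = ["e1", "e2", "a1", "e3", "a2", "a3", "e4", "a4"]
family_of = {"e1": "F1", "a1": "F1", "a4": "F1", "e2": "F2", "a2": "F2",
             "e3": "F3", "a3": "F3", "e4": "F4"}
relevant = {"e1", "e2", "e3", "e4"}

individual_effort = docs_reviewed_to_reach(ranked, relevant, 0.75)
family_effort = docs_reviewed_to_reach(
    full_family_order(ranked, family_of), relevant, 0.75)
print("individual:", individual_effort, "full family:", family_effort)
```

In this made-up collection the low-ranked attachments ride along with their parent emails, so the Full Family queue needs more documents reviewed to reach the same recall target. Real results depend on the collection and on how the software re-ranks as coding decisions come in.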