Last month the EDRM published its Technology Assisted Review (TAR) Guidelines, which it developed with Duke Law School.
Here's a brief outline of the 50-page guidelines:
I. Defining Technology Assisted Review
A. The TAR Process
1. Assemble a team: Service provider; Software provider; Workflow expert; Case manager; Lead attorney; Human reviewers
2. Collection and Analysis
a. Algorithm analyzes the relationship between words and characters.
3. Train the Computer to Predict Relevancy
a. Synthetic documents may be used.
b. TAR software may find documents to be used to classify others.
4. Quality Control and Testing
a. Identify relevant documents and then see whether TAR finds them.
5. Training Completion and Validation
a. Rank documents according to relevancy.
b. TAR 1.0 - software trained on a random subset of relevant and non-relevant documents selected at the beginning and then used on the remaining unreviewed documents.
c. TAR 2.0 - From the outset, the software continuously analyzes the entire document collection and ranks the population based on relevancy. Human coding decisions are submitted to the software, the software re-ranks the documents, and then presents back to the human reviewer additional documents for review that it predicts as most likely relevant.
d. Review stops when:
i. a certain recall rate is reached;
ii. the software returns only non-relevant documents; or
iii. a certain number of relevant documents has been found.
e. Recall - the percentage of all relevant documents in the collection that the review has found.
f. Precision - the percentage of documents TAR identifies as relevant that are actually relevant.
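The TAR 2.0 loop and the two metrics above can be sketched in a few lines of Python. This is a toy simulation, not any vendor's software: the collection, the relevance labels, and the noisy scoring function are all invented for illustration.

```python
import random

random.seed(7)

# Toy collection: (doc_id, is_relevant); labels are invented.
collection = [(i, random.random() < 0.2) for i in range(1000)]

def rank(unreviewed):
    """Stand-in for the software's relevance model: noisy scores that
    correlate with the true label, so ranking is good but imperfect."""
    return sorted(unreviewed,
                  key=lambda d: d[1] + random.gauss(0, 0.4),
                  reverse=True)

reviewed, unreviewed = [], list(collection)
batch_size = 50

# TAR 2.0 loop: review the top-ranked batch, feed the coding decisions
# back, re-rank the remainder, and stop when a batch yields no relevant
# documents (stopping criterion ii above).
while unreviewed:
    batch = rank(unreviewed)[:batch_size]
    reviewed.extend(batch)
    batch_ids = {d[0] for d in batch}
    unreviewed = [d for d in unreviewed if d[0] not in batch_ids]
    if not any(d[1] for d in batch):
        break

found = sum(1 for d in reviewed if d[1])
total_relevant = sum(1 for d in collection if d[1])
recall = found / total_relevant        # share of all relevant docs found
precision = found / len(reviewed)      # share of reviewed docs actually relevant
print(f"recall={recall:.2f}  precision={precision:.2f}")
```

Because the simulated ranking is imperfect, stopping when a batch comes back empty can leave some relevant documents behind, which is exactly why the guidelines pair stopping criteria with validation.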
II. TAR Workflow
A. Foundational Concepts & Understandings
a. Feature extraction algorithms - identify a document's content features and relate it to other documents.
b. Supervised machine learning algorithms - human reviewer trains software to recognize relevance.
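These two concepts can be illustrated with a minimal sketch, assuming a bag-of-words feature extractor and a simple centroid-based classifier standing in for the supervised learner; the sample documents and labels are invented.

```python
from collections import Counter
import math

def features(text):
    """Feature extraction: reduce a document to word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(docs):
    total = Counter()
    for d in docs:
        total.update(features(d))
    return total

# Supervised learning: a human reviewer labels a few training documents...
relevant_train = ["merger agreement draft terms",
                  "board approval of merger terms"]
nonrelevant_train = ["office picnic schedule",
                     "parking lot repaving notice"]

rel_c, non_c = centroid(relevant_train), centroid(nonrelevant_train)

# ...and the model classifies an unreviewed document by which labeled
# centroid its features sit closer to.
def predict(text):
    f = features(text)
    return cosine(f, rel_c) > cosine(f, non_c)

print(predict("revised merger agreement"))   # → True
print(predict("picnic parking directions"))  # → False
```

Production TAR tools use far richer features and models, but the division of labor is the same: the algorithm extracts features, and human coding decisions teach it what relevance looks like.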
B. TAR Workflow
1. Select the Service and Software Provider
a. Have they provided affidavits in support of their workflow for past cases?
b. Do they have an expert that can discuss TAR with the opposing parties or the Court?
c. Will rolling productions affect the workflow?
2. Identify, Analyze, and Prepare the TAR Set
a. Culling criteria based on file types; custodians; date ranges; and search terms.
b. Index is based not on native files but on extracted text.
3. Human Reviewer Prepares for Engaging in TAR
a. A team of 15 human reviewers may produce more accurate results than two lead attorneys.
b. Software may allow the use of more than one relevance topic tag.
4. Human Reviewer Trains Computer to Detect Relevancy, and the Computer Classifies the TAR Set
a. Training sets created by random sampling may have to be larger than those formed by other methods.
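One reason random training sets tend to be larger is purely statistical: a random sample must reach a certain size before estimates drawn from it are reliable. The standard sample-size formula (a normal approximation; not taken from the guidelines) gives that floor:

```python
import math

def sample_size(z, margin, p=0.5):
    """Documents to sample so a prevalence estimate falls within
    `margin` at the confidence level implied by z-score `z`
    (normal approximation; p=0.5 is the worst case)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

# 95% confidence (z ≈ 1.96), ±5% margin of error:
print(sample_size(1.96, 0.05))   # → 385

# Tightening the margin to ±2% grows the sample quadratically:
print(sample_size(1.96, 0.02))   # → 2401
```

A judgmentally selected training set can be smaller because it is chosen for teaching value rather than statistical representativeness, though it then cannot double as a validation sample.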
5. Implement Review Quality Control Measures
a. Decision Log - record of relevancy decisions.
b. Sampling - human reviewers' decisions checked by lead attorney.
c. Reports - identify documents where coders disagree on relevancy.
6. Determine When Computer Training is Complete and Validate
a. Training Completion based on:
i. Sample-based Effectiveness Estimates
ii. Observing sparseness of relevant documents returned during active learning.
iii. Compare different predictive model behaviors.
iv. Compare TAR 1.0 and TAR 2.0 processes.
b. Consider Rule 26(b) proportionality when setting the target recall level.
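One common sample-based effectiveness estimate is an elusion test: draw a random sample from the discard pile, see how many relevant documents it turns up, and derive an estimated recall. The arithmetic is standard; the counts below are invented for illustration.

```python
# Elusion-based recall estimate (illustrative numbers only).
found_relevant = 9_000        # relevant docs identified by the review
discard_pile = 100_000        # docs TAR predicted non-relevant
sample_drawn = 500            # random sample from the discard pile
relevant_in_sample = 5        # relevant docs the sample turned up

elusion = relevant_in_sample / sample_drawn           # est. richness of discards
missed = elusion * discard_pile                       # est. relevant docs missed
recall = found_relevant / (found_relevant + missed)   # point estimate
print(f"estimated recall: {recall:.1%}")   # → 90.0%
```

A real validation would pair this point estimate with a confidence interval, since a 500-document sample leaves meaningful uncertainty about the true elusion rate.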
7. Final Identification, Review and Production of the Predicted Relevant Set
a. Separate review of documents with only numbers or illegible text.
b. Address privilege, need for redaction, and other issues.
8. Workflow Issue Spotting
a. Extremely low or high richness may indicate TAR is not appropriate.
III. Alternate Tasks for Applying TAR
A. Early Case Assessment
1. Find ESI that warrants closer review.
B. Prioritization for Review
C. Categorization by Issues
D. Privilege Review
E. Review of Incoming Productions
1. Especially data dumps.
F. Deposition and Trial Preparation
G. Information Governance and Data Disposition
1. Find data subject to retention policy.
2. Find data for defensible deletion.
3. Segregate PII.
IV. Factors to Consider When Deciding Whether or Not to Use TAR
A. Should the Legal Team Use TAR?
1. Are the documents themselves appropriate? TAR does not work well with:
a. Exports from structured databases.
b. Audio/video/image files
c. Hard copies with poor OCR.
2. Is the cost and use reasonable?
B. Cost of TAR vs. Linear Review
1. Document review usually accounts for 60-70% of discovery costs.
2. QC and privilege review may still be expensive when TAR is used.
Later this year the EDRM will publish its best practices for technology assisted review, which will explain the situations in which it is best to use TAR.