The Obsolescence of Seed Set TAR

Predictive coding, or technology-assisted review (TAR), is often regarded as a process built on a static 'seed set' that serves as the basis for categorizing a full document set. A group of documents is identified through manual review, and the software trains on that seed set, which should contain documents representing the key concepts. A QC check then looks for an acceptable 'overturn' rate, meaning a low percentage of documents that a human reviewer must re-categorize, which indicates that an effective seed set has been chosen. A report can also be prepared to identify which seed set documents lead to the most overturns and may need to be removed.
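
The overturn check described above comes down to simple arithmetic, and a minimal Python sketch may make it concrete. Everything here is illustrative: the function names, the category labels, and the idea that each QC record carries the ID of the seed document that drove the prediction are assumptions for the example, not how any particular review platform reports it.

```python
from collections import Counter


def overturn_rate(predicted, reviewed):
    """Fraction of QC documents whose machine category a human reviewer changed."""
    if len(predicted) != len(reviewed):
        raise ValueError("predicted and reviewed must have the same length")
    if not predicted:
        return 0.0
    overturns = sum(p != r for p, r in zip(predicted, reviewed))
    return overturns / len(predicted)


def overturns_by_seed(qc_records):
    """Rank seed documents by the number of overturns attributed to them.

    Each record is (seed_id, predicted_category, reviewed_category).
    The seed_id attribution is a hypothetical field for this sketch.
    """
    counts = Counter()
    for seed_id, predicted, reviewed in qc_records:
        if predicted != reviewed:
            counts[seed_id] += 1
    return counts.most_common()


# Hypothetical QC sample: two of the three overturns trace back to seed-1.
qc_sample = [
    ("seed-1", "responsive", "not_responsive"),
    ("seed-1", "responsive", "responsive"),
    ("seed-2", "not_responsive", "responsive"),
    ("seed-1", "responsive", "not_responsive"),
]
rate = overturn_rate([p for _, p, _ in qc_sample], [r for _, _, r in qc_sample])
report = overturns_by_seed(qc_sample)
```

In this made-up sample the overturn rate is 0.75, far too high to accept, and the report puts seed-1 at the top of the list of seed documents to reconsider.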


The obsolescence of this type of review, known as TAR 1.0, is evident in Relativity's decision to deprecate sample-based learning, a form of seed-set-based TAR. After September 2021, it will no longer be possible to run sample-based learning projects in Relativity.



After this September, Relativity will direct its clients to Active Learning, a TAR 2.0 review process that uses continuous active learning (CAL) to keep improving its model as manual reviewers make coding decisions.
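
To show how a continuous loop differs from a static seed set, here is a toy Python sketch of the general CAL idea: retrain on every coding decision made so far, rank the unreviewed documents, and hand the top of the ranking to the reviewer. It is not Relativity's implementation. The word-count "model", the function names, the batch size, and the choice to run until every document is reviewed are all simplifying assumptions; a real project would use a proper classifier and stop once few relevant documents are surfacing.

```python
def train(coded):
    """Toy model: each word gains +1 per responsive document and -1 per non-responsive one."""
    weights = {}
    for text, responsive in coded:
        for word in set(text.lower().split()):
            weights[word] = weights.get(word, 0) + (1 if responsive else -1)
    return weights


def score(weights, text):
    """Sum the learned weights of the distinct words in a document."""
    return sum(weights.get(word, 0) for word in set(text.lower().split()))


def continuous_active_learning(documents, reviewer, initial_coded, batch_size=1):
    """Retrain after every batch of human decisions, always reviewing top-ranked documents next."""
    coded = list(initial_coded)
    already_coded = {text for text, _ in coded}
    unreviewed = [doc for doc in documents if doc not in already_coded]
    while unreviewed:
        weights = train(coded)  # the model reflects every decision made so far
        unreviewed.sort(key=lambda doc: score(weights, doc), reverse=True)
        batch, unreviewed = unreviewed[:batch_size], unreviewed[batch_size:]
        for doc in batch:
            coded.append((doc, reviewer(doc)))  # human coding decision
    return coded


# Hypothetical collection and a stand-in "reviewer" that marks contract documents responsive.
docs = [
    "contract renewal terms",
    "lunch menu friday",
    "breach of contract claim",
    "office party invite",
]
history = continuous_active_learning(
    docs,
    reviewer=lambda doc: "contract" in doc,
    initial_coded=[("contract breach notice", True)],
)
review_order = [text for text, _ in history[1:]]
```

There is no fixed seed set to tune or prune here: the single starting example only gets the ranking going, and each later decision reshapes what the reviewer sees next. In this toy run the two contract documents reach the reviewer before the unrelated ones.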