
TAR for Smart People Outline - Chapter 9


Here's another installment in my outline of John Tredennick's 'TAR for Smart People'. I last posted an installment on December 18, 2016. Tonight's installment is on Chapter 9 - Comparing Active Learning to Random Sampling - Using Zipf's Law to Evaluate Which is More Effective for TAR.


A. Schieneman-Gricks Study vs. Grossman-Cormack Study

1. Judgmental seeds (selected via continuous active learning and contextual diversity) are superior to random seeds, according to Catalyst.

2. Cormack-Grossman and Ralph Losey believe random sampling is not as effective.

3. OrcaTec - thinks random sampling leads to bias.

4. Contextual Diversity models the entire document population.

B. What is Contextual Diversity?

1. Contextual Diversity Algorithm identifies documents based on how different they are from ones already seen.

C. Contextual Diversity: Explicitly Modeling the Unknown

1. The algorithm will select the document containing the highest percentage of terms that are not included in documents already reviewed.
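The selection rule in this section can be sketched in a few lines of Python. This is only a toy illustration, not Catalyst's actual algorithm: the whitespace tokenizer, the function names `unseen_term_fraction` and `pick_most_diverse`, and the sample documents are all my own assumptions.

```python
def unseen_term_fraction(doc_terms, seen_terms):
    """Fraction of a document's distinct terms that no reviewed document contains."""
    if not doc_terms:
        return 0.0
    return len(doc_terms - seen_terms) / len(doc_terms)


def pick_most_diverse(unreviewed, reviewed):
    """Return the id of the unreviewed document with the highest share of unseen terms.

    Hypothetical helper: both arguments are dicts mapping a document id to its text.
    """
    # Collect every term that appears in a document already reviewed.
    seen_terms = set()
    for text in reviewed.values():
        seen_terms.update(text.lower().split())

    # Score each unreviewed document by how much of it is new.
    scores = {
        doc_id: unseen_term_fraction(set(text.lower().split()), seen_terms)
        for doc_id, text in unreviewed.items()
    }
    return max(scores, key=scores.get)


reviewed = {"r1": "merger pricing memo", "r2": "pricing schedule draft"}
unreviewed = {
    "u1": "pricing memo draft",            # no new terms
    "u2": "merger board minutes",          # two of three terms are new
    "u3": "patent license royalty terms",  # every term is new
}
choice = pick_most_diverse(unreviewed, reviewed)
print(choice)
```

In this made-up set, the document about patent licensing is picked because none of its terms appear in the reviewed documents. A real system would presumably use a more careful notion of terms and of difference than simple word overlap.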

D. Zipf's Law

1. You can expect the most frequent word in a large population to be twice as frequent as the second most common word, three times as frequent as the third most common word, and so on.

2. The diagram below depicts random sampling. Each bubble is a subtopic in the document set. The grid shows that random sampling does not necessarily draw a sample from every subtopic.

3. This diagram depicts the contextual diversity approach. The red dots are seed documents selected from each of the topics.

The subtopics which are covered by large groups of documents are not oversampled.
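To put rough numbers on the points in this section, here is a small Python sketch. It rests on assumptions of mine that are not from the chapter: that subtopic sizes follow the same 1, 1/2, 1/3, ... pattern that Zipf's Law describes for word frequencies, that there are 20 subtopics, and that the random sample is 50 documents drawn with replacement. The function names are hypothetical.

```python
def zipf_shares(n_topics):
    """Share of the population in each topic when the k-th largest is 1/k the size of the largest."""
    weights = [1.0 / rank for rank in range(1, n_topics + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def miss_probability(share, sample_size):
    """Chance that a simple random sample (with replacement) contains nothing from a topic."""
    return (1.0 - share) ** sample_size


shares = zipf_shares(20)

# Zipf's Law: the top item is twice as frequent as the second, three times the third.
ratio_first_to_second = shares[0] / shares[1]
ratio_first_to_third = shares[0] / shares[2]

# Expected number of the 20 topics that a 50-document random sample never touches.
expected_missed = sum(miss_probability(s, 50) for s in shares)

print(ratio_first_to_second, ratio_first_to_third)
print(round(expected_missed, 2))
```

Under these assumptions, the smallest topics are the ones a random sample is most likely to miss, while the largest topics get sampled repeatedly. An approach that picks a seed from each topic would miss none of them and would not oversample the large ones.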



Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.


© 2015 by Sean O'Shea.
