
LITIGATION SUPPORT TIP OF THE NIGHT

Featured on the ACEDS blog.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer. All content provided on this blog is for informational purposes only. The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information. This policy is subject to change at any time. The owner is not an attorney, and nothing posted on this site should be construed as legal advice. Litigation Support Tip of the Night does not provide confirmation that any e-discovery technique or conduct is compliant with legal, regulatory, contractual or ethical requirements.

See my post on Running Regex Searches With a Grep Utility on the ILTA litigation support blog.

New tips for paralegals and litigation support professionals are posted to this site each week. Click on the blog headings for more detail.

See How-To Videos on my YouTube channel.

Binomial Calculators and TAR

Sean O'Shea
Nov 25, 2015

This blog previously discussed Catalyst's TAR for Smart People on the night of October 5, 2015, and noted on the night of November 13, 2015, its recommendation to use the Raosoft Calculator to determine the sample size needed for predictive coding. John Tredennick's guide also recommends using a binomial calculator to estimate a confidence interval: how closely the percentage of relevant documents found in a sample is likely to reflect the percentage in the full document set. Suppose you review a sample of 1,000 documents and find 50 relevant ones, and the complete document set is 2,000,000. The apparent richness is 5 percent, and extrapolating that percentage to the full set gives a single best guess (a point estimate) of 100,000 relevant documents in the total population. The binomial calculator puts a confidence interval around that estimate, a range in which the actual number of relevant documents is likely to fall.
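
The point estimate arithmetic can be sketched in a few lines of Python. This is only an illustration of the example numbers in this post; the function name `point_estimate` is made up here and does not come from the Catalyst guide.

```python
def point_estimate(sample_relevant, sample_size, population):
    """Scale the sample's richness up to the full document set."""
    richness = sample_relevant / sample_size   # 50 / 1000 = 0.05
    return round(richness * population)        # 0.05 * 2,000,000

# Numbers from the example above: 50 relevant in a sample of 1,000,
# drawn from a population of 2,000,000 documents.
estimate = point_estimate(50, 1000, 2_000_000)
print(f"{estimate:,} relevant documents (point estimate)")
```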

A binomial calculator for confidence intervals can be found here: http://statpages.org/confint.html . Enter the number of relevant documents found in the sample as the numerator and the total sample size as the denominator. After clicking 'compute', the calculator shows that the Proportion x/N of relevant documents in the sample is 5%. To turn the 'Exact Confidence Interval around Proportion' into document counts, multiply its lower and upper bounds by the size of your document population. In my example the bounds are about 3.73% and 6.54%, so the confidence interval is 74,600 to 130,800 relevant documents in the total set of 2,000,000 documents. See the below screen grab of the calculator.
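
If you would rather check the calculator's result offline, the sketch below computes an exact binomial interval in plain Python. It assumes that the calculator's 'Exact' interval is the standard Clopper-Pearson interval and that the confidence level is 95%; both are my assumptions, and the function names (`binom_cdf`, `solve_p`, `exact_interval`) are invented for this example.

```python
import math

def binom_cdf(m, n, p):
    """P(X <= m) for X ~ Binomial(n, p), with each term computed in log space."""
    if m < 0:
        return 0.0
    if m >= n:
        return 1.0
    total = 0.0
    for i in range(m + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)

def solve_p(m, n, target, iters=60):
    """Bisect for the p where P(X <= m) equals target (the CDF falls as p rises)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if binom_cdf(m, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(k, n, confidence=0.95):
    """Clopper-Pearson interval for k relevant documents in a sample of n."""
    alpha = 1 - confidence
    # Lower bound: the p at which seeing k or more relevant has probability alpha/2.
    lower = 0.0 if k == 0 else solve_p(k - 1, n, 1 - alpha / 2)
    # Upper bound: the p at which seeing k or fewer relevant has probability alpha/2.
    upper = 1.0 if k == n else solve_p(k, n, alpha / 2)
    return lower, upper

low, high = exact_interval(50, 1000)
population = 2_000_000
print(f"Proportion range: {low:.4f} to {high:.4f}")
print(f"Document range: {low * population:,.0f} to {high * population:,.0f}")
```

The proportions should come out close to the bounds the calculator reports. The document counts may differ slightly from the figures above, because those were obtained by multiplying the rounded percentages.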
