
After you save a conceptual analytics index in Relativity, the console offers a 'Show Index Statistics' option.



The index statistics report displays several fields of interest, but pay particular attention to the 'Average Document Size in Words' field, which lists the mean number of words per document in the training data set. The value should fall between 120 and 200. A number below this range may indicate that one of the following problems exists in the training data source:

  1. Very short documents have been included.

  2. The extracted text contains errors.

  3. The saved search did not include long text fields.

If the value is 10 or lower, don't proceed without replacing the training data source first.
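The thresholds above can be sketched as a small check. This is only an illustration of the rule described in this post, not part of Relativity: the function name and the returned labels are hypothetical, and the value is assumed to be copied by hand from the 'Average Document Size in Words' field. The post does not say what a value above 200 means, so the sketch only flags it.

```python
def assess_average_document_size(avg_words: float) -> str:
    """Classify the 'Average Document Size in Words' value.

    Hypothetical helper: the labels are illustrative, and the cutoffs
    (10, 120, 200) are the ones quoted in this post.
    """
    if avg_words < 0:
        raise ValueError("average document size cannot be negative")
    if avg_words <= 10:
        # 10 or lower: do not proceed without replacing the training data source.
        return "replace training data source"
    if avg_words < 120:
        # Below the expected range: short documents, text errors,
        # or long text fields missing from the saved search.
        return "review training data source"
    if avg_words <= 200:
        return "within expected range"
    # The post gives no guidance for values above 200, so only flag it.
    return "above expected range"


if __name__ == "__main__":
    for value in (8, 75, 150, 320):
        print(value, "->", assess_average_document_size(value))
```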

 
 

Note that the RelativityOne Analytics Guide (available here) provides a guideline for when it becomes necessary to add new documents to a training set. If the data source already holds more than one million documents, adding as many as 5,000 new documents may not require an update to the training data source. If 100,000 documents are added, however, the training data source for the analytics index should be updated, since a larger addition is more likely to contain new concepts.


This is a useful rule of thumb, but the key criterion is whether or not the newly added documents differ significantly from the data source for which the analytics index was already created.
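To make the arithmetic behind that rule of thumb concrete, here is a minimal sketch. The function names and the `threshold_fraction` parameter are hypothetical. The guideline quoted above gives only two reference points (5,000 and 100,000 new documents against roughly one million), so the 5% cutoff used in the example is an assumed value chosen to fall between them, not a figure from Relativity. The sketch also cannot measure the key criterion, which is whether the new documents differ in content from the existing data source.

```python
def growth_fraction(new_docs: int, existing_docs: int) -> float:
    """Return the new documents as a fraction of the existing data source."""
    if existing_docs <= 0:
        raise ValueError("existing_docs must be positive")
    if new_docs < 0:
        raise ValueError("new_docs cannot be negative")
    return new_docs / existing_docs


def suggest_update(new_docs: int, existing_docs: int,
                   threshold_fraction: float) -> bool:
    """Flag an update when growth reaches a caller-chosen cutoff.

    threshold_fraction is an assumption supplied by the caller;
    the source guideline does not state an exact cutoff.
    """
    return growth_fraction(new_docs, existing_docs) >= threshold_fraction


if __name__ == "__main__":
    existing = 1_000_000
    for added in (5_000, 100_000):
        share = growth_fraction(added, existing)
        flag = suggest_update(added, existing, threshold_fraction=0.05)
        print(f"{added:,} new documents = {share:.1%} of the source; update suggested: {flag}")
```

A size-based check like this is only a first pass. A small batch from a new custodian or subject area may still justify updating the training data source.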




 
 

Note that for classification analytics indexes used for assisted review, Relativity recommends that a data source (the searchable set of documents that conceptual analytics will be run on) contain no more than 9 million documents. See this post on the Relativity site, which specifies that limit for the saved search selected on the Analytics profile.



 
 

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.


© 2015 by Sean O'Shea . Proudly created with Wix.com
