
Note that in Relativity it's possible to find a document with a count of only one in the results after running the Tally mass operation on the textual near duplicate group field. This happens when, during an incremental population of the structured analytics set, a new document is added that is a near duplicate of, but larger than, a previously loaded document. In this case the new document is marked as the principal document, and the original document that it is a near duplicate of is orphaned.
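If you export the document identifier and the near duplicate group field to review outside of Relativity, a short script can list the groups that contain only one document. This is only a rough sketch: the `doc_groups` dictionary, the control numbers, and the group values are made-up sample data, and nothing here calls a Relativity API.

```python
from collections import Counter

def single_member_groups(doc_groups):
    """Return the group values that are assigned to exactly one document."""
    # Skip documents that have no near duplicate group value at all.
    counts = Counter(group for group in doc_groups.values() if group)
    return sorted(group for group, n in counts.items() if n == 1)

# Hypothetical export: control number -> near duplicate group value
doc_groups = {
    "DOC001": "GRP-A",
    "DOC002": "GRP-A",
    "DOC003": "GRP-B",  # the only document in its group
    "DOC004": "",       # not in any group
}

print(single_member_groups(doc_groups))
```

The groups this returns are the ones worth checking after an incremental population.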

 
 

When setting up an analytics index, the option to optimize the training set will perform the following operations:


1. remove conceptually irrelevant documents.

2. remove documents which are too long or too short to serve as good examples.

3. remove spreadsheets or documents which consist predominantly of numeric data.

4. remove log files.

5. remove documents with text resulting from processing errors.


Word count, word uniqueness, punctuation, and words with a high character count are evaluated to determine what must be removed.
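To get a feel for how these four measures might flag a document, here is a minimal sketch in Python. It is not Relativity's actual algorithm, which is not published in this post. The function names, the 25 character cutoff for a long word, and every threshold below are my own assumptions, chosen only for illustration.

```python
import re
import string

def training_text_stats(text, long_word_len=25):
    """Compute word count, word uniqueness, punctuation ratio, and long word ratio."""
    words = re.findall(r"\S+", text)
    total = len(words)
    if total == 0:
        return {"word_count": 0, "uniqueness": 0.0,
                "punctuation_ratio": 0.0, "long_word_ratio": 0.0}
    unique = len({w.lower() for w in words})
    punct = sum(1 for ch in text if ch in string.punctuation)
    long_words = sum(1 for w in words if len(w) >= long_word_len)
    return {
        "word_count": total,
        "uniqueness": unique / total,          # share of distinct words
        "punctuation_ratio": punct / len(text),
        "long_word_ratio": long_words / total,
    }

def looks_like_poor_example(text, min_words=10, max_words=30000,
                            min_uniqueness=0.05, max_punctuation=0.30,
                            max_long_words=0.10):
    """Apply assumed thresholds; every default here is hypothetical."""
    s = training_text_stats(text)
    return (
        s["word_count"] < min_words
        or s["word_count"] > max_words
        or s["uniqueness"] < min_uniqueness
        or s["punctuation_ratio"] > max_punctuation
        or s["long_word_ratio"] > max_long_words
    )
```

With these assumed thresholds, a two word document is flagged as too short, and a log file that repeats the same token hundreds of times is flagged for low word uniqueness.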

 
 

After running the structured analytics set for language identification in Relativity, you may see many documents for which the Docs_Languages::Language field is set to 'Other'.



There are several reasons why this can happen. The analyzed document may consist entirely of numeric data, it may be in a language other than the three most common languages Relativity analytics identified, or it may be in a language which is not one of the 173 languages that Relativity can identify.


My own testing has shown that Relativity will also list the language as 'Other' for documents which consist of long lists of names or entirely of graphics.
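If you have the extracted text available outside of Relativity, a quick check like the one below can help explain some of the 'Other' results before you look at individual documents. It is a sketch under two assumptions of mine: that a numeric-only document has no alphabetic characters in its extracted text, and that a graphics-only document has empty or near-empty extracted text. The function name `has_no_letters` is hypothetical. It will not catch lists of names, since those contain letters.

```python
def has_no_letters(text):
    """Return True when the text contains no alphabetic characters at all."""
    return not any(ch.isalpha() for ch in text)

print(has_no_letters("1234 5678 90.12"))
print(has_no_letters("Invoice 1234"))
```

The first call returns True because the text is all digits and punctuation, and the second returns False because the text contains a word.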









 
 

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.



© 2015 by Sean O'Shea.
