Finding Textual Exact Duplicates

If you're a Relativity admin, here's how to find exact textual duplicates.

1. Create a fixed-length text field with a length of 255 characters.

2. In the Relational Field Properties section of the new field form, set the field to be relational, import blank values unchanged, and select the 'Textual Near Duplicates Relational View'.

3. Save the field.

4. Under Indexing & Analytics, select Structured Analytics Set, then click 'New Structured Analytics Set'.

5. Enter a name and prefix, choose a saved search as the document set, and then select 'Textual near duplicate identification' as the operation.

6. Set the minimum similarity percentage to 100 so that only documents with matching text are grouped, then under 'Destination Textual Near Duplicate Group' select the new field you created.

7. Save, and then in the console on the right click 'Run Structured Analytics'.

8. You'll be given the option to update either all of the documents or just the new documents added to the set.

9. In this example, the summary at the end indicates that 40 exact textual duplicates were found.

10. In this example, there are three duplicates of CTRL0000001186.0001, which is given as the group ID for all four documents. A separate Yes/No field, 'X617: Textual Near Duplicate Principal', indicates which document is the anchor for the others.
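
If you export the results, the group field and the principal flag from the last step can be tallied outside Relativity. The following is a minimal Python sketch, not a Relativity API. The row layout, the key names (control_number, group_id, is_principal), and the DOC-00x control numbers are all hypothetical stand-ins for whatever your export contains. It also assumes that a blank group ID means the document was not placed in any duplicate group.

```python
from collections import defaultdict


def summarize_groups(rows):
    """Return {group_id: (principal_control_number, [duplicate_control_numbers])}.

    Each row is a dict with the hypothetical keys 'control_number',
    'group_id' and 'is_principal'; these names are assumptions, not
    Relativity field names.
    """
    groups = defaultdict(list)
    for row in rows:
        group_id = row["group_id"]
        # Assumption: a blank group ID means the document was not grouped.
        if group_id:
            groups[group_id].append(row)

    summary = {}
    for group_id, members in groups.items():
        principals = [m["control_number"] for m in members if m["is_principal"]]
        duplicates = [m["control_number"] for m in members if not m["is_principal"]]
        # Take the first flagged principal, or None if the export has none.
        summary[group_id] = (principals[0] if principals else None, duplicates)
    return summary


# Made-up rows: one group of four documents plus one ungrouped document.
rows = [
    {"control_number": "DOC-001", "group_id": "DOC-001", "is_principal": True},
    {"control_number": "DOC-002", "group_id": "DOC-001", "is_principal": False},
    {"control_number": "DOC-003", "group_id": "DOC-001", "is_principal": False},
    {"control_number": "DOC-004", "group_id": "DOC-001", "is_principal": False},
    {"control_number": "DOC-005", "group_id": "", "is_principal": False},
]

summary = summarize_groups(rows)
for group_id, (principal, duplicates) in summary.items():
    print(f"{group_id}: principal={principal}, duplicates={len(duplicates)}")
```

With the made-up rows above, the one group has a principal and three duplicates, mirroring the four-document group in the example.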

