Predicting the number of top-level clusters

In a recent Relativity webinar, it was contended that when performing clustering on a saved search, a generality setting of 0.5 creates 8 top-level clusters in the example data set. The instructor noted that this setting is not guaranteed to generate 8 on all data sets. Generality at 0.9 creates 4 top-level clusters in the same data set.

In a very general way, this seems to be borne out in my own Relativity sandbox workspace. Clustering a saved search at 0.5 generality . . .

. . . doesn't create 8 top-level clusters in my data set, but it does create six large top-level clusters, plus several smaller top-level clusters which might be grouped together to form two additional clusters of similar size.

A generality setting of 0.9 . . .

. . . won't create 4 top-level clusters in my completely different data set:

. . . but it does create four top-level clusters clearly larger than the others.

It's a rough rule of thumb, but not an entirely useless one. However, my test clusters seem to refute the general rule that higher generality settings will lead to fewer top-level clusters.

© 2015 by Sean O'Shea