The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer. All content provided on this blog is for informational purposes only. The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information. This policy is subject to change at any time. The owner is not an attorney, and nothing posted on this site should be construed as legal advice. Litigation Support Tip of the Night does not provide confirmation that any e-discovery technique or conduct is compliant with legal, regulatory, contractual or ethical requirements.
When evaluating document review platforms and their conceptual searching capabilities, ask whether they account for the lemmatization of words.
The lemma of a word is its dictionary form. So the word ‘go’ is the lemma for ‘going’, ‘went’, and ‘gone’, which are inflected forms of the headword ‘go’. Together, the lemma and its inflected forms make up the lexeme of the word. Lemmatization differs from stemming in that it considers the context in which a word is used.
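The relationship between a lemma and its lexeme can be sketched in a few lines of Python. This is only an illustration: the `LEMMAS` table and the `lemmatize` and `lexeme` helpers are hypothetical names invented here, and a plain lookup table does not consider context the way a real lemmatizer in a review platform would.

```python
# Hypothetical, hand-built lookup table: inflected form -> lemma.
# A real lemmatizer uses a full dictionary plus the word's context.
LEMMAS = {
    "going": "go",
    "went": "go",
    "gone": "go",
    "goes": "go",
    "better": "good",
    "best": "good",
}


def lemmatize(word: str) -> str:
    """Return the dictionary form of a word, or the word itself if unknown."""
    key = word.lower()
    return LEMMAS.get(key, key)


def lexeme(lemma: str) -> set[str]:
    """Return the lemma together with every inflected form that maps to it."""
    return {lemma} | {form for form, base in LEMMAS.items() if base == lemma}
```

With this table, `lemmatize("Went")` returns the headword ‘go’, and `lexeme("go")` returns the set of forms a lemma-aware search would treat as the same word.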
Stemming will not find ‘better’, which is part of the lexeme of the lemma ‘good’, because the two words do not share a stem. Generally, stemming improves the recall of a search - the percentage of all responsive documents in a review set that the search returns. Search algorithms that account for lemmatization will improve the precision of searches - the percentage of returned hits that are true hits rather than false positives.
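Recall and precision as described above can be worked out with simple set arithmetic. The sketch below assumes that the documents a search returns and the documents that are truly responsive are each available as a set of document IDs. The function names and the IDs in the usage note are made up for illustration.

```python
def recall(returned: set[str], responsive: set[str]) -> float:
    """Share of all responsive documents that the search returned."""
    if not responsive:
        return 0.0
    return len(returned & responsive) / len(responsive)


def precision(returned: set[str], responsive: set[str]) -> float:
    """Share of returned documents that are actually responsive."""
    if not returned:
        return 0.0
    return len(returned & responsive) / len(returned)
```

For example, if four documents are responsive (`{"DOC-1", "DOC-2", "DOC-3", "DOC-4"}`) and a search returns `{"DOC-1", "DOC-2", "DOC-5"}`, recall is 2 of 4 and precision is 2 of 3, since DOC-5 is a false positive.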
A stemming search algorithm may reduce the word ‘crazy’ to the stem ‘crazi’ so that it also matches ‘craziness’.
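A rough sketch of how a stemmer arrives at ‘crazi’ is shown below. It is a deliberately simplified toy, not the Porter algorithm or any platform's actual implementation. The `toy_stem` name and its short suffix list are assumptions made for this example.

```python
def toy_stem(word: str) -> str:
    """Strip one common suffix, then normalize a final 'y' to 'i'."""
    stem = word.lower()
    # Remove the first matching suffix, keeping at least three letters.
    for suffix in ("ness", "ing", "ed", "s"):
        if stem.endswith(suffix) and len(stem) - len(suffix) >= 3:
            stem = stem[: -len(suffix)]
            break
    # Treat a trailing 'y' as 'i' so 'crazy' lines up with 'crazi(ness)'.
    if stem.endswith("y") and len(stem) > 2:
        stem = stem[:-1] + "i"
    return stem
```

Both `toy_stem("crazy")` and `toy_stem("craziness")` produce ‘crazi’, so a search on either word finds the other. By contrast, ‘better’ and ‘good’ stem to different strings, which is the gap that lemmatization fills.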