
You can use this regular expression search to find the first word in a text that appears again later on the same line:


(\b\w+\b)(?=.*\b\1\b)
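
For readers who want to try the pattern outside a text editor, here is a minimal Python sketch using the standard re module. The sample sentence is made up, and the second pattern, ADJACENT_REPEAT, is an assumption added for comparison rather than part of the original search: it is one way to require that the repeated word follow immediately after the first.

```python
import re

# Pattern from the post: a whole word that occurs again later on the same line.
REPEATED_LATER = re.compile(r"(\b\w+\b)(?=.*\b\1\b)")

# Assumed variant (not from the post): the same word twice in a row.
ADJACENT_REPEAT = re.compile(r"\b(\w+)\s+\1\b")

# Hypothetical sample text.
text = "the cat saw the dog and and ran"

# The lookahead allows any characters between the two occurrences,
# so the first hit is "the" (it occurs again later in the line).
first = REPEATED_LATER.search(text)
print(first.group(1))      # the

# The adjacent variant only reports a word that is immediately repeated.
adjacent = ADJACENT_REPEAT.search(text)
print(adjacent.group(0))   # and and
```

Both patterns are case-sensitive as written, so "The" and "the" count as different words unless a case-insensitive option is turned on.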




A simple regular expression:


<.*?>

. . . will match a less-than symbol, a greater-than symbol, and any characters between them, stopping at the first greater-than symbol it reaches. The same approach works with any other two characters placed before and after '.*?'


As shown in this example, this search can be used to select (or remove) the display names in a list of email addresses.
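
The search can also be tried in a script. The following is a minimal Python sketch, assuming a hypothetical recipient list in which each address is wrapped in angle brackets; the names and addresses are made up, and which part of each entry sits inside the brackets depends on how your own list was exported. It also compares the lazy '.*?' with a greedy '.*' to show why the question mark matters.

```python
import re

# Pattern from the post: lazy match between < and >.
ANGLE_BRACKETED = re.compile(r"<.*?>")

# Greedy version, shown only for comparison (not from the post).
ANGLE_GREEDY = re.compile(r"<.*>")

# Hypothetical sample data.
recipients = "Jane Doe <jane.doe@example.com>; John Roe <john.roe@example.com>"

# Lazy: one match per bracketed item.
matches = ANGLE_BRACKETED.findall(recipients)
print(matches)  # ['<jane.doe@example.com>', '<john.roe@example.com>']

# Greedy: a single match running from the first < to the last >.
greedy = ANGLE_GREEDY.findall(recipients)
print(len(greedy))  # 1

# Removing the bracketed items leaves the text outside the brackets.
stripped = ANGLE_BRACKETED.sub("", recipients)
print(stripped)
```

Replacing each match with an empty string is the scripted equivalent of running a find-and-replace with an empty replacement field.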




Searching for sentences that use multiple consecutive words in all capital letters can be an effective way to track down emails or other documents in which employees express surprise, anger, excitement or another strong emotion. The following regular expression search will locate any instance where four consecutive words of two or more letters appear in ALL CAPS:


(\b[A-Z]{2,}\s[A-Z]{2,}\b\s[A-Z]{2,}\b\s[A-Z]{2,}\b)


[A-Z] matches any one of the 26 capital letters, and {2,} requires two or more of them in a row, so [A-Z]{2,} finds words of 2 or more characters in all capital letters. \b matches a word boundary, and \s matches a single whitespace character. The sequence \b[A-Z]{2,}\s can be repeated as many times as necessary to capture longer sequences of fully capitalized words.
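
As a rough illustration, the search can be run over a few lines of text in Python. This is a minimal sketch with made-up sample messages. The second pattern, CAPS_RUN, is an assumed variant that is not in the original search: it uses a repeat count of {3,} on the group instead of writing the sequence out several times, so it captures the whole run when more than four capitalized words appear together.

```python
import re

# Pattern from the post: four consecutive all-caps words of 2+ letters.
FOUR_CAPS = re.compile(r"(\b[A-Z]{2,}\s[A-Z]{2,}\b\s[A-Z]{2,}\b\s[A-Z]{2,}\b)")

# Assumed variant (not from the post): one all-caps word followed by
# three or more further all-caps words, capturing the entire run.
CAPS_RUN = re.compile(r"\b[A-Z]{2,}(?:\s[A-Z]{2,}){3,}\b")

# Hypothetical sample messages.
messages = [
    "Please review the attached draft by Friday.",
    "I CANNOT BELIEVE THEY SIGNED THIS without asking us.",
    "Send the NDA to HR today.",
]

# Collect the first hit in each message, or None when there is no hit.
hits = []
for message in messages:
    found = FOUR_CAPS.search(message)
    hits.append(found.group(0) if found else None)

print(hits)  # [None, 'CANNOT BELIEVE THEY SIGNED', None]

# The variant returns the full five-word run in the second message.
print(CAPS_RUN.search(messages[1]).group(0))  # CANNOT BELIEVE THEY SIGNED THIS
```

The single-letter word "I" is skipped because {2,} requires at least two capital letters, and isolated acronyms such as "NDA" and "HR" do not trigger a hit because they are not part of a run of four.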


See this example of how the search is run in Notepad++.



Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.



© 2015 by Sean O'Shea.
