
Notepad++ includes a very handy feature that you can use to quickly copy out only the lines that contain hits for the searches you have run in a text file.



In the Find dialog in Notepad++, the Mark tab has a 'Bookmark line' option. Check this box to bookmark every line on which a search hit appears, then enter your search term and click Mark All to run the search.



You can run additional searches, and the hits from previous searches will stay marked unless you first click 'Clear all marks'. Notepad++ adds a blue dot next to each line that contains a search hit.


To copy out only the lines that have a marked search result, go to Search > Bookmark > Copy Bookmarked Lines.



You can then paste the lines with hits into a new file.
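
For files that are awkward to handle in the editor, the same idea can be sketched in Python. This is only an illustration and not a Notepad++ feature: the function name `lines_with_hits` and the sample text are hypothetical, and the sketch assumes each search is a plain, case-insensitive substring search.

```python
def lines_with_hits(text, terms):
    """Return the lines of text that contain at least one search term.

    Mimics running several 'Mark All' searches with 'Bookmark line'
    checked and then copying the bookmarked lines. Matching is a plain,
    case-insensitive substring test (an assumption for this sketch).
    """
    lowered_terms = [t.lower() for t in terms]
    hits = []
    for line in text.splitlines():
        lowered = line.lower()
        if any(t in lowered for t in lowered_terms):
            hits.append(line)
    return hits


# Hypothetical sample text, for illustration only.
sample = "\n".join([
    "The agreement was signed in New York.",
    "No relevant content here.",
    "Payment is due within 30 days.",
    "The AGREEMENT may be terminated by either party.",
])

bookmarked = lines_with_hits(sample, ["agreement", "payment"])
print("\n".join(bookmarked))
```

Lines are kept in their original order, and a line that matches more than one term is returned only once, just as a line can carry only one bookmark.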



Regular expression searches can be designed to find both a specific alphanumeric pattern and a given number of words before and after that pattern.


In this example we use the following regex pattern to look for dates in a text file:


(effective as of )(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+(\d{1,2})\,\s+(\d{4})


See the Tip of the Night for June 5, 2020 for an explanation of how this regex pattern works. The search finds dates preceded by the phrase 'effective as of'.
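
As a rough illustration outside Notepad++, the same pattern can be tried with Python's `re` module. This is a sketch only: the sample sentence is invented for illustration, and `re` is case-sensitive unless `re.IGNORECASE` is passed, so the phrase must appear in lower case as written.

```python
import re

# The date pattern from the post, split across lines for readability.
DATE_PATTERN = re.compile(
    r"(effective as of )"
    r"(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|"
    r"Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|"
    r"Dec(?:ember)?)"
    r"\s+(\d{1,2})\,\s+(\d{4})"
)

# Hypothetical sample sentence, for illustration only.
sample = (
    "Amended and Restated Bylaws, effective as of June 5, 2020, "
    "filed as Exhibit 3.2."
)

# Groups 2, 3 and 4 hold the month, the day and the year.
dates = [
    (m.group(2), m.group(3), m.group(4))
    for m in DATE_PATTERN.finditer(sample)
]
print(dates)
```

Each tuple holds the month, day, and year of one date that follows the phrase.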



It would be helpful to edit the regex search so that it collects the data for each SEC form filing, including the description, form number, and filing date or period end date. We can modify it so that the search includes up to six words before the searched-for phrase and date, and up to five words afterwards:


((?:\S+\s*){0,6}effective as of )(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+(\d{1,2})\,\s+(\d{4}\s*(?:\S+\s*){0,5})


The syntax added at the beginning and end, (?:\S+\s*), matches a run of non-whitespace characters '\S+' followed by any whitespace '\s*', which amounts to one word. The second number in the curly brackets sets the maximum number of words (or numbers) to be matched before or after the pattern in between.
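
The context-capturing idea can be sketched in Python as well. The sample sentence is hypothetical. Two details of the sketch are worth noting: the leading group uses `\s+` where the pattern above uses `\s*`, an assumption made here to limit backtracking in Python's `re` on longer text; and `\s*` follows the four-digit year so that trailing words are captured when a space comes right after the date.

```python
import re

MONTHS = (
    r"Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|"
    r"Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|"
    r"Dec(?:ember)?"
)

CONTEXT_PATTERN = re.compile(
    r"((?:\S+\s+){0,6}effective as of )"  # up to six words before the phrase
    r"(" + MONTHS + r")"                  # month name or abbreviation
    r"\s+(\d{1,2})\,\s+"                  # day of the month and comma
    r"(\d{4}\s*(?:\S+\s*){0,5})"          # year plus up to five words after
)

# Hypothetical sample sentence, for illustration only.
sample = (
    "The registrant filed the Third Amended and Restated Credit Agreement "
    "effective as of March 12, 2019 with the lenders named therein as "
    "Exhibit 10.1."
)

match = CONTEXT_PATTERN.search(sample)
print(match.group(0).strip())
```

On this sample the match begins at 'Third', six words before the phrase, and ends at 'therein', five words after the year. The four words at the start of the sentence and the exhibit reference at the end fall outside the limits and are left out.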









ASCII text can contain non-printable characters, which will not be visible when you open a text file in Notepad. The non-printable characters include not only line feeds and carriage returns, but also tabs, file separators, and the characters used to mark the end of text and the end of transmission.



To review these characters, open the text in an advanced text editor such as Notepad++.
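
Where no such editor is at hand, a few lines of Python can make the characters visible. This is a minimal sketch: the function name `show_control_chars` and the sample string are hypothetical, and only the handful of control characters mentioned above are given names, with any others shown by their hexadecimal code.

```python
# Names for the ASCII control characters mentioned in the text.
CONTROL_NAMES = {
    0x03: "ETX",  # end of text
    0x04: "EOT",  # end of transmission
    0x09: "TAB",  # horizontal tab
    0x0A: "LF",   # line feed
    0x0D: "CR",   # carriage return
    0x1C: "FS",   # file separator
}


def show_control_chars(text):
    """Replace each ASCII control character with a visible [NAME] tag."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20 or code == 0x7F:
            # Fall back to the hex code for control characters without a name.
            out.append("[" + CONTROL_NAMES.get(code, f"0x{code:02X}") + "]")
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical sample string, for illustration only.
sample = "Field one\x1cField two\tEnd\x03\r\n"
print(show_control_chars(sample))
# prints: Field one[FS]Field two[TAB]End[ETX][CR][LF]
```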



Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.


© 2015 by Sean O'Shea
