Removing duplicate values in Notepad++


October 19, 2015

If you're working with a sufficiently large amount of data in Excel, you may find that basic options like 'Remove Duplicates' won't work, or will at least run very slowly. If you need to de-dupe a column of data (perhaps just to look for irregularities in standardized data when the filter list won't show all of the entries), you can paste it into Notepad++ and de-dupe it there.

 

1. Paste the data into Notepad++. You can sort the data first by pressing CTRL + A and then going to Edit . . . Line Operations . . . Sort Lines Lexicographically Ascending, but this step is not necessary for the data to be de-duped.

 

2.  Press CTRL + H and enter this Regex search in the Find box:

 

^(.*?)$\s+?^(?=.*^\1$)

 

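  . . . leave the 'Replace with' box empty and make sure the 'Regular expression' search mode is selected. If the data is not sorted, also check '. matches newline' so that the search can find duplicates that are not on adjacent lines.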

 

3. Click 'Replace All' and you'll quickly have a de-duped list.
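
If you want to see what the search is doing before running it on real data, the short Python sketch below applies the same pattern to a small sample. It is only an illustration, not part of the Notepad++ steps: Python's regex engine is not the one Notepad++ uses, and the function name dedupe_lines and its dot_matches_newline parameter are made up for this example. It assumes that Python's re.DOTALL flag plays the role of the '. matches newline' checkbox in Notepad++.

```python
import re

# The pattern from step 2, unchanged.
PATTERN = r"^(.*?)$\s+?^(?=.*^\1$)"


def dedupe_lines(text: str, dot_matches_newline: bool = True) -> str:
    """Delete every line that has an identical copy further down the text.

    dedupe_lines and dot_matches_newline are hypothetical names used only
    for this sketch; they do not come from Notepad++.
    """
    flags = re.MULTILINE  # let ^ and $ match at the start and end of each line
    if dot_matches_newline:
        # Assumed equivalent of the '. matches newline' checkbox: the
        # lookahead (?=.*^\1$) can then scan across later lines.
        flags |= re.DOTALL
    return re.sub(PATTERN, "", text, flags=flags)


if __name__ == "__main__":
    sample = "apple\npear\napple\nfig\npear\n"
    print(dedupe_lines(sample))
```

The pattern removes a line whenever the lookahead finds the same line again later on, so the copy that survives is the last occurrence of each value, not the first. With dot_matches_newline set to False, the lookahead cannot cross a line break, so only duplicates sitting directly next to each other are removed; in that case the optional sort in step 1 is what brings the duplicates together.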

 

See the explanation of this Regex search on this web page: http://stackoverflow.com/questions/3958350/removing-duplicate-rows-in-notepad

 

 

 

 

 

 

 


Contact Me With Your Litigation Support Questions:

seankevinoshea@hotmail.com


© 2015 by Sean O'Shea.