Soundex


Soundex is a phonetic algorithm widely used to match names that are spelled differently but pronounced similarly. The Soundex encoding method was developed almost a century ago and is a standard feature of many current database and electronic discovery programs, such as the one profiled in last night's tip, Windows Grep.

The Soundex encoding method reduces any name to a letter followed by a three-digit code, using the following steps.

1. The first letter of the name is retained.

2. All occurrences of h, w, a, e, i, o, u and y are dropped unless they are the first letter.

3. The remaining letters are replaced with digits as follows:

b, f, p, v become 1.

c, g, j, k, q, s, x, z become 2.

d, t become 3.

l becomes 4.

m, n become 5.

r becomes 6.

4. If two or more letters with the same number are adjacent in the original name (or are separated only by an h or w), only the first is retained. This applies even when one of the letters is the first letter of the name. Letters with the same number that are separated by a vowel are each coded.

5. All Soundex codes must contain a letter and three digits. If the name does not have enough letters to produce three digits, pad the code with zeroes. Stop processing the name as soon as you have a letter and three digits; any remaining letters are ignored.

So, using this method, the name William becomes:

W450

Smith becomes:

S530

Brown and Braun both become:

B650
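
The encoding steps can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names `CODES` and `soundex` are made up for this example, and it assumes the common convention that the duplicate rule applies to letters that are adjacent in the original spelling, so letters with the same number that are separated by a vowel are both coded.

```python
# Digit assigned to each group of consonants (step 3).
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a letter-plus-three-digits code for `name` (illustrative sketch)."""
    # Assumption: only ASCII letters are considered; anything else is ignored.
    letters = [ch for ch in name.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""

    result = letters[0].upper()           # step 1: keep the first letter
    previous = CODES.get(letters[0], "")  # the first letter's digit counts for step 4

    for ch in letters[1:]:
        if ch in "hw":
            continue                      # h and w are dropped and do not separate duplicates
        digit = CODES.get(ch, "")         # vowels and y have no digit
        if digit and digit != previous:   # step 4: skip repeats of the same digit
            result += digit
            if len(result) == 4:          # step 5: stop at a letter and three digits
                break
        previous = digit                  # a vowel resets this to ""

    return result.ljust(4, "0")           # step 5: pad short codes with zeroes


for name in ("William", "Smith", "Brown", "Braun"):
    print(name, soundex(name))
# William W450
# Smith S530
# Brown B650
# Braun B650
```

Tracking the previous digit, instead of deleting vowels up front, is what lets the sketch tell a vowel (which separates two equal digits) from an h or w (which does not).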

Soundex is frequently employed by sites devoted to researching family genealogies. This site, https://www.ics.uci.edu/~dan/genealogy/Miller/javascrp/soundex.htm, will generate Soundex codes for any name you enter.

Soundex is particularly useful when searching a database for a name that may be spelled in ways that are hard to predict. So while you know you're looking for someone named Ismail, you may not have any idea how others will spell his name. Choose the Soundex option in Windows Grep, and it will automatically pull up all words with the same Soundex encoding as Ismail: I254.
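
To illustrate the idea behind this kind of search, here is a rough Python sketch that groups a word list by Soundex code and looks up everything that shares a code with the search term. It says nothing about how Windows Grep works internally. The helper names (`soundex_code`, `build_index`) and the sample names are invented for the example, and the condensed encoder assumes non-empty ASCII words.

```python
from collections import defaultdict

# Consonant groups in digit order: the first group maps to "1", the second to "2", and so on.
GROUPS = ("bfpv", "cgjkqsxz", "dt", "l", "mn", "r")
DIGIT = {ch: str(i) for i, group in enumerate(GROUPS, start=1) for ch in group}


def soundex_code(word):
    """Condensed Soundex encoder (hypothetical helper); assumes a non-empty ASCII word."""
    word = word.lower()
    code, prev = word[0].upper(), DIGIT.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue               # h and w never separate two equal digits
        digit = DIGIT.get(ch, "")  # vowels and y have no digit
        if digit and digit != prev:
            code += digit
        prev = digit
    return (code + "000")[:4]      # pad with zeroes, then keep a letter and three digits


def build_index(words):
    """Map each Soundex code to the list of words that share it."""
    index = defaultdict(list)
    for w in words:
        index[soundex_code(w)].append(w)
    return index


# Made-up sample data, for illustration only.
names = ["Ismail", "Ismael", "Ismayel", "Esmail", "Smith", "Ishmael"]
index = build_index(names)
matches = index.get(soundex_code("Ismail"), [])
print(matches)  # ['Ismail', 'Ismael', 'Ismayel', 'Ishmael']
```

Note that Esmail is not matched: because the first letter is kept as-is, its code is E254 rather than I254, so spellings that differ in the first letter fall outside a Soundex search.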

