You enter people’s names into a
table in which misspellings are a common occurrence. You would like a
way to search for a person’s record disregarding slight
differences in spelling. You’ve tried using the
Like
operator with the first letter of the
person’s last name, but that produces too many names. Is there
any way to search for records that sound alike?
Access has no built-in sound-alike function, but you can create one that employs a standard algorithm called the Russell Soundex algorithm. Using this algorithm, it’s fairly easy to search for a last name phonetically.
Run the qrySoundex query found in 06-03.MDB
.
Enter a last name in the query parameter dialog, and qrySoundex will
return all records from tblStaff that sound like the name you
entered. For example, if you enter the name “Jahnsin” at
the parameter prompt, qrySoundex will return the records shown in
Figure 6-5.
To perform Soundex searches in your own applications, follow these steps:
Import the basSoundex module from 06-03.MDB
into
your database.
Create a query based on a table that contains a field that holds people’s last names. Include the LastName field and any additional fields you wish to see in the output of the query.
Create a calculated field that calculates the Soundex code for the LastName field using the acbSoundex function. In qrySoundex, we used the following calculation to create a new field called Soundex:
Soundex: acbSoundex([LastName])
Enter criteria for the calculated field that compare that field against the Soundex code of a user-entered parameter. Use the acbSoundex function to obtain the Soundex code of the parameter. We used the following criteria in qrySoundex:
acbSoundex([Enter Last Name])
This qrySoundex query is shown in Figure 6-6.
Declare the parameter to be of type Text using the Query → Parameters dialog.
Save and run the query.
You can find the
acbSoundex function in basSoundex in
06-03.MDB
. This function takes a last name and
returns a four-digit Soundex code for the name. If you look at the
fourth column in Figure 6-5, you can see that the
Soundex code for all rows is the same. In this case—for names
sounding like “Jahnsin”—the code is
“J525”. Soundex codes always begin with the first letter
of the name followed by three digits ranging between 0 and 6 that
represent the remaining significant consonants in the name.
The acbSoundex function is shown here:
Public Function acbSoundex(ByVal varSurName As Variant) As Variant ' Purpose: ' Takes a surname string and returns a four-digit ' code representing the Russell Soundex code. ' In: ' varSurName: A surname (last name) as a variant ' Out: ' Return value: A four-digit Soundex code as a variant On Error GoTo HandleErr Dim intLength As Integer Dim intCharCount As Integer Dim intSdxCount As Integer Dim intSeparator As Integer Dim intSdxCode As Integer Dim intPrvCode As Integer Dim varChar As Variant Dim varSdx As Variant Const acbcSoundexLength = 4 ' We add "" to take care of a passed Null. intLength = Len(varSurName & "") If intLength > 0 Then intSeparator = 0 ' Keeps track of vowel separators. intPrvCode = 0 ' The code of the previous char. intCharCount = 0 ' Counts the number of input chars. intSdxCount = 0 ' Counts the number of output chars. ' Loop until the Soundex code is of acbcSoundexLength ' or we have run out of characters in the surname. Do Until (intSdxCount = acbcSoundexLength Or intCharCount = intLength) intCharCount = intCharCount + 1 varChar = Mid(varSurName, intCharCount, 1) ' Calculate the code for the current character. Select Case varChar Case "B", "F", "P", "V" intSdxCode = 1 Case "C", "G", "J", "K", "Q", "S", "X", "Z" intSdxCode = 2 Case "D", "T" intSdxCode = 3 Case "L" intSdxCode = 4 Case "M", "N" intSdxCode = 5 Case "R" intSdxCode = 6 Case "A", "E", "I", "O", "U", "Y" intSdxCode = -1 Case Else intSdxCode = -2 End Select ' Special case the first character. If intCharCount = 1 Then varSdx = UCase(varChar) intSdxCount = intSdxCount + 1 intPrvCode = intSdxCode intSeparator = 0 ' If a significant constant and not a repeat without ' a separator, code this character. ElseIf intSdxCode > 0 And _ (intSdxCode <> intPrvCode Or intSeparator = 1) Then varSdx = varSdx & intSdxCode intSdxCount = intSdxCount + 1 intPrvCode = intSdxCode intSeparator = 0 ' If a vowel, this character is not coded, but it ' will act as a separator. ElseIf intSdxCode = -1 Then intSeparator = 1 End If Loop ' If the code is < acbcSoundexLength chars long, fill ' the rest of the code with zeros. If intSdxCount < acbcSoundexLength Then varSdx = varSdx & String((acbcSoundexLength - intSdxCount), "0") End If acbSoundex = varSdx Else acbSoundex = Null End If ExitHere: On Error GoTo 0 Exit Function HandleErr: Select Case Err Case Else MsgBox Err & ": " & Err.Description, vbOKOnly + vbCritical, "acbSoundex" End Select Resume ExitHere End Function
The acbSoundex function is based on the Russell Soundex standard algorithm. Soundex is the most commonly used sound-alike algorithm in the U.S. It works by discarding the most unreliable parts of a name, while retaining much of the name’s discriminating power. It works best when used with the English versions of names of people of European descent. Its discriminating power is reduced when it is used with very short or very long names or names with a high percentage of vowels. Other sound-alike algorithms may work better in these situations.
The Soundex algorithm was created to work with people’s last
names. It appears to work reasonably well with people’s first
names also, but not for names of businesses. Soundex does not work
well for business names primarily because these names tend to be
longer than people’s names, and Soundex encodes only the first
four significant characters. We’ve found that extending the
number of encoded characters to eight works better for business
names, although this is a nonstandard implementation of the
algorithm. You can easily extend the number of encoded characters by
changing the acbcSoundexLength
constant found at
the beginning of acbSoundex. If you decide to do
this, however, we suggest you rename the function to something like
acbSoundex8 to distinguish it from the standard
function.