Project 8: Soundex

due at midnight on   +60

The Soundex algorithm is used in historical research to match names whose spellings differ but that likely derive from the same original source. The Soundex of any name is its initial letter followed by three digits. For example, the Soundex of “Robert” is R163, and the Soundex of “Rupert” is also R163. That way, names can be filed together under the same heading, and search queries can match alternate spellings.

Here is the algorithm, in pseudo-code:

  1. Save the first letter.

  2. Remove all occurrences of ‘h’ and ‘w’, except as the first letter.

  3. Replace all consonants (including the first letter) with digits as follows:
    • b, f, p, v → 1
    • c, g, j, k, q, s, x, z → 2
    • d, t → 3
    • l → 4
    • m, n → 5
    • r → 6
  4. Replace all adjacent duplicate digits with one digit. (55 becomes just 5.)

  5. Remove all occurrences of a, e, i, o, u, and y, except as the first letter.

  6. If the first symbol is a digit, replace it with the letter saved in step 1.

  7. If the word has too few letters to yield three digits, append zeros until there are three. If there are more than three digits, keep only the first three.
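
Steps 3 and 4 are the heart of the algorithm, so here is a minimal sketch of one way they might be written. The function names (soundexDigit, encodeConsonants, collapseDuplicateDigits) are made up for illustration and are not required by the assignment. The sketch also assumes that only runs of digits are collapsed in step 4, which is how the parenthetical example (55 becomes 5) reads.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Step 3 lookup (hypothetical helper): returns the digit for a coded
// consonant, or the character unchanged if it has no code
// (vowels, y, h, w, and anything that is not a letter).
char soundexDigit(char c) {
    switch (std::tolower(static_cast<unsigned char>(c))) {
        case 'b': case 'f': case 'p': case 'v': return '1';
        case 'c': case 'g': case 'j': case 'k':
        case 'q': case 's': case 'x': case 'z': return '2';
        case 'd': case 't': return '3';
        case 'l': return '4';
        case 'm': case 'n': return '5';
        case 'r': return '6';
        default: return c;
    }
}

// Step 3: overwrite every coded consonant in place, first letter included.
void encodeConsonants(std::string& s) {
    for (char& c : s) c = soundexDigit(c);
}

// Step 4: collapse each run of identical adjacent digits to one digit.
// Assumption: non-digit characters are left alone here.
void collapseDuplicateDigits(std::string& s) {
    auto sameDigit = [](char a, char b) {
        return a == b && std::isdigit(static_cast<unsigned char>(a));
    };
    // std::unique shifts the kept characters forward; erase trims the tail.
    s.erase(std::unique(s.begin(), s.end(), sameDigit), s.end());
}
```

Both functions take the string by reference and change it directly. Because step 4 runs before the vowels are removed in step 5, two equal digits separated by a vowel are not merged.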

You should implement this using C++ string operations, so that you are actually modifying the string to build the Soundex, not just outputting it. Below are sample runs you can use as test cases:

Enter a name: Robert
Soundex: R163
Enter a name: Rupert
Soundex: R163
Enter a name: Rubin
Soundex: R150
Enter a name: Ruben
Soundex: R150
Enter a name: Reuvain
Soundex: R150
Enter a name: Wulff
Soundex: W410
Enter a name: Wolf
Soundex: W410
Enter a name: Wolfe
Soundex: W410
Enter a name: Checkov
Soundex: C210
Enter a name: Chekhoff
Soundex: C210
Enter a name: Muscowitz
Soundex: M232
Enter a name: Mouskowits
Soundex: M232
Enter a name: Nahasapeemapetalan
Soundex: N215
Enter a name: Nehisi
Soundex: N200
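
To illustrate what modifying the string (rather than just printing the answer) can look like, here is a rough sketch of steps 2, 5, 6, and 7 using std::string member functions and the standard erase/remove_if idiom. The helper names (removeHW, removeVowels, finish) are hypothetical, and this is only one of several reasonable approaches. An index loop with erase would work just as well.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Step 2: erase every 'h' or 'w' after the first character, in place.
void removeHW(std::string& s) {
    if (s.empty()) return;
    auto isHW = [](unsigned char c) {
        c = static_cast<unsigned char>(std::tolower(c));
        return c == 'h' || c == 'w';
    };
    s.erase(std::remove_if(s.begin() + 1, s.end(), isHW), s.end());
}

// Step 5: erase a, e, i, o, u, y after the first character, in place.
void removeVowels(std::string& s) {
    if (s.empty()) return;
    auto isVowel = [](unsigned char c) {
        const std::string vowels = "aeiouy";
        return vowels.find(static_cast<char>(std::tolower(c))) != std::string::npos;
    };
    s.erase(std::remove_if(s.begin() + 1, s.end(), isVowel), s.end());
}

// Steps 6 and 7: put back the saved first letter if a digit replaced it,
// then force the length to exactly four characters.
void finish(std::string& s, char firstLetter) {
    if (!s.empty() && std::isdigit(static_cast<unsigned char>(s[0]))) {
        s[0] = firstLetter;
    }
    s.resize(4, '0');   // pads with '0' when too short, truncates when too long
}
```

The single resize call covers both halves of step 7, since it pads a short string and cuts a long one.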

During development of my solution, I added some extra cout statements to print the result after each step in the algorithm:

Enter a name: Rholff
Step 2: Rolff    // Removed h/w
Step 3: 6o411    // Consonants to digits
Step 4: 6o41     // Remove duplicate digits
Step 5: 641      // Remove vowels
Step 6: R41      // Put back first letter
Soundex: R410    // Pad with zero
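
If you want similar trace output while debugging, a small helper along these lines might save some typing. The name traceStep and the column width of 9 are my own choices for this sketch, picked so the notes line up as in the trace above.

```cpp
#include <iomanip>
#include <iostream>
#include <string>

// Hypothetical debugging helper (not part of the assignment): writes one
// trace line such as "Step 2: Rolff    // Removed h/w" to the given stream.
// Note: std::left stays set on the stream after this call.
void traceStep(std::ostream& out, int step, const std::string& value,
               const std::string& note) {
    out << "Step " << step << ": "
        << std::left << std::setw(9) << value   // pad so the notes line up
        << "// " << note << '\n';
}
```

You would call it after each step, for example traceStep(std::cout, 2, name, "Removed h/w"), where name is whatever string variable you are modifying. Remember to remove the calls before submitting, since the sample runs print only the final Soundex line.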

Name your program p08soundex.cpp and submit it to this dropbox for project 8.