Class DaitchMokotoffSoundex
- All Implemented Interfaces:
- Encoder, StringEncoder
The Daitch-Mokotoff Soundex algorithm is a refinement of the Russell and American Soundex algorithms, yielding greater accuracy when matching surnames, especially Slavic and Yiddish ones, that are pronounced similarly but spelled differently.
The main differences compared to the other Soundex variants are:
- coded names are 6 digits long
- the initial character of the name is coded
- rules to encode multi-character n-grams
- multiple possible encodings for the same name (branching)
This implementation supports branching, depending on the method used:
- encode(String) - branching disabled, only the first code is returned
- soundex(String) - branching enabled, all codes are returned, separated by '|'
Note: This implementation has additional branching rules compared to the original description of the algorithm. The rules can be customized by overriding the default rules contained in the resource file org/apache/commons/codec/language/dmrules.txt.
This class is thread-safe.
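The output format described above (6-digit codes, joined by '|' when branching is enabled) can be checked with a small helper. This is a minimal sketch using only the JDK; the class `DmCodeFormat` and its method `looksLikeDmResult` are hypothetical names introduced for illustration and are not part of Commons Codec.

```java
import java.util.regex.Pattern;

public class DmCodeFormat {
    // Assumption: one or more 6-digit codes, separated by '|', as described above.
    private static final Pattern RESULT = Pattern.compile("\\d{6}(\\|\\d{6})*");

    // Hypothetical helper: true if the text has the shape of an encode/soundex result.
    static boolean looksLikeDmResult(String text) {
        return text != null && RESULT.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeDmResult("097400|097500")); // branched result
        System.out.println(looksLikeDmResult("097400"));        // single code
        System.out.println(looksLikeDmResult("0974"));          // too short
    }
}
```

Such a check only validates the shape of a result; it says nothing about whether the codes are the correct encoding of a given name.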
- Since:
- 1.10
Constructor Summary
- DaitchMokotoffSoundex() - Creates a new instance with ASCII-folding enabled.
- DaitchMokotoffSoundex(boolean folding) - Creates a new instance.

Method Summary
- Object encode(Object obj) - Encodes an Object using the Daitch-Mokotoff Soundex algorithm without branching.
- String encode(String source) - Encodes a String using the Daitch-Mokotoff Soundex algorithm without branching.
- String soundex(String source) - Encodes a String using the Daitch-Mokotoff Soundex algorithm with branching.
Constructor Details
DaitchMokotoffSoundex
public DaitchMokotoffSoundex()
Creates a new instance with ASCII-folding enabled.

DaitchMokotoffSoundex
public DaitchMokotoffSoundex(boolean folding)
Creates a new instance. With ASCII-folding enabled, certain accented characters will be transformed to equivalent ASCII characters, for example è -> e.
- Parameters:
- folding - whether ASCII-folding is performed before encoding
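To illustrate what ASCII-folding means in general, the sketch below strips diacritics with the JDK's `java.text.Normalizer`. It is an assumption-laden illustration only: the class `AsciiFoldingSketch` and its `fold` method are hypothetical, and this is not how the library performs folding. Characters that have no canonical decomposition are left unchanged by this approach.

```java
import java.text.Normalizer;

public class AsciiFoldingSketch {
    // Hypothetical helper, not part of Commons Codec: decomposes accented
    // characters (NFD) and then removes the resulting combining marks.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "\u00e8" is the character è, written as an escape to avoid
        // source-encoding issues.
        System.out.println(fold("\u00e8")); // prints "e"
    }
}
```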
 
 
Method Details
encode
public Object encode(Object obj) throws EncoderException
Encodes an Object using the Daitch-Mokotoff Soundex algorithm without branching. This method is provided in order to satisfy the requirements of the Encoder interface, and will throw an EncoderException if the supplied object is not of type String.
- Specified by:
- encode in interface Encoder
- Parameters:
- obj - Object to encode.
- Returns:
- An object (of type String) containing the DM Soundex code, which corresponds to the String supplied.
- Throws:
- EncoderException - if the parameter supplied is not of type String.
- IllegalArgumentException - if a character is not mapped.
encode
public String encode(String source)
Encodes a String using the Daitch-Mokotoff Soundex algorithm without branching.
- Specified by:
- encode in interface StringEncoder
- Parameters:
- source - A String object to encode.
- Returns:
- A DM Soundex code corresponding to the String supplied.
- Throws:
- IllegalArgumentException - if a character is not mapped.
soundex
public String soundex(String source)
Encodes a String using the Daitch-Mokotoff Soundex algorithm with branching. In case a string is encoded into multiple codes (see branching rules), the result will contain all codes, separated by '|'. Example: the name "AUERBACH" is encoded as both
- 097400
- 097500
Thus the result will be "097400|097500".
- Parameters:
- source - A String object to encode.
- Returns:
- A string containing a set of DM Soundex codes corresponding to the String supplied.
- Throws:
- IllegalArgumentException - if a character is not mapped.
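Because the branched result is a single '|'-separated string, callers usually need to split it before comparing names. The following is a minimal sketch using only the JDK. The class `BranchedCodes`, its methods, and the "match if any code is shared" policy are assumptions made for illustration, not part of Commons Codec. The input string is the "AUERBACH" result quoted above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class BranchedCodes {
    // Splits a '|'-separated result such as "097400|097500" into its codes,
    // preserving their order.
    static Set<String> split(String branched) {
        return new LinkedHashSet<>(Arrays.asList(branched.split("\\|")));
    }

    // Hypothetical matching policy: two results match if they share a code.
    static boolean sharesCode(String a, String b) {
        Set<String> common = split(a);
        common.retainAll(split(b));
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        String auerbach = "097400|097500"; // value quoted in the documentation
        System.out.println(split(auerbach));                // [097400, 097500]
        System.out.println(sharesCode(auerbach, "097500")); // true
    }
}
```

Treating any shared code as a match favors recall over precision; a stricter policy could require the full code sets to be equal.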
 
 