Note: This process is the same for both Single-Entity Matching and Across-Entity Matching.
This section details how to modify and standardise your data via predefined transforms. This data will then be utilised for matching purposes. Within this section you can modify custom transformations at a session level.
Test Input Phrase: This section lets you test your applied transforms. Each line will show you the corresponding transform output.
In this screenshot, only one transform has been selected which only provides one extra output.
Note: You must input your test phrase on the same line as the group folder.
Transforms
During the matching process, Data Transformations alter the way that DQ for Dynamics looks at your data; they do not change the actual data. This is very effective for transforming specific data elements purely for the purpose of matching.
For example, in the 'company name' field you may have a record of TrueData Ltd and another record with TrueData Plc. In this case the “business element” (Ltd or Plc) may be considered irrelevant so you only wish to match on the core of the word “TrueData”. You would use transformations to “exclude” the business elements in this case. See individual transformation categories for a more detailed explanation.
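The TrueData example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how DQ for Dynamics is implemented: the `BUSINESS_ELEMENTS` list and the `match_view` function are hypothetical names invented for this sketch. It shows that the stored records stay untouched while a separate, transformed value is used for comparison.

```python
# Hypothetical exclusion list; the real product ships its own categories.
BUSINESS_ELEMENTS = {"ltd", "limited", "plc"}


def match_view(value):
    """Return the value as the matcher would 'see' it, with business elements excluded."""
    kept = [
        word
        for word in value.split()
        if word.lower().strip(".") not in BUSINESS_ELEMENTS
    ]
    return " ".join(kept)


records = ["TrueData Ltd", "TrueData Plc"]

# The transformed values are built separately; `records` itself is not modified.
views = [match_view(record) for record in records]

print(views)    # ['TrueData', 'TrueData']
print(records)  # ['TrueData Ltd', 'TrueData Plc']
```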
NOTE: Users cannot apply transform rules on the multi-select option set.
Custom Exclude
This allows you to work with delimiters stored within the database. Simply insert the left and right delimiter and select your mode. The screen below allows you to configure transform parameters for a ‘Custom Exclude’:
Custom Exclude Modes (options):
- Remove between Delimiters
- Remove Delimiters Only
- Remove Delimiters and between
- Remove Outside the Delimiters
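As a rough illustration, the four modes could behave as in the Python sketch below. The mode names, the `custom_exclude` function and the exact behaviour of each mode are assumptions made for this example (in particular, 'Remove between Delimiters' is assumed to keep the delimiters themselves); check the results in the Test Input Phrase area for the product's actual behaviour.

```python
import re


def custom_exclude(text, left, right, mode):
    """Hypothetical sketch of the four Custom Exclude modes."""
    # Non-greedy match of everything between one left and one right delimiter.
    span = re.compile(re.escape(left) + "(.*?)" + re.escape(right))

    if mode == "remove_between":
        # Assumed: keep the delimiters, drop the text inside them.
        return span.sub(lambda m: left + right, text)
    if mode == "remove_delimiters_only":
        # Keep the text inside, drop the delimiters.
        return span.sub(lambda m: m.group(1), text)
    if mode == "remove_delimiters_and_between":
        # Drop the delimiters and everything inside them.
        return span.sub("", text)
    if mode == "remove_outside":
        # Keep only the text found inside the delimiters.
        return "".join(span.findall(text))
    raise ValueError(f"Unknown mode: {mode}")


phrase = "TrueData (UK) Ltd"
print(custom_exclude(phrase, "(", ")", "remove_delimiters_only"))  # TrueData UK Ltd
print(custom_exclude(phrase, "(", ")", "remove_outside"))          # UK
```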
Custom Transform
This will search for a customised word or phrase and replace it with a custom string. The screen below allows you to configure the transform parameters for ‘Custom Transform’:
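Conceptually this is a find-and-replace applied to the match value. A minimal Python sketch follows; the function name `custom_transform` is hypothetical, and the case-insensitive search is an assumption of this example rather than documented behaviour.

```python
import re


def custom_transform(text, find, replace_with):
    """Replace every occurrence of `find` with `replace_with` (case-insensitive here by assumption)."""
    pattern = re.compile(re.escape(find), flags=re.IGNORECASE)
    # A function is used as the replacement so `replace_with` is inserted literally.
    return pattern.sub(lambda m: replace_with, text)


print(custom_transform("TrueData Public Limited Company", "public limited company", "Plc"))
# TrueData Plc
```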
Custom Transform Library
This is a flexible feature which allows you to tailor a match to the unique nature of each user's data. The screen below allows you to configure transform parameters for ‘Custom Transform Library’. Choose from the 'Category' dropdown list of custom transformations already loaded into the App (your own custom transformations will be visible here once added).
NOTE: Custom Transforms can be configured by navigating to the ‘Custom Transform Library Configuration’ screens in 'Advanced Settings' option under the 'Settings' menu.
Extract Characters
This is used to eliminate one or more characters from the left or right of the data. The screen below allows you to configure transform parameters for ‘Extract Character(s)’:
Extract Word
This is used to eliminate a particular word from the data string. The screen below allows you to configure transform parameters for ‘Extract Word’:
Remove Characters
This is used to eliminate all Vowels, Consonants, Numbers, Punctuation and Other Characters from the data. The screen below allows you to configure transform parameters for ‘Remove Characters’:
Mode:
- Vowels: A E I O U
- Consonants: B C D F G H J K L M N P Q R S T V W X Y Z
- Numbers: 0 1 2 3 4 5 6 7 8 9
- Punctuation: , . : ; - '
- Other Characters: ` ! £ $ % ^ & * ( ) _ + = [ { ] } @ # ~ < > ? / | \
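The character classes listed above can be expressed as simple lookup sets. The sketch below is an illustration only: `CHARACTER_CLASSES` and `remove_characters` are hypothetical names, and treating lower-case letters the same as upper-case ones is an assumption of this example.

```python
# Character classes copied from the Mode list above.
CHARACTER_CLASSES = {
    "Vowels": set("AEIOU"),
    "Consonants": set("BCDFGHJKLMNPQRSTVWXYZ"),
    "Numbers": set("0123456789"),
    "Punctuation": set(",.:;-'"),
    "Other Characters": set("`!£$%^&*()_+=[{]}@#~<>?/|\\"),
}


def remove_characters(text, modes):
    """Drop every character that belongs to one of the selected modes."""
    drop = set().union(*(CHARACTER_CLASSES[mode] for mode in modes))
    # Letters are compared in upper case so 'a' is treated as a vowel too (assumption).
    return "".join(ch for ch in text if ch.upper() not in drop)


print(remove_characters("TrueData Ltd.", ["Vowels"]))  # TrDt Ltd.
```

Spaces are not in any of the listed classes, so they are left in place; a separate transform such as Trim String would be needed to tidy them.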
Split String
This transform allows you to split a string based on a custom delimiter. You have the option to return the result with or without the specified delimiter.
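A small Python sketch of the idea is shown here. The function name `split_string` and the `keep_delimiter` flag are hypothetical, and attaching the delimiter to the end of each preceding part is just one possible reading of "with the delimiter".

```python
def split_string(text, delimiter, keep_delimiter=False):
    """Split `text` on `delimiter`, optionally keeping the delimiter on each part."""
    parts = text.split(delimiter)
    if keep_delimiter:
        # Re-attach the delimiter to every part except the last one.
        parts = [part + delimiter for part in parts[:-1]] + parts[-1:]
    return parts


print(split_string("info@example.com", "@"))                       # ['info', 'example.com']
print(split_string("info@example.com", "@", keep_delimiter=True))  # ['info@', 'example.com']
```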
Transform Words
This is used to Normalise, Exclude, Elaborate or Abbreviate standard elements from your data. The screen below allows you to configure transform parameters for ‘Transform Words’:
Abbreviate:
Category | Example |
---|---|
Addressing | 'Road' to 'Rd', 'Avenue' to 'Ave' |
Business | 'Limited' to 'Ltd', 'Company' to 'Co' |
Countries | 'United Kingdom' to 'UK' |
DateEvents | 'January' to 'Jan' |
JobTitles | 'Manager' to 'Mgr', 'Colonel' to 'Col' |
Numbers | 'Twenty' to '20', 'Nine' to '9' |
Qualifications | 'Bachelor of Science' to 'BSc' |
Salutations | 'Doctor' to 'Dr', 'Mister' to 'Mr' |
WeightsMeasures | 'Ounces' to 'Oz' |
Miscellaneous | 'Object' to 'Obj' |
Forenames | 'Robert' to 'Bob', 'Antony' to 'Tony' |
Elaborate:
Category | Example |
---|---|
Addressing | 'Rd' to 'Road', 'Ave' to 'Avenue' |
Business | 'Ltd' to 'Limited', 'Co' to 'Company' |
Countries | 'UK' to 'United Kingdom' |
DateEvents | 'Jan' to 'January' |
JobTitles | 'Mgr' to 'Manager', 'Col' to 'Colonel' |
Numbers | '20' to 'Twenty', '9' to 'Nine' |
Qualifications | 'BSc' to 'Bachelor of Science' |
Salutations | 'Dr' to 'Doctor', 'Mr' to 'Mister' |
WeightsMeasures | 'Oz' to 'Ounces' |
Miscellaneous | 'Obj' to 'Object' |
Forenames | 'Bob' to 'Robert', 'Tony' to 'Antony' |
Exclude:
Category | Example |
---|---|
Addressing | Exclude text such as 'Road' and 'Rd' |
Business | Exclude text such as 'Ltd' and 'Limited' |
Countries | Exclude text such as 'UK' and 'USA' |
DateEvents | Exclude text such as 'Mon' and 'January' |
JobTitles | Exclude text such as 'Mgr' and 'Manager' |
Numbers | Exclude text such as '100' and 'Hundred' |
Qualifications | Exclude text such as 'BA' and 'BSc' |
Salutations | Exclude text such as 'Mr' and 'Dr' |
WeightsMeasures | Exclude text such as 'Oz' and 'Ounces' |
Miscellaneous | Exclude text such as 'Obj' and 'Object' |
Forenames | Exclude text such as 'Andi' and 'Robert' |
Normalise:
Category | Example |
---|---|
Addressing | 'Garden' to 'Gardens', 'Gdns' to 'GND' |
Business | 'Company', 'Comp' to 'CO' |
Countries | 'United Kingdom', 'Great Britain', 'GBR' to 'GB' |
DateEvents | 'January' to 'Jan', 'Monday' to 'Mon' |
JobTitles | 'Engineer', 'Engr' to 'ENG' |
Numbers | 'Nought', 'Null', 'Nil' to '0' |
Qualifications | 'Dr of Philosophy', 'DPhil' to 'PhD' |
Salutations | 'Mrs', 'Ms', 'Madam' to 'MRS' |
WeightsMeasures | 'Inches', 'Inch', 'Ins' to 'IN' |
Miscellaneous | 'Cheque', 'Check' to 'Chq' |
Forenames | 'Andrew', 'Andrea', 'Andres' to 'Andi' |
For a detailed overview of our data transformations, please see our Transform Guide.
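To make the Normalise mode concrete, the sketch below maps several variants onto one standard form using a lookup table. It is a simplified illustration: the `NORMALISE` dictionary holds only a few entries taken from the Normalise table above, the function name `normalise_words` is hypothetical, and multi-word phrases such as 'United Kingdom' are not handled.

```python
# A few entries from the Normalise table; the product's real lists are much larger.
NORMALISE = {
    "Business": {"company": "CO", "comp": "CO"},
    "Salutations": {"mrs": "MRS", "ms": "MRS", "madam": "MRS"},
    "Numbers": {"nought": "0", "null": "0", "nil": "0"},
}


def normalise_words(text, category):
    """Replace each known variant in `text` with the standard form for `category`."""
    lookup = NORMALISE[category]
    # Words are looked up in lower case; unknown words pass through unchanged.
    return " ".join(lookup.get(word.lower(), word) for word in text.split())


print(normalise_words("TrueData Company", "Business"))  # TrueData CO
print(normalise_words("Madam Smith", "Salutations"))    # MRS Smith
```

Abbreviate and Elaborate can be pictured the same way with one-to-one lookups, and Exclude as a lookup that maps each variant to nothing.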
Trim String
This will eliminate any leading and trailing spaces from a string. There is no screen to configure the transform parameters for ‘Trim String’. You can directly drag and drop the ‘Trim String’ rule for any group and it can be viewed as shown below:
Match Key
Once the data has been transformed for the purpose of matching, it can then be tokenised using a fuzzy matching algorithm. This can be applied by selecting the ‘Match Key’ drop-down. The 'Match Key' is used to select an algorithm for phonetic match token generation.
The 'Match Key' drop-down will have six choices:
Soundex
Soundex retains the first letter of the input string to formulate its match token. Soundex removes vowels (a, e, i, o, u) and h and w from the input string. The remaining letters are assigned numbers using a lookup table to produce a token of 4 characters.
This means ‘Cathy’ and ‘Kathy’ will not match, as their match tokens begin with ‘C’ and ‘K’ respectively. As such, Soundex does not match well where the starts of two words sound the same but are spelled differently. Also, because of the numeric substitution, Soundex can return false positives: records that share a token but are not true matches.
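The behaviour described above can be reproduced with the widely published American Soundex rules. The sketch below is a generic implementation for illustration and is not taken from DQ for Dynamics; the product's own Soundex may differ in edge cases.

```python
# Standard Soundex lookup table: letter -> digit.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the 4-character Soundex token for `word` (empty string if it has no letters)."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    token = letters[0]                    # the first letter is always retained
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                      # H and W are dropped and do not split a run
        code = CODES.get(ch, "")          # vowels (and Y) have no code
        if code and code != previous:
            token += code
        previous = code                   # a vowel resets the run of repeated codes
    return (token + "000")[:4]            # pad with zeros, truncate to 4 characters


print(soundex("Cathy"), soundex("Kathy"))    # C300 K300
print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

'Cathy' and 'Kathy' receive different tokens because the first letter is kept as-is, while 'Robert' and 'Rupert' share a token even though they are different names, which is the kind of false positive mentioned above.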
DQSoundex
DQSoundex overloads Soundex with the advanced capabilities of DQFonetix™. This improves the start of word logic and modifies the first letter(s) of an input string. DQSoundex will de-pluralise and pre-process the start of words to manage variances like ‘C’ to ‘K’ as in 'Cathy' and 'Kathy', as well as ‘Ph’ as in Phonetix to ‘F’ in Fonetix.
Metaphone
Metaphone improves the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation, to produce a more accurate encoding.
This allows you to find more precise matches than the simple Soundex algorithm. Metaphone considers a larger set of character transformations than Soundex and therefore analyses a string phonetically with far more accuracy.
DQMetaphone
DQMetaphone, like DQSoundex, is an enhanced Metaphone technology with the advanced capabilities of DQFonetix™. This improves the start of word logic and modifies the first letter(s) of an input string. DQMetaphone will de-pluralise and pre-process the start of words to manage variances and improve matching.
In the case shown below (Christopher), Metaphone would have generated three of the five name matches. However, after running DQ’s advanced algorithms and logic, DQMetaphone allows the ‘Kh’ from 'Khristopher' to match with the ‘Ch’ from 'Christopher', thus generating the same match key token.
DQFonetix™
DQFonetix™ contains our advanced phonetic algorithms, developed by DQ Global over the last 25 years. The algorithm is the property of DQ Global, so we do not share the specification of the process. However, DQFonetix™ has four key features:
- Five spoken languages – English, Spanish, French, Italian and German
- Avoids false matches
- Overcomes character variances
- Deals with diacritics
DQFonetix™ provides your CRM system with the most varied matching window, allowing you to highlight duplicate matches that may be missed, or falsely matched, by Soundex and Metaphone.
No Match Key
Selecting ‘No Match Key’ means that no phonetic match token is generated. However, this option still allows you to match identical strings.