Match Group Transforms

Modified on Thu, 4 Jun at 8:25 PM



This section allows you to modify and standardise your data using predefined transforms.

The transformed data is then used for matching purposes.

After clicking on ‘Match Group Transforms’ under 'Settings', you will navigate to the screen below:

Rules: On the left-hand side, a predefined list of 'Match Rules' appears in alphabetical order (ascending by default). You can drag and drop one or more transforms onto any Match Group to configure them. After you drop a transform, a pop-up appears where you configure the parameters used by that transform:

Enter or select the parameters, then click the ‘Add’ button to add the transform to the selected Match Group.


Match Group Rules

Match Groups: By default, groups appear in ascending alphabetical order. Double-click a Match Group to rename it. A group must exist before any transforms can be configured against it. To change the order of transforms, drag and drop them up or down; each must be dropped on top of another transform or a group.

Parameters: Displays the parameter description for the selected transform. Double-click it to open a pop-up where you can edit the transform parameters.

Test Input Phrase: Here, you can test the effect of a transform. Enter the 'Test Input Phrase' at Match Group level. The results are returned in the ‘Test Result’ column as each individual transform is applied. The ‘Test Input Phrase’ and ‘Test Result’ are not saved in the database.

Test Result: For the submitted phrase, the final result appears at group level and a sequential, step-by-step result appears at transform level, as shown below:

Create Group: New groups can only be created in the Match Groups section. Without a Match Group, there is no group to apply data transformations to. Please see Match Groups for more information on creating a new Match Group.
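
The step-by-step behaviour of the ‘Test Result’ column described above can be pictured with a short Python sketch. This is an illustration only, not the application's code: the function name `run_test_phrase` and the three sample transforms are hypothetical stand-ins for whatever transforms you have configured against a group.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a configured transform: (display name, function).
Transform = Tuple[str, Callable[[str], str]]


def run_test_phrase(phrase: str, transforms: List[Transform]) -> Tuple[str, List[Tuple[str, str]]]:
    """Apply each transform in order and record the result after every step."""
    steps = []
    current = phrase
    for name, func in transforms:
        current = func(current)
        steps.append((name, current))
    # 'current' is the group-level result; 'steps' holds the transform-level results.
    return current, steps


# Illustrative group configuration (assumed, not taken from the product).
group_transforms: List[Transform] = [
    ("Trim String", str.strip),
    ("Upper Case", str.upper),
    ("Remove Spaces", lambda s: s.replace(" ", "")),
]

final, steps = run_test_phrase("  dq global ", group_transforms)
for name, result in steps:
    print(f"{name}: {result!r}")
print(f"Group result: {final!r}")
```

Because each transform receives the output of the previous one, changing the order of transforms within a group can change the final result.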


Transforms

During the matching process, data transformations alter the way that DQ for Dynamics/Workbooks looks at and searches within your data. It is important to note that the actual data does not change; the transformations only give the system a broader basis on which to match duplicates.

This is very effective for transforming specific data elements purely for the purpose of matching. For example, in the 'Company Name' field, you may have one record entered as TrueData Ltd and another as TrueData Plc. Here the “business element” (Ltd or Plc) may be considered irrelevant, so you only wish to match records on the core name “TrueData”. You would use transformations to “exclude” these elements. See the individual transformation categories for a more detailed explanation.
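
As a rough illustration of the TrueData example, the Python sketch below builds a transformed copy of each company name for comparison and leaves the stored value untouched. The suffix list and the function name `match_value` are assumptions made for this example and do not reflect how the application is implemented.

```python
import re

# Hypothetical list of business elements to ignore when matching.
BUSINESS_SUFFIXES = ("Ltd", "Plc")


def match_value(company_name: str) -> str:
    """Return a transformed copy used only for comparison; the input is not modified."""
    pattern = r"\b(?:" + "|".join(map(re.escape, BUSINESS_SUFFIXES)) + r")\b\.?"
    return re.sub(pattern, "", company_name, flags=re.IGNORECASE).strip().lower()


records = ["TrueData Ltd", "TrueData Plc"]
keys = [match_value(r) for r in records]

print(records)             # the stored values are unchanged
print(keys[0] == keys[1])  # both records share the same match value
```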

NOTE: Users cannot apply transform rules on the multi-select option set.

Alpha Sequence Transform

This allows you to extract, reorder, standardise, or analyse the alphabetical portions of a field while ignoring numbers, symbols, or formatting characters.


Examples:

Transform: | Before: | After:
Ascending Characters | CAB321 | 123ABC
Ascending Words | Data Quality Global | Data Global Quality
Descending Characters | ABC123 | CBA321
Descending Words | Data Quality Global | Quality Global Data
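
The four examples above can be reproduced with a small Python sketch. It assumes plain character-code ordering (digits sort before letters) and whitespace-separated words; the function name `alpha_sequence` and the mode strings are hypothetical, and the application's own ordering rules may differ.

```python
def alpha_sequence(text: str, mode: str) -> str:
    """Sort the characters or the words of a value in ascending or descending order."""
    reverse = mode.startswith("descending")
    if mode.endswith("characters"):
        return "".join(sorted(text, reverse=reverse))
    if mode.endswith("words"):
        return " ".join(sorted(text.split(), reverse=reverse))
    raise ValueError(f"Unknown mode: {mode}")


print(alpha_sequence("CAB321", "ascending_characters"))
print(alpha_sequence("Data Quality Global", "ascending_words"))
print(alpha_sequence("ABC123", "descending_characters"))
print(alpha_sequence("Data Quality Global", "descending_words"))
```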


Casing

This allows you to convert the casing of your data as required: Lower Case, Title Case or Upper Case.



Examples:


Transform: | Before: | After:
Lower Case | DQ GLOBAL | dq global
Title Case | iPhone | Iphone
Upper Case | po13 9fu | PO13 9FU
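
For readers who want to see the three casing options side by side, here is a minimal Python sketch using the built-in string methods. The function name `apply_casing` and the mode strings are hypothetical; Python's own title-casing rules are assumed to be close to, but not necessarily identical to, those of the application.

```python
def apply_casing(text: str, mode: str) -> str:
    """Convert a value to lower, title or upper case."""
    if mode == "lower":
        return text.lower()
    if mode == "title":
        return text.title()
    if mode == "upper":
        return text.upper()
    raise ValueError(f"Unknown mode: {mode}")


print(apply_casing("DQ GLOBAL", "lower"))
print(apply_casing("iPhone", "title"))
print(apply_casing("po13 9fu", "upper"))
```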


Custom Exclude

This allows you to work with delimited text stored within the database. Enter the left and right delimiters and select your mode. The screen below allows you to configure transform parameters for a ‘Custom Exclude’, showing the Modes available:

Examples: 

Transform: | Before: | After:
Remove Business Delimiters | DQ Global [DO NOT USE] | DQ Global []
Remove Delimiters Only | DQ Global (OLD) | DQ Global OLD
Remove Delimiters and Between | DQ Global [DO NOT USE] | DQ Global
Remove Outside the Delimiters | DQ GLOBAL <1234> | 1234
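
The four behaviours shown in the examples can be sketched in Python as follows. This is an approximation under stated assumptions: only the first pair of delimiters is handled, surrounding spaces are trimmed from the result, and the mode keys (`between`, `delimiters_only`, `delimiters_and_between`, `outside`) are hypothetical labels for this example, not the names used in the application.

```python
def custom_exclude(text: str, left: str, right: str, mode: str) -> str:
    """Remove text relative to the first left/right delimiter pair found."""
    start = text.find(left)
    end = text.find(right, start + len(left)) if start != -1 else -1
    if start == -1 or end == -1:
        return text  # no complete delimiter pair, so leave the value as it is

    before = text[:start]
    inside = text[start + len(left):end]
    after = text[end + len(right):]

    if mode == "between":                  # keep the delimiters, drop their content
        result = before + left + right + after
    elif mode == "delimiters_only":        # drop the delimiters, keep their content
        result = before + inside + after
    elif mode == "delimiters_and_between":  # drop the delimiters and their content
        result = before + after
    elif mode == "outside":                # keep only the content between the delimiters
        result = inside
    else:
        raise ValueError(f"Unknown mode: {mode}")
    return result.strip()  # assumption: surrounding spaces are trimmed


print(custom_exclude("DQ Global [DO NOT USE]", "[", "]", "between"))
print(custom_exclude("DQ Global (OLD)", "(", ")", "delimiters_only"))
print(custom_exclude("DQ Global [DO NOT USE]", "[", "]", "delimiters_and_between"))
print(custom_exclude("DQ GLOBAL <1234>", "<", ">", "outside"))
```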


Custom Transform

This will search for a customised word or phrase and replace it with a custom string. The screen below allows you to configure the transform parameters for ‘Custom Transform’:

Examples: 

Transform: | Before: | After:
Look For: Ltd, Change To: Limited | DQ Ltd | DQ Limited
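
A find-and-replace of this kind can be sketched in Python as below. Whether the application matches whole words only or any occurrence of the text is not stated here, so the whole-word behaviour in this sketch is an assumption, and the function name `custom_transform` is hypothetical.

```python
import re


def custom_transform(text: str, look_for: str, change_to: str) -> str:
    """Replace whole-word occurrences of 'look_for' with 'change_to'."""
    # Assumption: only replace when the phrase is not part of a longer word.
    pattern = r"(?<!\w)" + re.escape(look_for) + r"(?!\w)"
    return re.sub(pattern, lambda _match: change_to, text)


print(custom_transform("DQ Ltd", "Ltd", "Limited"))
```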


Custom Transform Library

This is a flexible feature which allows you to tailor a match to the unique nature of each User's data. You can configure transform parameters for 'Custom Transform Library' by choosing from the 'Category' dropdown list, which contains the custom transformations already loaded into the App as well as any custom transformations you have added yourself.

You can add your own custom transformations as Categories in the Advanced Settings.

Examples:

Extract Characters

This is used to extract a set number of characters from the left or right end of the data, based on the position, length, pattern or character type. The screen below allows you to configure transform parameters for ‘Extract Characters’:

Examples:

Select Value: | No. of Letters: | Before: | After:
Left | 2 | UK-000123 | UK
Right | 4 | ACC-1234 | 1234
Left | 5 | N.A - Retired | N.A -
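
In Python terms, the examples above correspond to simple string slicing. The sketch below is illustrative only; the function name `extract_characters` and its parameters are hypothetical.

```python
def extract_characters(text: str, side: str, count: int) -> str:
    """Keep 'count' characters from the left or right end of a value."""
    if count <= 0:
        return ""
    if side == "left":
        return text[:count]
    if side == "right":
        return text[-count:]
    raise ValueError(f"Unknown side: {side}")


print(extract_characters("UK-000123", "left", 2))
print(extract_characters("ACC-1234", "right", 4))
print(extract_characters("N.A - Retired", "left", 5))
```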


Extract Words

This is used to eliminate whole word(s) from the left or right end of the data, based on position. Choose the number of words and the side of the string (left or right); as the examples below show, those words are removed and the remainder is kept. The screen below allows you to configure transform parameters for ‘Extract Words’:

Examples:

Select Value: | No. of Words: | Before: | After:
Left | 2 | Fareham Innovation Centre | Centre
Right | 1 | Fareham Innovation Centre | Fareham Innovation
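
The behaviour in the examples, where the chosen words are dropped and the rest is returned, can be sketched in Python as follows. Words are assumed to be separated by whitespace, and the function name `extract_words` is hypothetical.

```python
def extract_words(text: str, side: str, count: int) -> str:
    """Drop 'count' whole words from the left or right end and return the remainder."""
    words = text.split()
    if side == "left":
        kept = words[count:]
    elif side == "right":
        kept = words[:max(len(words) - count, 0)]
    else:
        raise ValueError(f"Unknown side: {side}")
    return " ".join(kept)


print(extract_words("Fareham Innovation Centre", "left", 2))
print(extract_words("Fareham Innovation Centre", "right", 1))
```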


Remove Characters

This is used to remove all characters of a selected type (Vowels, Consonants, Numbers, Punctuation or Other Characters) from the data. The screen below allows you to configure transform parameters for ‘Remove Characters’:

Mode:

  1. Vowels: A E I O U
  2. Consonants: B C D F G H J K L M N P Q R S T V W X Y Z
  3. Numbers: 0 1 2 3 4 5 6 7 8 9
  4. Punctuation: , . : ; - '
  5. Other Characters: ` ! £ $ % ^ & * ( ) _ + = [ { ] } @ # ~ < > ? / | \

Examples:

Mode: | Before: | After:
Vowels | DQ GLOBAL | DQ GLBL
Consonants | DQ GLOBAL | OA
Numbers | DQ GLOBAL123 | DQ GLOBAL
Punctuation | DQ-GLOBAL. | DQ GLOBAL
Space | DQ GLOBAL | DQGLOBAL
Other Characters | A£B$C% | ABC
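
The modes can be pictured as character sets that are filtered out of the value. The Python sketch below uses the character lists given under 'Mode' plus a single space for the Space example; the dictionary keys and the function name `remove_characters` are hypothetical. Note that the sketch deletes matched characters outright, so it may not reproduce every row exactly (the Punctuation example above shows a space in its result, and removing consonants here leaves any existing spaces in place).

```python
# Character sets taken from the Mode list above; keys are illustrative labels.
CHARACTER_SETS = {
    "vowels": set("AEIOU"),
    "consonants": set("BCDFGHJKLMNPQRSTVWXYZ"),
    "numbers": set("0123456789"),
    "punctuation": set(",.:;-'"),
    "space": {" "},
    "other": set("`!£$%^&*()_+=[{]}@#~<>?/|\\"),
}


def remove_characters(text: str, mode: str) -> str:
    """Delete every character that belongs to the selected set (case-insensitive)."""
    unwanted = CHARACTER_SETS[mode]
    return "".join(ch for ch in text if ch.upper() not in unwanted)


print(remove_characters("DQ GLOBAL", "vowels"))
print(remove_characters("DQ GLOBAL123", "numbers"))
print(remove_characters("DQ GLOBAL", "space"))
print(remove_characters("A£B$C%", "other"))
```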


Remove Diacritics

This is used to replace accented or non-standard characters with their plain Latin equivalents within the data. 

Before: | After:
é | e
ö | o
ñ | n
ç | c
ø | o
á | a


Examples:


Use Case: | Before: | After:
Contact Matching | José González | Jose Gonzalez
Address Standardisation | São Paulo | Sao Paulo
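
One common way to achieve this kind of replacement in Python is Unicode decomposition followed by removal of the combining marks. The sketch below takes that approach as an assumption; it is not a description of how the application does it. Characters such as ø have no Unicode decomposition, so they need an explicit mapping. The function name `remove_diacritics` and the `EXTRA_REPLACEMENTS` table are hypothetical.

```python
import unicodedata

# Letters that do not decompose into "base letter + accent" need an explicit mapping.
EXTRA_REPLACEMENTS = str.maketrans({"ø": "o", "Ø": "O"})


def remove_diacritics(text: str) -> str:
    """Replace accented characters with their plain Latin equivalents."""
    decomposed = unicodedata.normalize("NFKD", text.translate(EXTRA_REPLACEMENTS))
    # Drop the combining accent marks left behind by the decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(remove_diacritics("José González"))
print(remove_diacritics("São Paulo"))
print(remove_diacritics("éöñçøá"))
```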


Split String

This transform allows you to divide a single text value into multiple parts based on a custom delimiter, position or separator. You also have the option to return the result with or without the specified delimiter, on the left or the right.


Examples:

Desired Outcome: | Delimiter: | Before Split: | String 1: | String 2:
Name Splitting | Space | DQ Global | DQ | Global
Domain Extraction | @ | sales@dqglobal.com | sales | dqglobal.com
Product Grouping | - | Customer No-1.1 | Customer No | 1.1
Geography | , | London, GB | London | GB
Phone Extension | x | 02392 988303 x1234 | 02392 988303 | 1234
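
The examples above can be approximated in Python by splitting at the first occurrence of the delimiter. The sketch assumes the delimiter itself is discarded and that spaces around each part are trimmed (as the Geography and Phone Extension rows suggest); the function name `split_string` is hypothetical.

```python
from typing import Tuple


def split_string(text: str, delimiter: str) -> Tuple[str, str]:
    """Split a value into two parts at the first occurrence of the delimiter."""
    left, _found, right = text.partition(delimiter)
    # Assumption: the delimiter is dropped and surrounding spaces are trimmed.
    return left.strip(), right.strip()


print(split_string("DQ Global", " "))
print(split_string("sales@dqglobal.com", "@"))
print(split_string("Customer No-1.1", "-"))
print(split_string("London, GB", ","))
print(split_string("02392 988303 x1234", "x"))
```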


Transform Words

This is used to Normalise, Exclude, Elaborate or Abbreviate standard elements from your data. The screen below allows you to configure transform parameters for ‘Transform Words’:


For a detailed overview of our data transformations, please see our Transform Guide.

Trim String

This will eliminate any leading and trailing spaces from a string. There is no screen to configure transform parameters for ‘Trim String’. You can drag and drop the ‘Trim String’ rule directly onto any group.

Examples:

Before: | After:
 DQ Global | DQ Global
sales@dqglobal.com . | sales@dqglobal.com.



Match Key

Once the data has been transformed (purely for the purpose of matching), it can be tokenised by a fuzzy matching algorithm. To apply one, select it from the ‘Match Key’ drop-down, which lists the out-of-the-box algorithms for phonetic match token generation. All have their uses; however, we recommend using our DQFonetix algorithm to give the best results.


The 'Match Key' drop-down will have six choices:

Soundex

Soundex retains the first letter of the input string to formulate its match token. Soundex removes vowels (a, e, i, o, u) and h and w from the input string. The remaining letters are assigned numbers using a lookup table to produce a token of 4 characters.

This means ‘Cathy’ and ‘Kathy’ will not match as their match tokens begin with a ‘C’ from Cathy and a ‘K’ from Kathy. As such, Soundex does not match well where the start of a word sounds the same but is not the same. Also, due to the numeric substitution, it is possible to show non-matches or false positive matches.
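
To make the ‘Cathy’ and ‘Kathy’ example concrete, here is a Python sketch of the commonly documented (American) Soundex variant: keep the first letter, skip vowels, treat H and W as transparent, collapse repeated codes, and pad the token to four characters. The application's own Soundex implementation is not described in this guide and may differ in detail; the function name `soundex` and the lookup table layout are choices made for this example.

```python
# Lookup table: consonant groups and the digit each group is assigned.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex token (standard American variant)."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    token = letters[0]                    # the first letter is always retained
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                      # H and W do not break a run of equal codes
        code = CODES.get(ch, "")          # vowels (and Y) have no code
        if code and code != previous:
            token += code
        previous = code
    return (token + "000")[:4]            # pad or truncate to four characters


print(soundex("Cathy"), soundex("Kathy"))    # tokens differ only in the first letter
print(soundex("Robert"), soundex("Rupert"))  # different spellings, same token
```

Because the first letter is carried into the token unchanged, names that sound alike but start with different letters produce different tokens, which is the limitation described above.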

DQSoundex

DQSoundex extends Soundex with the advanced capabilities of DQFonetix™. This improves the start-of-word logic and modifies the first letter(s) of an input string. DQSoundex will de-pluralise and pre-process the start of words to manage variances like ‘C’ to ‘K’ as in 'Cathy' and 'Kathy', as well as ‘Ph’ as in Phonetix to ‘F’ in Fonetix.

Metaphone

Metaphone improves the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation, to produce a more accurate encoding.

This allows you to find more precise matches than the simple Soundex algorithm. Metaphone considers a larger set of character transformations than Soundex and therefore analyses a string phonetically with far more accuracy.

DQMetaphone

DQMetaphone, like DQSoundex, is an enhanced Metaphone technology with the advanced capabilities of DQFonetix™. This improves the start-of-word logic and modifies the first letter(s) of an input string. DQMetaphone will de-pluralise and pre-process the start of words to manage variances and improve matching.

In the case shown below (Christopher), Metaphone would have matched three of the five names. However, after running DQ’s advanced algorithms and logic, DQMetaphone allows the ‘Kh’ from 'Khristopher' to match the ‘Ch’ from 'Christopher', thus generating the same match key token.

DQFonetix™

DQFonetix™ contains our advanced phonetic algorithms, developed over the last 25 years by DQ Global. The algorithm is the property of DQ Global, hence we do not share the specification of the process. However, DQFonetix™ has four key features:

  • Five spoken languages – English, Spanish, French, Italian and German
  • Avoids false matches
  • Overcomes character variances
  • Deals with diacritics

DQFonetix™ provides your CRM system with the most varied matching window, highlighting duplicate matches that may not be picked up, or that may be falsely matched, by Soundex and Metaphone.

No Match Key

Selecting 'No Match Key' means that no phonetic match token is generated. However, this allows you to match identical strings.


Miscellaneous Functions

These functions will allow you to navigate the DQ for Dynamics/Workbooks application. They are as follows:

  1. Expand All - Expand all groups to see attributes, transforms, display attributes etc.
  2. Add Group - Add a Match Group, Transform Group or Display Group.
  3. Delete - Delete a rule set or multiple rule sets.
  4. Refresh - Refresh the rules displayed on the screen.
