Matching Methodology and Strategies

Two APIs are available for comparing business data to the Markaaz Global Business Directory: Advanced Match and Search.

Advanced Match can return additional metadata and multiple "matches", depending on how many entities are requested. The "best match", the one with the highest match confidence score, is always returned as the first entity in the response. Additional information on the Advanced Match API can be found in the section on integration and in the API Reference.

The Search API lets your organization search the Markaaz Global Business Directory with minimal or complete input data, such as name and country, to find businesses that meet your search criteria. Search also offers controls over how many businesses are returned and which matching criteria are used. Additional information on the Search API can be found in the section on integration and in the API Reference.

Matching Methodology

Standardization of Data (Structure, Business Name, Address and other identifying data):

Company Name Formatting and Standardization: Global and regional views of company forms, address formatting, and parsing of data into the proper fields to maximize the effectiveness of the Matching Engine.
Leveraging Multiple Addresses of Companies: Registered Address, Physical Address, Mailing Address and Former Locations.
Evaluation of Match Candidates to Identify Best Match: Each identifying company attribute is scored against the submitted input record, which narrows down the match candidate pool. The best match candidate is then identified by assessing the grades of each attribute used for matching.

Matching Strategies

Many different business attributes are used to identify the best candidates based on the input record submitted.

Identifying Attributes: Name (including Legal Name, AKA/DBA and Former Names), Address (including Registered, Physical, Mailing and Former Addresses)
Location Attributes: Phone Number, URL
National IDs and Other IDs: National ID (e.g., VAT), TaxID (FEIN, EIN, TaxID), LEI

Markaaz Data Matching Algorithm

Clean and normalize input data to improve the accuracy and speed of the algorithm:
- In this step, the following operations are applied to both the original string and the string it is compared against. Some examples include:
  - Lowercase all the letters.
  - Remove any special characters.
  - Remove any additional white space.
  - Expand abbreviations to their original words.
  - Replace accented characters with their base forms.
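The cleaning operations listed above can be sketched roughly as follows. This is an illustration only, not the Markaaz implementation: the function name `normalize`, the `ABBREVIATIONS` table, and the choice to replace special characters with a space (so that hyphenated words stay separate) are all assumptions.

```python
import re
import unicodedata

# Hypothetical abbreviation table; the actual mapping is not given in this document.
ABBREVIATIONS = {"mfg": "manufacturing", "co": "company", "intl": "international"}


def normalize(text: str) -> str:
    """Apply the cleaning steps to a single string (illustrative sketch)."""
    # Replace accented characters with their base forms: NFKD splits a letter
    # from its combining marks, which are then dropped. Letters that have no
    # decomposition (for example "ø") are not covered by this simple approach.
    decomposed = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase all the letters.
    text = text.lower()
    # Remove special characters (assumption: swap them for a space).
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Expand abbreviations; split() and join() also drop additional white space.
    tokens = [ABBREVIATIONS.get(token, token) for token in text.split()]
    return " ".join(tokens)


print(normalize("  ABC Mfg. Co.  "))
print(normalize("Café   Société"))
```

Both strings being compared would pass through the same function, so that differences in case, punctuation, and spacing do not affect the later steps.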
Check common tokens:
- This step sets the flag hasCommonTokens to true if both strings share at least one full-token match.
  - For example, when comparing two strings (string1 and string2):
    - string1 = ABC Manufacturing Company and string2 = ABC Company set the flag to true.
    - The tokens for string1 are: ["ABC", "Manufacturing", "Company"]
    - The tokens for string2 are: ["ABC", "Company"]
    - string1 = ABC Manufacturing Company and string2 = XYZ Ltd set the flag to false, as there are no matching tokens.
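A minimal sketch of the common-token check, using the two examples above. The function name `has_common_tokens` and the whitespace-based tokenization are assumptions made for illustration; the document only specifies the flag name hasCommonTokens.

```python
def has_common_tokens(string1: str, string2: str) -> bool:
    """Return True if the two strings share at least one whole token."""
    # Assumption: tokens are whitespace-separated words, compared case-insensitively.
    tokens1 = set(string1.lower().split())
    tokens2 = set(string2.lower().split())
    # The flag is true when the two token sets overlap.
    return not tokens1.isdisjoint(tokens2)


print(has_common_tokens("ABC Manufacturing Company", "ABC Company"))  # True
print(has_common_tokens("ABC Manufacturing Company", "XYZ Ltd"))      # False
```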
Calculate Match Grade:
- The scoring algorithm is configurable and can be updated if needed.
- The Match Grade is calculated with the Sørensen-Dice coefficient string similarity algorithm.
- The threshold score for each grade is configurable and can be updated as needed.
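To illustrate how a Sørensen-Dice score can be mapped to a grade, here is a sketch that computes the coefficient over character bigrams, 2 × (shared bigrams) / (total bigrams in both strings). The use of character bigrams, the grade labels, and the threshold values are hypothetical; the document does not state which values Markaaz uses.

```python
from collections import Counter

# Hypothetical grade thresholds, highest first; the real values are configurable
# and are not given in this document.
DEFAULT_THRESHOLDS = [("A", 0.9), ("B", 0.7), ("C", 0.5)]


def bigrams(text: str) -> Counter:
    """Count the overlapping two-character substrings of a string."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def dice_coefficient(a: str, b: str) -> float:
    """Sørensen-Dice similarity over character bigrams, between 0.0 and 1.0."""
    if a == b:
        return 1.0
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Both strings are too short to have bigrams and are not identical.
        return 0.0
    overlap = sum((x & y).values())  # multiset intersection of the bigrams
    return 2.0 * overlap / total


def match_grade(score: float, thresholds=DEFAULT_THRESHOLDS, fallback: str = "D") -> str:
    """Map a similarity score to the first grade whose minimum it meets."""
    for grade, minimum in thresholds:
        if score >= minimum:
            return grade
    return fallback


score = dice_coefficient("night", "nacht")
print(score, match_grade(score))  # 0.25 D
```

Keeping the thresholds in a plain list passed as a parameter is one simple way to make them adjustable without touching the scoring function.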

Match Score/Grade Output

The output includes a Match Grade at the candidate/entity level.
Each input field is graded against the fields in the response payload. The Advanced Match API returns those field grades if the option is enabled.
Entities that match more parameters will have a higher match confidence and a better chance of being discovered in the directory.