Record Linkage Software

Maximize the value of your data with a highly visual software application, rated best-in-class with an accuracy of 96%, that offers an end-to-end solution for cleaning, linking, and deduplicating datasets to attain a complete, 360-degree view of entities.

Trusted By

Definition

What is record linkage?

Record linkage is the process of comparing records from two or more disparate data sources and identifying whether they refer to the same entity or individual. This process is pretty simple when you have standardized datasets that contain unique identifiers, but it is quite challenging when your datasets do not conform to a standardized format or lack uniquely identifying data attributes.

In such cases, complex rule building is required to determine potential unique identifiers in your datasets, and match records depending on the weight assigned to each identifier. Based on the matching results, records are linked together and verified to check if they belong to the same or a different entity.

Process

How does record linkage work?

Pre-processing

Ensure reliable data quality by performing data cleansing and standardization activities, such as fixing null, misspelled, or invalid data, as well as checking data accuracy and relevancy.
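As a rough illustration of the pre-processing step, the sketch below cleans a single record in plain Python. The function name `standardize`, the field names, and the list of placeholder values are assumptions made for this example; they are not part of DataMatch Enterprise.

```python
import re

# Hypothetical placeholders that should count as missing data.
MISSING_VALUES = {"", "n/a", "null", "none"}

def standardize(record):
    """Return a cleaned copy of a record (a dict of field name -> raw value)."""
    cleaned = {}
    for field, value in record.items():
        if value is None:
            cleaned[field] = None
            continue
        # Collapse runs of whitespace, trim the ends, and ignore case.
        text = re.sub(r"\s+", " ", str(value)).strip().lower()
        # Map empty strings and placeholders to a single missing marker.
        cleaned[field] = None if text in MISSING_VALUES else text
    return cleaned

raw = {"name": "  John   SMITH ", "city": "N/A", "phone": None}
print(standardize(raw))
```

Real cleansing usually goes further (spelling correction, address parsing, validity checks), but the idea is the same: bring every value into one predictable form before any comparison is made.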

Field comparisons

Select a combination of fields and calculate the probability of their values being similar by implementing relevant field matching algorithms used for fuzzy, numeric, phonetic, or domain-specific comparisons.
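To make field comparison concrete, here is a minimal sketch of a fuzzy and a numeric comparison using only Python's standard library. `difflib.SequenceMatcher` is used as a stand-in for a fuzzy matcher; the function names and the linear tolerance rule are assumptions for illustration, and phonetic or domain-specific comparisons are not shown.

```python
from difflib import SequenceMatcher

def fuzzy_similarity(a, b):
    """Similarity in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def numeric_similarity(a, b, tolerance):
    """1.0 for equal numbers, falling linearly to 0.0 at `tolerance` apart."""
    return max(0.0, 1.0 - abs(a - b) / tolerance)

print(fuzzy_similarity("Jon", "John"))     # high, but below 1.0
print(numeric_similarity(1980, 1981, 10))  # birth years one year apart
```

Each comparison returns a score between 0 and 1, so scores from different fields can later be combined into one match score per record pair.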

Indexing/Blocking

Implement blocking or indexing techniques that limit the number of comparisons, so that records are compared only if they have a high probability of belonging to the same entity.
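A simple way to picture blocking: group records by a cheap key and compare only records that share it. The sketch below assumes a hypothetical key made of the last-name initial and the ZIP code; the field names and the function `candidate_pairs` are illustrative only.

```python
from collections import defaultdict
from itertools import combinations

def candidate_pairs(records, blocking_key):
    """Return index pairs of records that fall into the same block."""
    blocks = defaultdict(list)
    for index, record in enumerate(records):
        blocks[blocking_key(record)].append(index)
    pairs = []
    for indices in blocks.values():
        # Only records inside one block are ever compared with each other.
        pairs.extend(combinations(indices, 2))
    return pairs

records = [
    {"last_name": "Smith", "zip": "10001"},
    {"last_name": "Smyth", "zip": "10001"},
    {"last_name": "Jones", "zip": "94105"},
    {"last_name": "Smith", "zip": "94105"},
]

def initial_and_zip(record):
    return (record["last_name"][0].upper(), record["zip"])

pairs = candidate_pairs(records, initial_and_zip)
print(pairs)  # [(0, 1)]
```

Without blocking, four records need six comparisons; here only one pair survives. The trade-off is that a poorly chosen key can separate true matches, so keys are usually kept coarse.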

Classification and evaluation

Classify records as a successful match or non-match based on the match scores calculated for field similarity, and evaluate results with varying levels and weights to attain maximum record linkage accuracy.
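The sketch below shows one way classification and evaluation could look, assuming each candidate pair already has a match score and that a small set of known true matches is available for checking. The pair identifiers, scores, and thresholds are made-up sample data.

```python
def classify(scores, threshold):
    """Return the set of pairs whose match score meets the threshold."""
    return {pair for pair, score in scores.items() if score >= threshold}

def precision_recall(predicted, true_matches):
    """Precision and recall of predicted matches against known matches."""
    true_positives = len(predicted & true_matches)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(true_matches) if true_matches else 0.0
    return precision, recall

# Hypothetical scores for four candidate pairs.
scores = {
    ("a1", "b1"): 0.95,
    ("a2", "b2"): 0.80,
    ("a3", "b7"): 0.75,
    ("a4", "b4"): 0.40,
}
true_matches = {("a1", "b1"), ("a2", "b2"), ("a4", "b4")}

for threshold in (0.9, 0.7, 0.3):
    predicted = classify(scores, threshold)
    print(threshold, precision_recall(predicted, true_matches))
```

Raising the threshold removes false positives at the cost of missed matches, and lowering it does the opposite. Trying several thresholds against a labeled sample is a common way to pick the balance.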

Record deduplication

Configure merge purge rules to overwrite data, remove duplicates, and attain a single, comprehensive view of the entity.
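As an illustration of a merge purge rule, the sketch below merges a group of duplicates into one record using a "newest non-empty value wins" policy. This policy, the `updated` field, and the sample data are assumptions for the example; real survivorship rules are typically configurable per field.

```python
def merge_records(duplicates):
    """Merge duplicate records into one; newer non-empty values win."""
    # ISO-formatted dates sort correctly as plain strings.
    ordered = sorted(duplicates, key=lambda r: r["updated"], reverse=True)
    merged = {}
    for record in ordered:
        for field, value in record.items():
            # Fill a field only if no newer record supplied a usable value.
            if merged.get(field) in (None, ""):
                merged[field] = value
    return merged

duplicates = [
    {"name": "J. Smith", "email": "", "phone": "555-0100", "updated": "2021-03-01"},
    {"name": "John Smith", "email": "john@example.com", "phone": "", "updated": "2023-06-15"},
]
print(merge_records(duplicates))
```

The newer record supplies the name and email, while the phone number is filled in from the older record because the newer one left it blank.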

Solution

Let Data Ladder handle your record linkage process

See DataMatch Enterprise at work

DataMatch Enterprise is a highly visual and intuitive record linkage software application, specifically designed to solve customer and contact data quality issues.

DataMatch leverages multiple industry-standard and proprietary algorithms to detect phonetic, fuzzy, mis-keyed, and abbreviated variations. The suite allows you to build scalable configurations for data standardization, deduplication, record linkage, enhancement, and enrichment across datasets from multiple sources, including Excel, text files, SQL, Oracle, and ODBC.

Business benefits

How can record linkage benefit you?

Improve customer experience

Get rid of duplicate and bad data records and leverage data to improve the journey and experiences offered to your customers.

Strengthen brand perception

Enhance brand reputation by delivering personalized, data-driven experiences to customers and employees.

Increase operational efficiency

Plan effective utilization of technology, resources, workforce, and business processes by using complete and comprehensive data records.

Eliminate duplicate efforts

Avoid wasting time, effort, and marketing budget on duplicate and unmatched data records.

Gain reliable business insights

Level up your data quality to make informed decisions and determine the next best move for your business.

Build a single source of truth

Build the master record that becomes the single source of truth across the entire organization.

Let’s compare

How accurate is our solution?

In-house implementations have a 10% chance each year of losing in-house personnel, so over 5 years, roughly half of in-house implementations lose the core team member who ran and understood the matching program.

Detailed tests covering 15 different product comparisons were completed with university, government, and private-sector organizations (80K to 8M records). The results, which include the effect of false positives, were as follows:

| Features of the solution | Data Ladder | IBM Quality Stage | SAS Dataflux | In-House Solutions | Comments |
| --- | --- | --- | --- | --- | --- |
| Match Accuracy (40K to 8M record samples) | 96% | 91% | 84% | 65-85% | Multi-threaded, in-memory, no-SQL processing to optimize for speed and accuracy. Speed is important, because the more match iterations you can run, the more accurate your results will be. |
| Software Speed | Very Fast | Fast | Fast | Slow | A metric for ease of use. Here speed indicates time to first result, not necessarily full cleansing. |
| Time to First Result | 15 Minutes | 2 Months+ | 2 Months+ | 3 Months+ | |
| Purchasing/Licensing Costing | 80 to 95% Below Competition | $370K+ | $220K+ | $250K+ | Includes base license costs. |

Frequently asked questions

Got more questions? Check this out

When your datasets contain one or more attributes that uniquely identify a record, comparisons can be performed on those columns. This is called deterministic record linkage. Records are considered a match if they agree on a single identifying attribute, or on enough attributes to meet a set threshold. Attributes such as social security number and national ID are good examples of unique identifiers that can be used for deterministic record linkage.
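In its simplest form, deterministic linkage is an exact join on a shared identifier. The sketch below assumes a hypothetical `national_id` field present in both datasets; records with a missing identifier are left unlinked.

```python
def deterministic_link(left, right, key):
    """Pair records from two datasets that share the same identifier value."""
    # Index the right-hand dataset by identifier, skipping missing values.
    index = {record[key]: record for record in right if record.get(key)}
    return [
        (record, index[record[key]])
        for record in left
        if record.get(key) in index
    ]

crm = [
    {"national_id": "A123", "name": "John Smith"},
    {"national_id": None, "name": "Jane Doe"},
]
billing = [
    {"national_id": "A123", "name": "J. Smith"},
    {"national_id": "B456", "name": "Ann Lee"},
]
links = deterministic_link(crm, billing, "national_id")
print(links)
```

Only the two "A123" records are linked, even though their names are written differently, because the identifier alone decides the match.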

When your datasets do not contain exact uniquely identifying attributes, you must leverage fuzzy (or probabilistic) techniques to link records. In this case, multiple attributes are assigned weights and considered together to classify records as matches or non-matches. An example of probabilistic record linkage is using First Name, Last Name, Date of Birth, and Address and assigning them appropriate weights to compute possible matches.
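The weighting idea can be sketched as a weighted average of per-field similarities. This is a simplified stand-in for illustration, not a full probabilistic model and not how any particular product computes scores; the weights, field names, and use of `difflib.SequenceMatcher` are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical weights: more distinctive fields count for more.
WEIGHTS = {"first_name": 0.2, "last_name": 0.3, "date_of_birth": 0.3, "address": 0.2}

def match_score(a, b, weights=WEIGHTS):
    """Weighted average of per-field string similarities, in [0, 1]."""
    total = sum(weights.values())
    score = 0.0
    for field, weight in weights.items():
        similarity = SequenceMatcher(None, a[field].lower(), b[field].lower()).ratio()
        score += weight * similarity
    return score / total

a = {"first_name": "John", "last_name": "Smith",
     "date_of_birth": "1980-04-12", "address": "12 Main Street"}
b = {"first_name": "Jon", "last_name": "Smith",
     "date_of_birth": "1980-04-12", "address": "12 Main St"}
c = {"first_name": "Mary", "last_name": "Jones",
     "date_of_birth": "1975-11-02", "address": "98 Oak Avenue"}

print(match_score(a, b))  # close to 1: likely the same person
print(match_score(a, c))  # much lower: likely different people
```

A pair is then classified as a match or non-match by comparing its score against a chosen threshold.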

There are multiple challenges encountered while performing record linkage, such as ensuring data quality through data cleansing and standardization, validating results to ensure records are correctly linked together, classifying unclassified records, tuning algorithms to maximize accuracy, and resolving computational complexity.

Different domains and industries use record linkage for various purposes. For example, it is used to perform historical research in statistical agencies, link and consolidate patient records in healthcare, detect fraud and crime, maintain organizational data quality, implement master data management, or utilize organizational data for business intelligence.

Ready? Let's go

Try now or get a demo with an expert!