Using Record Linkage to Resolve Patient Matching Errors

“Accurate identification of patients is one of the most difficult operational issues during a public health emergency, and the nationwide response to the pandemic, including the rollout of the vaccination programs, has highlighted the repercussions of not having a nationwide strategy to connect patients with their data,” 

– AHIMA CEO Wylecia Wiggs Harris 


For healthcare providers, clean and reliable data can mean the difference between a successful diagnosis and a death caused by an incorrect prescription or treatment. Yet one of the primary causes of the medical errors that account for more than 250,000 deaths every year in the US is insufficient access to patient information.

This lack of insight into patient health status, medical history, and test results points to alarming data quality challenges within the healthcare industry, including:

  • Nearly 18% of patient EHRs in 2019 were identified as duplicates  
  • 38% of healthcare providers in the US experienced an adverse event due to a patient matching problem
  • Duplicate records cost the average hospital approximately $1.5 million  

As a solution, medical record linkage software can play an instrumental role in accurately linking disconnected data sources and avoiding the unresolved patient identities that stem from duplicate records and a lack of unique identifiers.

Let’s explore how this can be the case. 

Challenges of Patient Matching


According to HealthIT, patient matching is defined as: 

‘The identification and linking of one patient’s data within and across health systems in order to obtain a comprehensive view of that patient’s health care record’. 

Patient matching is usually achieved by linking together multiple patient data fields such as name, phone number, address, and birth date. However, several challenges can limit the seamless linkage of medical data, such as:

  • Master Patient Index errors: using identifiers such as patient name and date of birth, a master patient index (MPI) entry is created for each patient to store and link all of that patient's medical data across administrative and clinical systems. However, when a patient's data cannot be found in a particular MPI, a new medical identity number is created. This heightens the risk of duplicate records and data silos, creating barriers to linking longitudinal patient data.
  • Duplicate records: repeated records occur because of formatting and spelling errors. A journal article published by the AHIMA Foundation that analyzed nearly 400,000 patient records found that two of the most common field discrepancies in creating a single patient view were middle name (over 58%) and Social Security Number (approximately 54%). The researchers further noted that these mismatches resulted from spelling errors (nearly 53%) and name reversals (nearly 34%).
  • Limited interoperability: interoperability is the ease of exchanging data across multiple devices and systems. Inconsistent data standards and formats, arising from a lack of standardization, can undermine it. Without unique identifiers, patient demographic data must serve as a secondary basis for matching, but because address standards and formats vary, patient matching is often inefficient. In fact, a 2019 American Medical Informatics Association study found that address and last name standardization helped improve patient matching sensitivity from 81.3% to 91.6% for health information exchange (HIE) datasets.
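To make the effect of name reversals and inconsistent address formats concrete, here is a minimal Python sketch that uses only the standard library. The field names, the sample records, and the small abbreviation table are assumptions made for illustration; a real deployment would rely on a full postal reference dataset rather than a handful of hard-coded abbreviations.

```python
import re

# Illustrative abbreviation table (an assumption, not a postal standard).
ADDRESS_ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "apartment": "apt",
    "north": "n",
}

def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and sort tokens so 'Smith, John' equals 'John Smith'."""
    tokens = re.findall(r"[a-z]+", name.lower())
    return " ".join(sorted(tokens))

def normalize_address(address: str) -> str:
    """Lowercase, drop punctuation, and map common words to a single abbreviation."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    return " ".join(ADDRESS_ABBREVIATIONS.get(token, token) for token in tokens)

def match_key(record: dict) -> tuple:
    """Build a comparison key from the standardized name, birth date, and address."""
    return (
        normalize_name(record["name"]),
        record["dob"],
        normalize_address(record["address"]),
    )

record_a = {"name": "Smith, John", "dob": "1980-04-12", "address": "12 North Main Street, Apt. 4"}
record_b = {"name": "john smith", "dob": "1980-04-12", "address": "12 N Main St Apartment 4"}

print(record_a == record_b)                        # raw records differ
print(match_key(record_a) == match_key(record_b))  # standardized keys agree
```

Sorting name tokens handles reversals, and the abbreviation table handles format variation, but neither helps with spelling errors, which need approximate (fuzzy) comparison instead of an exact key.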

Importance of Record Linkage Software for Healthcare

Record linkage is the process of linking and comparing records from two or more disparate sources and determining whether they refer to the same entity or not. This not only includes identifying seemingly different records that could be duplicates, but also identifying otherwise similar records that are different entities altogether.  
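As a rough illustration of this definition, the sketch below compares every record in one source with every record in another and sorts each pair into "match", "review", or "non-match". It uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure; the field weights, the two thresholds, and the sample records are arbitrary assumptions, not values taken from any study or product.

```python
from difflib import SequenceMatcher

# Illustrative weights and thresholds; real values would be tuned on labelled data.
WEIGHTS = {"first_name": 0.3, "last_name": 0.3, "dob": 0.4}
MATCH_THRESHOLD = 0.95   # at or above: treat as the same patient
REVIEW_THRESHOLD = 0.85  # between the thresholds: send to manual review

def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def pair_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of per-field similarities."""
    return sum(weight * similarity(rec_a[field], rec_b[field])
               for field, weight in WEIGHTS.items())

def classify(score: float) -> str:
    """Map a pair score onto one of three linkage decisions."""
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "non-match"

def link_sources(source_a: list, source_b: list) -> dict:
    """Compare every cross-source pair and keep those that are not clear non-matches."""
    decisions = {}
    for rec_a in source_a:
        for rec_b in source_b:
            label = classify(pair_score(rec_a, rec_b))
            if label != "non-match":
                decisions[(rec_a["id"], rec_b["id"])] = label
    return decisions

ehr = [
    {"id": "E1", "first_name": "Katherine", "last_name": "Johnson", "dob": "1975-03-02"},
    {"id": "E2", "first_name": "Maria", "last_name": "Lopez", "dob": "2010-06-15"},
]
claims = [
    {"id": "C1", "first_name": "Katharine", "last_name": "Johnson", "dob": "1975-03-02"},
    {"id": "C2", "first_name": "Mario", "last_name": "Lopez", "dob": "2010-06-15"},
]

for pair, label in link_sources(ehr, claims).items():
    print(pair, label)
```

With these sample values, the Katherine/Katharine pair scores high enough to be linked automatically, while Maria and Mario Lopez, who share a surname and birth date and may well be different people, fall into the review band. Comparing every pair is quadratic in the number of records, so large datasets typically restrict comparisons to records that share a blocking key such as birth year.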

In the healthcare industry, medical record linkage is crucial for resolving patient matching issues across EHR systems, claims databases, and patient registries using unique identifiers. Doing so can give healthcare providers the following benefits:

  • Better diagnosis and treatment: accurate patient matching ensures that medical staff have sufficient access to a patient's medical history, including previous treatments and medications, when diagnosing conditions and prescribing drugs.
  • Enhanced interoperability: consistent data standards and formats, along with a single unique identifier, can lead to higher interoperability and better data sharing among key stakeholders.
  • Lower patient waiting times: automated data cleansing, matching, and deduplication can reduce delays in resolving patient identities and accelerate time-to-treatment for critical patients.
  • Cost savings: eliminating duplicate medical records and inconsistencies can help avoid unnecessary treatment, equipment, and medical staff expenses.

As important as record linkage is for healthcare, few healthcare providers have managed to resolve the challenges of patient matching. This is mainly due to the reliance on legacy approaches: manually consolidating large datasets, running scripts to identify and resolve data errors, and creating a unique identifier that can be consistently applied across millions of records.

Using dedicated record linkage software, on the other hand, can help healthcare providers scale their data consolidation and matching processes and automate them to accelerate time-to-treatment and lower waiting times.

Record Linkage using DataMatch Enterprise

DataMatch Enterprise is Data Ladder's record linkage software, which enables accurate linking of disparate data sources and runs various data quality processes to achieve clean and reliable data. Unlike manual processes or specialized tools, DataMatch Enterprise offers an all-in-one data quality and matching engine capable of addressing a wide variety of data quality problems, from misspellings and varied formats to unreconciled entities and duplicates.

Record Linkage Methods To Improve Data Quality In Healthcare 

  • Data import of disparate sources: DataMatch Enterprise can ingest medical records from various databases and sources such as SQL Server, MySQL, PostgreSQL, MongoDB, and JSON, as well as proprietary databases via ODBC and REST APIs. Using these native integrations, DataMatch Enterprise can link datasets comprising millions of records.
  • Name and address standardization: once disparate data sources are imported, DataMatch Enterprise offers a range of data standardization and cleansing options, including removing leading and trailing spaces, fixing casing errors, and replacing zeros with O's and vice versa. For standardizing addresses, DataMatch Enterprise has a built-in USPS database that can be used as a reference dataset, through which missing location details such as apartment number, street name, and ZIP code can be filled in and standardized as per USPS guidelines.
  • Establish unique identifiers: since patients can be recorded under different medical identity numbers, DataMatch Enterprise lets you create match definitions and criteria based on proprietary fuzzy matching algorithms to match non-exact records with minimal false positives. The result is more accurate matching that can be tuned by changing the matching sensitivity.
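The cleansing and identifier steps above can be sketched generically in Python. This is not DataMatch Enterprise's API or its proprietary matching algorithm: `difflib.SequenceMatcher` stands in for the fuzzy matcher, and the function names, the `sensitivity` parameter, the ID format, and the sample records are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

def clean_name(raw: str) -> str:
    """Collapse extra spaces, replace zeros typed in place of the letter O, and fix casing."""
    return " ".join(raw.split()).replace("0", "O").title()

def assign_patient_ids(records: list, sensitivity: float = 0.9) -> list:
    """Group records that share a birth date and have similar names, one ID per group."""
    parent = list(range(len(records)))  # union-find forest over record positions

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    names = [clean_name(record["name"]).lower() for record in records]
    for i, j in combinations(range(len(records)), 2):
        if records[i]["dob"] != records[j]["dob"]:
            continue  # only compare records with the same birth date
        if SequenceMatcher(None, names[i], names[j]).ratio() >= sensitivity:
            parent[find(j)] = find(i)  # merge the two groups

    ids, assigned = {}, []
    for i in range(len(records)):
        root = find(i)
        if root not in ids:
            ids[root] = f"P{len(ids) + 1:03d}"  # hypothetical ID format
        assigned.append(ids[root])
    return assigned

records = [
    {"name": "  J0HN CARTER ", "dob": "1962-11-30"},
    {"name": "John Carter", "dob": "1962-11-30"},
    {"name": "Jon Carter", "dob": "1962-11-30"},
    {"name": "Joan Carver", "dob": "1990-01-05"},
]

print(assign_patient_ids(records, sensitivity=0.90))
print(assign_patient_ids(records, sensitivity=0.97))
```

At the lower sensitivity, the first three records share one identifier because "Jon Carter" is close enough to "John Carter". At the higher setting, "Jon Carter" receives its own identifier. This is the trade-off that a matching sensitivity setting controls: a stricter setting produces fewer false positives but misses more true duplicates.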


Healthcare providers face a critical challenge in accurately linking medical data silos across disparate systems to effectively administer patient diagnosis and treatments. Due to manual and obsolete data quality and matching processes, patient data becomes fragmented and mismatched, limiting healthcare interoperability.  

However, with record linkage software like DataMatch Enterprise, data anomalies such as varied or incomplete patient addresses, inconsistent middle names, and other errors can be easily identified, cleansed, and standardized to increase matching accuracy.

For more information, get in touch with us today to ask how DataMatch Enterprise can be implemented for patient matching, or download our free product trial to get started right away.
