Customer personalization is a digital business imperative that requires clean, updated, enriched customer data. The data enrichment process, part of a data quality management framework, allows companies to augment their records with third-party data or consolidate existing data to get accurate, insightful information about their customers.

In today’s competitive marketplace, the business that knows its customers best leads the way.

But data enrichment is not an easy undertaking. To be a valuable initiative, a key component in the process must be prioritized – that of data quality. For a data enrichment process to be successful, you’ll need clean, reliable, usable data.

This post will help you understand the role of data cleansing in the data enrichment process and why it’s so important to focus on data quality.

What is Data Enrichment?

A CRM generally holds basic contact information such as names, email addresses, phone numbers, etc. Businesses need more information to know their customers better.

First Name | Last Name | Phone | Email | Address
Randy | Hanscom | 231-821-8063 | [email protected] | 4085 Bee Street, Holton MI
Mark | Watkins | 605-857-1883 | [email protected] | 2911 Hartway Street, Yankton

For instance, where does Customer A work? What is their job title? What is their favorite tech brand? What is their marital status or income level? Demographic and firmographic information enriches the CRM, allowing the organization to get deeper insights about its customers, which it can use to increase sales and marketing performance.

Here’s an example of a database enriched at a basic level. Additional information about the customer’s company and the average income is added to contact records to get a complete overview of the customer profile. As the organization expands its customer service or personalization goals, household data such as the customer’s marital status, family status and more will be added to this database. The ultimate goal is to create a customer 360 view.

First Name | Last Name | Phone | Email | Address | Profession | Company | Salary (avg)
Randy | Hanscom | 231-821-8063 | [email protected] | 4085 Bee Street, Holton MI | Marketing Manager | Group Buff | $4000
Mark | Watkins | 605-857-1883 | [email protected] | 2911 Hartway Street, Yankton | Sales Director | Syncamore | $8300
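
As a minimal sketch of what this append step can look like in code, the snippet below joins the two contact records to a lookup of extra attributes. The `firmographics` dictionary, the field names, and the choice of phone number as the join key are assumptions made for illustration; they do not describe how any particular product works.

```python
# Contact records as they might sit in a CRM (email omitted for brevity).
contacts = [
    {"first_name": "Randy", "last_name": "Hanscom", "phone": "231-821-8063"},
    {"first_name": "Mark", "last_name": "Watkins", "phone": "605-857-1883"},
]

# Hypothetical enrichment source, keyed on phone number (an assumption).
firmographics = {
    "231-821-8063": {"profession": "Marketing Manager", "company": "Group Buff", "avg_salary": 4000},
    "605-857-1883": {"profession": "Sales Director", "company": "Syncamore", "avg_salary": 8300},
}


def enrich(contact, lookup):
    """Return a copy of the contact with any matching extra fields appended."""
    extra = lookup.get(contact.get("phone"), {})
    return {**contact, **extra}


enriched = [enrich(c, firmographics) for c in contacts]
for row in enriched:
    print(row)
```

A contact with no match in the lookup is returned unchanged, so the join never drops records.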

Enterprise-level organizations such as banks, insurers, and financial companies have a hard time consolidating and enriching their data because it’s stored in disparate systems.

Airlines and travel companies are perfect examples of organizations that have disparate data across four or five databases. To enhance passenger experience, airlines need to consolidate passenger data such as full names, addresses, exact birthdays, seat assignments, credit cards, and any other information that makes personalization easier. Most airlines, though, do not have a robust data management or enrichment plan in place to use this data to propel ML or AI initiatives. As enterprises step up to benefit from big data, ML, and AI, they need to enrich their CRM with lifestyle and behavioral data to deliver personalized services. For this to be possible, these organizations need to clean, update, and manage both structured and unstructured data.

Data enrichment, therefore, can be summarized as an activity that takes a large amount of disparate structured and unstructured data from various sources and turns it into something of value.

The Process – How is Data Enriched? 

There are two ways you can enrich data.

  1. Build on the existing wealth of data
  2. Use third-party data services

Your organization gathers a wealth of customer data every day. Every sales chat, every social media interaction, every sign-up is recorded in CRMs, ERPs, and spreadsheets. While marketing and sales may be working with customer data in a CRM, billing or logistics may be using the ERP to store information. This fragmented view of the customer is a roadblock to customer experience and personalization goals. So it’s important to first go through internal systems to locate this information, clean it (fix typos and errors, remove obsolete data), purge unnecessary data, and merge it to create a single version of the truth.

For instance, if you’re using Zoho, that information could be lost in the dozens of modules and fields that team members may have set up over the years. Data enrichment in this context would mean reviewing the CRM data, cleaning up inconsistencies, merging and purging fields or unnecessary data, and creating a Golden Record (that is, the most complete and accurate record) for each of your customers.

The other type of data enrichment involves using third-party services to provide you with firmographics (information about a client’s company), demographics, and technographics (information about their technology use). These services provide lists with added verticals, such as job function, job level, and even buyer persona, that enrich your data and let you see your prospects in a whole different light. These services do the hard work for you – they pull information from social media channels, Google, and other public domains, allowing you to get complete information about your audience.

Whichever method you’re using, you must be careful about quality. If you’re extracting data from internal systems, you must check it for duplication and data hygiene issues (spelling errors, typos, unstructured formats, etc.). If it’s third-party data, it must be clean, verified, and validated.

There is a process to data enrichment that, if followed, will result in the ultimate golden record. The following process is a framework provided by Data Ladder’s DataMatch Enterprise, a self-service data preparation and data enrichment software.

Step 1 – Establishing the Data Enrichment Goal

Customer personalization is usually the key motivator for a data enrichment goal. In our experience though, the narrower and more focused the goal, the easier it is to implement the data enrichment process.

For instance, an insurance company wanted to enhance its customer experience by launching a health insurance plan for college students. The firm first had to decide on the kind of data it needed and the level of segmentation required to achieve this goal. At a very basic level, it needed genealogy records, family records, property and income records, electronic health records, and behavioral and lifestyle records. Most of the records were already available in the insurance company’s repository. Behavioral and lifestyle records were obtained from third-party data. When the collection and segmentation were done, the firm began the cleaning, merging, and purging process of consolidating internal data with third-party data.

For data enrichment to work, it’s imperative that goals, audience segmentation, and data sets are clearly defined.

Step 2 – Preparing Your Data Using Data Enrichment Tools

This is the most important part of the process.

If you don’t already have a data quality plan in place, you will need to prepare your data before any enrichment takes place.

Using DataMatch Enterprise, you can clean, dedupe, and consolidate disparate data sources to make the data good enough for merging with a new data set.

The data preparation process involves:

Data Integration: Integrate your data source such as a Salesforce or HubSpot CRM directly on to the platform. You won’t have to waste time with manual extractions of data. Simply plug in the data source and start preparing the data.

Data Profiling: This is a crucial step in data cleaning. It helps you assess the quality of your data and identify problems within the data source at a row level. For instance, you can see which of your data rows have missing email addresses or ZIP codes. You won’t know what to fix if you can’t see what’s wrong, and data quality issues are often so deeply ingrained that they simply escape casual observation. You might not even know that some of your fields have non-printable characters or stray punctuation marks within your phone or address data. These invisible problems later become a bottleneck and make it difficult to ensure a smooth enrichment process.

[Figure: Profiling dirty data]
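
To make the idea of row-level profiling concrete, here is a minimal sketch in plain Python. It is not how DataMatch Enterprise works internally; the sample rows, the field names, and the two checks (missing values and non-printable characters) are assumptions chosen to mirror the problems described above.

```python
# Sample rows with deliberately planted problems (illustrative data only).
rows = [
    {"email": "a@example.com", "zip": "49425", "phone": "231-821-8063"},
    {"email": "", "zip": "57078", "phone": "605-857-1883\x00"},
    {"email": "b@example.com", "zip": None, "phone": "(605) 857.1883"},
]


def is_missing(value):
    """Treat None, empty strings, and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""


def has_non_printable(value):
    """True if any character is a control or other non-printable character."""
    return any(not ch.isprintable() for ch in str(value))


def profile(rows, fields):
    """Map each field to the row indexes where a problem was found."""
    report = {f: {"missing": [], "non_printable": []} for f in fields}
    for i, row in enumerate(rows):
        for f in fields:
            value = row.get(f)
            if is_missing(value):
                report[f]["missing"].append(i)
            elif has_non_printable(value):
                report[f]["non_printable"].append(i)
    return report


report = profile(rows, ["email", "zip", "phone"])
for field, problems in report.items():
    print(field, problems)
```

Reporting row indexes rather than just counts keeps the output actionable: you can go straight to the offending rows.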

Data Cleaning: Raw data is inherently flawed. It needs to be cleansed of typos, spelling errors, and format inconsistencies. Think of data cleaning as much-needed data transformation (for example, converting all lowercase names to uppercase, or removing all unnecessary punctuation marks in a field). Once you clean the data of these inconsistencies, it will be usable and valuable.
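
The two transformations mentioned above can be sketched in a few lines of Python. The function names are hypothetical, and the rules (uppercase names, digits-only phone numbers) are just one possible standard; your own conventions may differ.

```python
import re


def clean_name(name):
    """Collapse repeated whitespace and convert the name to uppercase."""
    return " ".join(name.split()).upper()


def clean_phone(phone):
    """Keep digits only, dropping punctuation, spaces, and control characters."""
    return re.sub(r"\D", "", phone)


print(clean_name("  randy   hanscom "))  # RANDY HANSCOM
print(clean_phone("(231) 821-8063"))     # 2318218063
```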

Data Deduping: It’s quite probable your data has three to four copies of a single entity. If you’re consolidating data from multiple sources, then you’ll definitely end up with duplicates. So before adding more data, match existing data to remove duplicates and make sure there is a unique record for each entity. Duplicate data is dangerous. It leads to skewed analytics and makes it difficult to get the complete picture of your audience.

[Figure: Cleaning and deduping lists]
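
A simple way to picture deduplication is to normalize each record into a match key and keep one record per key. The sketch below does exact matching on a normalized name plus a digits-only phone number; that key choice is an assumption, and dedicated matching tools typically go further with fuzzy matching, which is not shown here.

```python
import re

# Illustrative records: the first two describe the same person.
records = [
    {"name": "Randy Hanscom", "phone": "231-821-8063"},
    {"name": "randy hanscom ", "phone": "(231) 821 8063"},
    {"name": "Mark Watkins", "phone": "605-857-1883"},
]


def match_key(record):
    """Build a normalized key so formatting differences do not hide duplicates."""
    name = " ".join(record["name"].lower().split())
    phone = re.sub(r"\D", "", record["phone"])
    return (name, phone)


def dedupe(records):
    """Keep the first record seen for each match key."""
    unique = {}
    for record in records:
        unique.setdefault(match_key(record), record)
    return list(unique.values())


deduped = dedupe(records)
print(len(records), "->", len(deduped))  # 3 -> 2
```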

Data Merge Purge: If you extracted data from multiple sources, say from a CRM and an ERP, you’ll need to merge the matching records into a single record. Remember, though, that both sources need to go through the cleaning and deduping process above before they can be merged. Once you merge these records, you can purge the unnecessary fields and keep only the fields you need. This process, historically, was performed via Excel or via complex data management solutions that would take months to get done. A self-service data enrichment tool like DataMatch Enterprise lets you accomplish this in just a matter of minutes.
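
Here is a rough sketch of merge purge in plain Python, assuming both systems share a customer ID. The IDs, field names, the `KEEP_FIELDS` set, and the rule that CRM values win on conflict are all assumptions made for illustration.

```python
# Hypothetical extracts from two systems, keyed on a shared customer ID.
crm = {"C-1001": {"name": "Randy Hanscom", "phone": "231-821-8063", "campaign_tag": "spring"}}
erp = {"C-1001": {"name": "R. Hanscom", "billing_address": "4085 Bee Street, Holton MI", "batch_id": "B7"}}

# Fields to keep after the merge; everything else is purged.
KEEP_FIELDS = {"name", "phone", "billing_address"}


def merge_purge(crm, erp, keep_fields):
    """Merge records that share an ID, then drop fields not in keep_fields."""
    merged = {}
    for customer_id in crm.keys() | erp.keys():
        # CRM values override ERP values on conflict (an assumed rule).
        combined = {**erp.get(customer_id, {}), **crm.get(customer_id, {})}
        merged[customer_id] = {k: v for k, v in combined.items() if k in keep_fields}
    return merged


merged = merge_purge(crm, erp, KEEP_FIELDS)
print(merged["C-1001"])
```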

Data Survivorship: This is the final stage of the enrichment process. Assuming your third-party data is clean (if not, you’ll need to run it through the process above too), you can now merge the third-party record with your consolidated record to create a master record. This master record can be exported to your database of choice, and you now have reliable data to use in your campaigns!
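
Survivorship comes down to deciding which value wins when several records describe the same customer. The sketch below uses one possible rule, "newest non-empty value wins"; that rule, the `updated` field, and the sample values are assumptions for illustration, not the rule set of any specific tool.

```python
from datetime import date

# Two records for the same customer from different sources (illustrative).
candidates = [
    {"source": "crm", "updated": date(2019, 3, 1),
     "company": "Group Buff", "job_title": "Marketing Manager", "phone": "231-821-8063"},
    {"source": "third_party", "updated": date(2020, 1, 15),
     "company": "", "job_title": "Senior Marketing Manager", "phone": ""},
]


def build_master(candidates, fields):
    """For each field, keep the newest non-empty value across all candidates."""
    newest_first = sorted(candidates, key=lambda r: r["updated"], reverse=True)
    master = {}
    for field in fields:
        master[field] = next((r[field] for r in newest_first if r.get(field)), None)
    return master


master = build_master(candidates, ["company", "job_title", "phone"])
print(master)
```

Empty values never overwrite populated ones, so a newer but sparser source cannot blank out what you already know.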

It’s necessary to reiterate a key point – any data source, whether third-party or your own, must be clean, unique, and up to date before it can be used for any enrichment purpose.

If you’re unsure of the quality of third-party data, check the first 100 or 200 samples to see if it has quality issues. If it has duplicates, formatting, or structural issues, you’ll need to clean and dedupe it. This is especially the case with data obtained from social media sites and online listings.

A word of caution: Any attempt to match two first-party datasets requires a common factor that links the two datasets together. This could be a common name, a phone number, an ID, or an email address. Missing this common factor will void the whole activity because you won’t know which records refer to the same customer in the two datasets.
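
Before attempting a match, it can help to check whether the two datasets actually share a usable key. The following sketch lists the fields both datasets have in common and measures how many key values overlap; the dataset contents and the `overlap_rate` helper are hypothetical.

```python
# Two illustrative datasets; only "phone" is populated in both.
dataset_a = [
    {"phone": "2318218063", "name": "Randy Hanscom"},
    {"phone": "6058571883", "name": "Mark Watkins"},
]
dataset_b = [
    {"phone": "6058571883", "company": "Syncamore"},
    {"phone": "5550000000", "company": "Unknown Co"},
]


def shared_fields(a, b):
    """Field names that appear in the first row of both datasets."""
    if not a or not b:
        return set()
    return set(a[0]) & set(b[0])


def overlap_rate(a, b, key):
    """Share of distinct non-empty key values in a that also appear in b."""
    keys_a = {row[key] for row in a if row.get(key)}
    keys_b = {row[key] for row in b if row.get(key)}
    if not keys_a:
        return 0.0
    return len(keys_a & keys_b) / len(keys_a)


print(shared_fields(dataset_a, dataset_b))          # {'phone'}
print(overlap_rate(dataset_a, dataset_b, "phone"))  # 0.5
```

A shared field with a very low overlap rate is a warning sign that the key is formatted differently in each source and needs cleaning first.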

Step 3 – Keeping Your Data Updated

Once you’ve created clean records, you’ll need to keep your data updated. Data enrichment is not a one-time effort. Customer data, no matter how detailed, is fundamentally a snapshot in time. Income levels rise and fall, marital status may change, and the type of car and physical address can change too. Even names may change, especially if there is a change in one’s marital status.

Also, as you progress through your campaign or goals, you’ll realize that you need to augment it further. You don’t want to have to go through the whole cleansing, deduping process every single time. To avoid unnecessary complications, it’s recommended to keep a data cleaning schedule. If your first-party data is clean and optimized, you’ll just spend a few minutes cleaning up third-party data and appending it to your database.

Almost every Fortune 500 company we’ve worked with is striving to accomplish data enrichment objectives, but only a few get the process right. Others collect data and dump it in a data lake where it lies dormant until it decays. With a data decay rate of 2.1% per month, or roughly 22.5% per year, you cannot afford to lose data you’re painstakingly collecting. If you’re not updating your data at regular intervals, you’re losing value.
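
The monthly and yearly decay figures are consistent with each other if you assume the monthly rate compounds, that is, each month 2.1% of the still-valid records go stale. That compounding assumption is ours; the quick calculation below shows how the two numbers relate.

```python
MONTHLY_DECAY = 0.021  # 2.1% of valid records go stale each month


def valid_fraction(months, monthly_decay=MONTHLY_DECAY):
    """Fraction of records still valid after the given number of months."""
    return (1 - monthly_decay) ** months


annual_decay = 1 - valid_fraction(12)
print(f"{annual_decay:.1%}")  # 22.5%
```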

Why Does Data Quality Matter?

You don’t have to wait for grand goals like personalization to maintain the quality of your records. If you implement a strong data quality policy now, you’ll be preventing costly mistakes later. In fact, keeping your records up to date will directly impact your customer service and experience. Take a small instance – if you’re sending out a launch email and your list contains 5,000 obsolete records, those are 5,000 customers your launch never reaches, and the sales revenue that could have come from them is lost. Data quality directly protects you from revenue loss!

Data Quality Issue | Example
Invalid value | Valid value can be “1” or “2”, but current value is “3”
Cultural rule conformity | Date = 1 Feb 2018 or 1-1-18 or 2-1-2018
Value out of required range | Customer age = 204
Format inconsistency | Phone = +135432524 or (001)02325355
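
Each of these four issue types can be expressed as a simple rule. The sketch below is one way to do that in Python; the field names (`status`, `signup_date`, `age`, `phone`), the accepted date format, the 0 to 120 age range, and the phone pattern are all assumptions you would replace with your own business rules.

```python
import re
from datetime import datetime

ALLOWED_STATUS = {"1", "2"}                  # valid values from the example
PHONE_PATTERN = re.compile(r"^\+\d{7,15}$")  # assumed target phone format


def is_iso_date(text):
    """True if text follows the single agreed format YYYY-MM-DD."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def validate(record):
    """Return the list of issue types found in one record."""
    issues = []
    if record.get("status") not in ALLOWED_STATUS:
        issues.append("invalid value")
    if not is_iso_date(str(record.get("signup_date", ""))):
        issues.append("cultural rule conformity")
    age = record.get("age")
    if age is None or not 0 <= age <= 120:
        issues.append("value out of required range")
    if not PHONE_PATTERN.match(record.get("phone") or ""):
        issues.append("format inconsistency")
    return issues


bad = {"status": "3", "signup_date": "1 Feb 2018", "age": 204, "phone": "(001)02325355"}
good = {"status": "1", "signup_date": "2018-02-01", "age": 34, "phone": "+135432524"}
print(validate(bad))
print(validate(good))  # []
```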

Regardless of structure, type, or format, source data intended for enrichment should be validated in terms of the following key attributes:

  • Relevance: Is it relevant to its intended purpose?
  • Accuracy: Is it correct and objective, and can it be validated?
  • Integrity: Does it have a coherent, logical structure?
  • Consistency: Is it consistent and easy to understand?
  • Completeness: Does it provide all the information required?
  • Validity: Is it within acceptable parameters for the business?
  • Timeliness: Is it up to date and available whenever required?
  • Accessibility: Can it be easily accessed and exported to the target application?
  • Compliance: Does it comply with regulatory standards?

Apply these quality check metrics to both first-party and third-party data and ensure the quality of your data is fit for its intended purpose.
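
Some of these attributes are easy to turn into numbers you can track over time. As a rough sketch, the snippet below scores completeness and timeliness for a single record; the list of required fields, the `last_updated` field, and the 365-day freshness threshold are assumptions, and attributes such as relevance or compliance still need human judgment.

```python
from datetime import date

REQUIRED_FIELDS = ["name", "email", "phone"]  # assumed required fields
MAX_AGE_DAYS = 365                            # assumed freshness threshold


def completeness(record, required=REQUIRED_FIELDS):
    """Fraction of required fields that hold a non-empty value."""
    filled = sum(1 for field in required if record.get(field))
    return filled / len(required)


def is_timely(record, today, max_age_days=MAX_AGE_DAYS):
    """True if the record was updated within the allowed number of days."""
    return (today - record["last_updated"]).days <= max_age_days


record = {"name": "Mark Watkins", "email": "", "phone": "605-857-1883",
          "last_updated": date(2020, 1, 1)}
print(round(completeness(record), 2))        # 0.67
print(is_timely(record, date(2020, 6, 1)))   # True
```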

To Conclude, Enrich Your Data, But Don’t Forget Data Quality

Missing, incomplete, and outdated records are the primary detractors to customer data quality. If you truly want your data enrichment to succeed, you’ll need to ensure that quality issues are taken care of first. Data enrichment is not a one-time process. Like everything else, it will require you to maintain an updated version of the data to be useful and effective.

Want to know how we can help you kickstart the process? Talk to our solution architect today and get a free demo of our self-service data enrichment tool.

Farah Kim is an ambitious content specialist, known for her human-centric content approach that bridges the gap between businesses and their audience. At Data Ladder, she works as our Product Marketing Specialist, creating high-quality, high-impact content for our niche target audience of technical experts and business executives.