Salesforce Data Cleansing 101: The complete guide to the “why” and the “how” of data quality in your CRM

Why do 50% of all CRM initiatives fail?

With an average return of $8.71 per dollar invested in CRM software, a CRM is a must-have for any business to help understand their customers and provide a better experience. In fact, 91% of organizations with 10+ employees use a CRM.

And yet, Gartner and Forrester both report that nearly 50% of all CRM initiatives fail.

The leading cause of failure? Bad data.

The enterprise-wide implications of poor data quality are obvious, but very few organizations have a clear, comprehensive data quality strategy in place. And without a plan, no one knows who’s responsible for Salesforce data cleansing.

Everyone needs to be involved if CRM data quality is a priority. And if it isn’t yet a priority, read on to understand the impact bad data has on your bottom line every passing day, along with comprehensive solutions you can implement right now for a 66% increase in revenue.

How bad can your Salesforce data really be?

01. Contact data decay costs millions each year

While implementing Salesforce CRM, you may have gone to great pains to ensure high data quality. But within a year, 30-70% of your contact data will have decayed.

People move house and change jobs, marry and change their last names, and get new phone numbers. A business email address in your database stops working when its owner switches employers, a company may be acquired, and a contact may pass away. The reasons vary, but the result is the same: contact data decays very rapidly.

For you, that could mean millions in lost revenue by undermining lead nurturing, invoice payments, email marketing, promotional mailings, outbound calling, customer orders, and mobile messaging campaigns.

Let’s say you have 30,000 prospects in your Salesforce CRM and your average prospect-to-opportunity conversion rate is 10%. With an average deal size of $10,000 (quite basic in the B2B domain), you stand to make $9 million if you close 30% of the deals.

That is, if you don’t factor in data decay.

Ideal prospects in Salesforce: 30,000
Prospect-to-opportunity conversion rate: 10%
Average deal size: $10,000
Total revenue before decay: $9,000,000

Unless you can work all 30,000 prospects and close them then and there, you are going to lose revenue. Depending on which study you look at, the rate of data decay ranges anywhere from 2.5% per month (30% annually) to 5.8% per month (70% annually). Let’s stick with the most conservative estimate (2.5% per month) for the purposes of this example.

In just a month, you will have lost nearly a quarter of a million dollars ($225,000, or 2.5% of $9 million) in potential sales revenue, at the very least. That’s $2.7 million in a year, unless you’re following data hygiene best practices.
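The arithmetic behind these figures can be reproduced in a few lines. This is only a sketch of the article's own simplified model: the function and variable names are illustrative, and the yearly loss is the monthly loss multiplied by twelve (no compounding), which is how the 30% annual figure above is derived.

```python
# Figures are taken from the example in the text; names are illustrative.
def pipeline_revenue(prospects, conversion_rate, close_rate, deal_size):
    """Expected revenue from a prospect list, ignoring data decay."""
    opportunities = prospects * conversion_rate  # prospects that become opportunities
    deals = opportunities * close_rate           # opportunities that close
    return deals * deal_size

revenue = pipeline_revenue(30_000, 0.10, 0.30, 10_000)

monthly_decay = 0.025                   # most conservative estimate: 2.5% per month
monthly_loss = revenue * monthly_decay  # revenue tied to records that decay in a month
annual_loss = monthly_loss * 12         # simple linear model, as in the text

print(f"Revenue before decay: ${revenue:,.0f}")
print(f"Lost per month:       ${monthly_loss:,.0f}")
print(f"Lost per year:        ${annual_loss:,.0f}")
```

Swapping in your own prospect count, conversion rates, and deal size gives a rough first estimate of what decay may be costing you.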

A Walker study predicts that, by 2020, companies will focus on customer experience more than product or price to create a competitive advantage. And the only way to do that is by leveraging data – high-quality data. Address your contact data decay right now if you want to stay competitive.

Start by getting a clear picture of your CRM data hygiene. A thorough diagnosis will help you understand the degree of Salesforce data cleansing required. Use data to paint a picture for you.

For instance, look at email bounce rates to assess email address decay. Depending on your industry, bounce rates may range from 2 to 15%. If your email address data is decaying in Salesforce, you will notice your bounce rate climbing over time.

Sometimes your email address list has fallen prey to decay and yet you don’t see it by analyzing bounces. Look at auto-replies too: when a contact has changed companies, you may receive an automated “out-of-office” style reply instead of a bounce. This isn’t going to show up as “undelivered” in your reports, and yet the effect is the same.

Another way to diagnose data decay is to sort your accounts by “Last Activity Date” in Salesforce. Note that you want to count “human activity” only, so automated emails should not be counted as activity (the Salesforce community has come up with a number of interesting solutions for this). Sum the number of accounts that haven’t been touched in a year, for instance. If you go with the same 2.5% monthly decay rate that we used in our original example, some quick math can give you a reasonable idea of how many records have decayed.
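If you export your accounts, the “untouched for a year” count is easy to compute outside Salesforce as well. The sketch below assumes each exported account is a dictionary with a hypothetical `last_activity` key holding the date of the last human activity (or `None` if there was never any); these are not Salesforce API names, and the sample accounts are made up.

```python
from datetime import date, timedelta

def stale_accounts(accounts, today, max_age_days=365):
    """Return accounts with no recorded human activity in the last max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        a for a in accounts
        if a["last_activity"] is None or a["last_activity"] < cutoff
    ]

# Made-up sample data for illustration only.
accounts = [
    {"name": "Acme", "last_activity": date(2024, 5, 1)},
    {"name": "Globex", "last_activity": date(2022, 11, 15)},
    {"name": "Initech", "last_activity": None},
]

stale = stale_accounts(accounts, today=date(2024, 6, 1))
stale_names = [a["name"] for a in stale]
print(stale_names)  # ['Globex', 'Initech']
```

Passing `today` explicitly keeps the result reproducible; in day-to-day use you would pass `date.today()`.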

Taken all the right steps to diagnose the issue? Next up:

02. Unstandardized data results in inaccurate reporting

Importing unstandardized data is one of the biggest causes of bad data.

When injecting data from other sources, formatting issues and structural inconsistencies from source data invariably creep into your CRM. Data quality issues increase during the export-import process too. Source to target field mapping issues can result in jumbled up data, special characters might be lost, and by the end of the process, your Salesforce instance will have a ton of additional junk.

Besides importing, unstandardized data can be a result of how data is entered in the first place. If you don’t have well-defined naming conventions (for instance, do you enter US, or USA, or United States of America, or US&A under BillingCountry?), or if you do and your sales reps aren’t sticking to them, you’ve got a lot of Salesforce data cleansing to do. Such issues will directly impact your ability to create accurate reports and therefore run your business effectively.


Taking up the same example, suppose your sales reps enter variants such as US, USA, and US&A in the BillingCountry field when a deal is closed, while others enter the name of a state within the US.

Say you want to segregate customers by country to pinpoint attractive regions and increase marketing spend there. A relatively simple thing to do in Salesforce. Run the ‘Account Summary Report’ and group by BillingCountry. Next, sum the accounts you get per group and you see the number of customers you have in each country.


Not so fast! Remember how each sales rep is filling up the field differently? Now you will see some customers in the US, some in the USA, some in Alaska, some in Canada, some in Quebec, and so forth.

You don’t have an accurate count of customers – because you didn’t set input standards! Processes that rely on data-segmented reports like customer profiling and personalized marketing campaigns break down because of unstandardized data.
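To see how much the grouping changes once values are standardized, here is a minimal sketch. The alias table and sample values are assumptions made up for illustration; a real mapping would need to cover every variant found in your own data, and it would need a separate state-to-country lookup to handle entries such as “Alaska” or “Quebec”.

```python
from collections import Counter

# Hypothetical alias table: lowercase variant -> canonical BillingCountry value.
COUNTRY_ALIASES = {
    "us": "United States",
    "usa": "United States",
    "us&a": "United States",
    "united states": "United States",
    "united states of america": "United States",
}

def normalize_country(raw):
    """Map a free-text country value to its canonical form.

    Unknown values are returned trimmed but otherwise unchanged.
    """
    cleaned = raw.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)

# Made-up BillingCountry values as different reps might enter them.
billing_countries = ["US", "USA", " United States of America", "US&A", "Canada", "usa"]

raw_counts = Counter(billing_countries)
clean_counts = Counter(normalize_country(c) for c in billing_countries)

print(len(raw_counts))   # 6 separate groups before standardization
print(dict(clean_counts))  # {'United States': 5, 'Canada': 1}
```

The same six records fall into six groups before standardization and two groups after it, which is the difference between a misleading report and a usable one.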

First off, set input standards. Start with:

01. Applying household matching rules

02. Standardizing names of contacts

03. Standardizing addresses

Sit down with the people who use this data and come up with a comprehensive list of questions to ask in each of these areas when developing standards. But that’s a proactive approach and applies to future data. To standardize existing data, you have several options.

Option 1: Use Salesforce’s data cleansing and standardization features

For instance, to standardize the country issue we discussed earlier (US, USA), go to the “Mass Update Addresses” screen via Search in your Salesforce instance. Search for values like US and USA and replace them all with “United States”.

Option 2: Use a comprehensive data quality tool that integrates with Salesforce

You could use it to profile, cleanse, match, standardize, and dedupe data by importing Salesforce objects and loading them back once you’re confident of the data quality. The tools Salesforce offers for data cleanup have very limited cleansing and standardization capabilities compared to fully fledged data quality software.

Option 3: Go for real-time validation

Enterprise-grade data quality tools offer an API that enables you to place the software’s data cleansing capabilities right in between your input forms and the database. The API helps validate data in real-time; you just need to specify the rules to be checked against incoming data.
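The idea of checking rules against incoming data before it reaches the database can be sketched as follows. This is not the API of any particular data quality tool: the rule table, the allowed country list, and the deliberately simple email pattern are all assumptions chosen for illustration.

```python
import re

# Deliberately simple email shape check (an assumption, not a full RFC validator).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical list of approved BillingCountry values.
ALLOWED_COUNTRIES = {"United States", "Canada"}

# Hypothetical rule table: field name -> function returning True when the value is valid.
RULES = {
    "Email": lambda v: bool(EMAIL_PATTERN.match(v)),
    "BillingCountry": lambda v: v in ALLOWED_COUNTRIES,
    "LastName": lambda v: bool(v.strip()),
}

def validate(record):
    """Return the names of fields that are missing or fail their rule."""
    errors = []
    for field, rule in RULES.items():
        value = record.get(field)
        if value is None or not rule(value):
            errors.append(field)
    return errors

good = {"Email": "peggy@example.com", "BillingCountry": "United States", "LastName": "Smith"}
bad = {"Email": "not-an-email", "BillingCountry": "USA"}

print(validate(good))  # []
print(validate(bad))   # ['Email', 'BillingCountry', 'LastName']
```

A form handler would call `validate` on each submission and reject or flag the record when the returned list is not empty.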

03. Missing and incorrect data breaks down processes

In the example above, we talked about how you were unable to get accurate customer counts by country because input for the BillingCountry wasn’t standardized. Imagine if, for many of your customers, the field wasn’t filled at all, or was incorrectly filled?

If you actually have 2,000 customers in a region, your Salesforce report may be showing just 1,200 because the other 800 have no data in the BillingCountry field. Think about the damage to your business if you had taken the decision to forgo increased marketing spend in this region because your top 3 regions had 1,800+ customers, and according to your [bad] data, this region didn’t qualify with just 1,200 customers.

Another effect of missing data could be process breakdown. Say you have set up automated billing in your Salesforce instance. If the billing address is missing or incorrect for any of your customers, the cycle breaks.

Missing data can either be caused by mapping issues (your previous system stored FullName but your Salesforce has separate columns for FirstName and LastName), or it could be tied to how your team enters data. The latter is most common. Unless you have data validation rules in place and have a well-thought-out data quality management strategy, missing data will remain an issue.

Start at the source by understanding where and why these issues arose. According to SiriusDecisions, by resolving data quality issues at the source, i.e., web forms, businesses have increased conversions by 25%. Begin by enforcing standardized input in your web forms, ideally through a combination of:

1. Custom Salesforce picklists to ensure fields are filled with pre-approved data

2. Creation of business rules for real-time validation using a data quality API that integrates seamlessly with Salesforce. We discussed this earlier too.

Next, you need to work on filling missing information in existing data. DataMatch Enterprise allows you to quickly profile your data so you can see exactly where and how much data is missing or inconsistent in your Salesforce Org.

To fill in the blanks, the software has built-in data dictionaries for items like state names, US and Canadian ZIP/postal code and geo-location data for bulk address verification, and a host of other information. The address verification module is CASS certified, meaning we offer maximum accuracy up to ZIP+4.

You could also reach out to customers directly and get updated information. Keep this as a last resort – when you need a very specific piece of information and you are unable to get it through automated means. To go about this strategically, generate a Salesforce report for null values in the field of your choice. You now have a list of contacts you need to reach.
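If you work from an export rather than a Salesforce report, the same null-value check can be sketched in a few lines. The field names mirror those used in this article, but the record layout (one dictionary per customer) and the sample data are assumptions made for illustration.

```python
def is_blank(value):
    """True when a value is absent, None, or only whitespace."""
    return value is None or str(value).strip() == ""

def missing_by_field(records, fields):
    """Count, per field, how many records have no usable value."""
    return {f: sum(1 for r in records if is_blank(r.get(f))) for f in fields}

def records_missing(records, field):
    """List the records you would need to follow up on for one field."""
    return [r for r in records if is_blank(r.get(field))]

# Made-up sample data.
customers = [
    {"Name": "Acme", "BillingCountry": "United States", "BillingStreet": "1 Main St"},
    {"Name": "Globex", "BillingCountry": "", "BillingStreet": "22 Elm Rd"},
    {"Name": "Initech", "BillingCountry": None},
]

profile = missing_by_field(customers, ["BillingCountry", "BillingStreet"])
to_contact = [r["Name"] for r in records_missing(customers, "BillingCountry")]

print(profile)     # {'BillingCountry': 2, 'BillingStreet': 1}
print(to_contact)  # ['Globex', 'Initech']
```

The per-field counts show where the gaps are concentrated, and the second list is the outreach list described above.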

A lasting solution to missing data revolves around taking the right steps (as detailed above) and educating your employees about the cost of missing or incorrect data. If they understand the impact of a single missing or hastily filled field on the bottom line, they are a lot more likely to not just ensure better input but also come up with ideas on how data quality can be improved from the bottom-up.

04. Duplicates cause confusion and wasted resources

Put simply, you have duplicate data if you have multiple records for the same account, lead, or contact. These duplicates could exist in a single list, or across several data sources. Regardless, the fact remains that if you have duplicate data, your team is wasting time and resources and missing opportunities.

94% of organizations have duplicate data in some form. If you don’t already have a data quality initiative in place, anywhere from 10 to 30% of your records are probably duplicates, and duplicate records cost businesses in the US upwards of $600 billion each year!

The problems caused by duplicate data are myriad. Say you have a global sales team. Somebody from the US sales team creates an account for IBM, and teams from 3 other regions do the same. That’s valuable time wasted right there.

Or, say a prospect comes in and downloads a white paper. While filling the form, she puts in her personal email address. Later, she signs up for a webinar and uses her business email. And yet again, when downloading a free trial of your product, she uses her nickname “Peggy” instead of “Margaret”.

Your CRM probably has 3 different records for the same person now. And that same person might be receiving 3 entirely different sorts of communication from your team or your marketing automation workflows.

The result? Your salespeople are chasing the same lead thinking it’s 3 different leads, the lead is put off by the discrepancies in the communications she’s receiving, and we know that the large majority will never do business with a company that has delivered a poor customer experience.
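To make the “Peggy” versus “Margaret” scenario concrete, here is a minimal sketch of a matching rule that groups leads by a normalized name key. The nickname table, field names, and sample leads are all hypothetical, and this is exact matching on a normalized key, not the fuzzy matching a dedicated tool would apply. Matching on name alone will also produce false positives in a real database, so treat the groups as candidates for review rather than confirmed duplicates.

```python
# Hypothetical nickname table: nickname -> formal first name.
NICKNAMES = {"peggy": "margaret", "maggie": "margaret", "bob": "robert"}

def match_key(lead):
    """Build a normalized (first name, last name) key for comparing leads."""
    first = lead["first_name"].strip().lower()
    first = NICKNAMES.get(first, first)  # fold nicknames into the formal name
    last = lead["last_name"].strip().lower()
    return (first, last)

def find_duplicates(leads):
    """Group leads that share a match key; return only groups with 2+ records."""
    groups = {}
    for lead in leads:
        groups.setdefault(match_key(lead), []).append(lead)
    return [group for group in groups.values() if len(group) > 1]

# Made-up sample data mirroring the scenario in the text.
leads = [
    {"first_name": "Margaret", "last_name": "Jones", "email": "mjones@personal.example"},
    {"first_name": "Margaret", "last_name": "Jones", "email": "margaret.jones@corp.example"},
    {"first_name": "Peggy", "last_name": "Jones ", "email": "margaret.jones@corp.example"},
    {"first_name": "John", "last_name": "Smith", "email": "jsmith@corp.example"},
]

duplicates = find_duplicates(leads)
print(len(duplicates))     # 1 group of suspected duplicates
print(len(duplicates[0]))  # 3 records in that group
```

All three records for the same person land in one group even though the emails and first names differ, which is exactly the case that exact-field comparison would miss.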

Case Study:

Emerson Process Management attempted to get their CRM in order once they realized that they had over 400 master records for nearly every customer! This huge variety of records was a result of creation in different locations or for different functions depending on how they were associated with the client.

The data architect at Emerson began the dedupe process by matching customer records for connections and similarities from every possible angle, then merging records and selecting survivors to create “golden” records. As a result, Emerson was able to reduce duplication by 75%, increasing revenues dramatically.

To address the data duplication problem, start by asking the right questions.

Your first port of call could be Salesforce’s native duplicate management feature. Start by defining matching criteria/rules to specify what should be considered a duplicate. Salesforce offers some default rules out of the box. Monitor closely and add custom matching rules according to your data quality standards. Measure and improve: that is key if you are serious about improving ROI with your deduplication efforts.

While certainly effective, Salesforce’s duplicate management features work best for small-scale data quality projects. For more comprehensive data quality initiatives focused on improving the business’ bottom line, you need a dedicated industrial-strength solution.

Integrating your Salesforce CRM with DataMatch Enterprise

Salesforce data cleansing with DataMatch Enterprise

Start improving business opportunities and customer experience across the board by fusing the industry’s fastest and most accurate data cleansing software with the industry’s leading CRM. DataMatch helps refocus your attention from worrying about data and configuration to driving your core business, enabling your team to identify increased revenue opportunities from Salesforce data.

The unrivaled speed, accuracy and low cost of DataMatch Enterprise make matching and linking records from all your data repositories a breeze, thanks to the wide variety of integrations that DataMatch Enterprise provides out-of-the-box. Enhance your Salesforce data cleansing strategy by leveraging our native integration with the CRM and advanced record linkage features to find data matches across all supported repositories, regardless of whether your data lives in social media platforms and legacy systems or traditional databases and flat files. Data Ladder integrates with virtually all modern systems to help you get the most out of your data.
