Poor address data is a complex data quality challenge that affects customers, businesses, and even the mailing service. The staggering amount of poor address data has made it compulsory for businesses to invest in robust address verification & standardization tools that will help them get USPS validated addresses easily & effortlessly.
Read along as we help you understand:
- The Cost of Bad Data
- The Problems with Address Data
- Root Causes of Poor Data Quality
- How do You Standardize Address Data?
- What is CASS Address Standardization?
- How to Validate an Address?
- How to Verify an Address with USPS?
- Data Matching – The Most Critical Challenge for Address Verification
- A Case Study – E-Ideas Limited
- Business Strategies to Improve Your Address Data
Let’s dive right in!
The Cost of Bad Address Data
Each year, millions of dollars are wasted on poor address data. The USPS reports that nearly 6.6 billion mail pieces were undeliverable in 2016 alone. Mailers spend over $20 billion on undeliverable-as-addressed (UAA) mail, while direct costs to the USPS are over $1.5 billion per year. All this unnecessary cost arises simply because businesses do not have access to the right address data.
If you do the math based on this preliminary cost, you’re probably spending a significant sum on managing return mail, not to mention the operational cost of verifying information from customers and resending the package.
The Problems with Address Data
It’s human nature to make mistakes. Most of the time, consumers are lax when it comes to providing their address information on physical or web forms. They may misspell a state name, write abbreviations, miss out a street number or forget their ZIP Code. It’s inevitable that some mistakes will be made and incorrect data will be entered.
Here’s an image of what typical unstructured, raw address data looks like. Poor address data is a challenge that puts a severe strain on businesses and their employees. Imagine having to fix these very basic issues for every mailing campaign, promotional activity, and customer report that you have to run. It’s not only mind-bogglingly frustrating but also counter-productive as you try to match and verify each address to ensure it’s accurate and complete. Data scientists, analysts, and business users who need this data must spend days or even months fixing these issues.
Address data is often found to suffer from:
- Incomplete information (missing a street name, a block number, a ZIP code)
- Invalid information (fake addresses and ZIP codes)
- Incorrect information (typos, misspelled names, poor format such as use of abbreviations)
- Inaccurate information (inaccurate apartment or house numbers)
All these problems make address data one of the most difficult types of data to tackle in a data source. They also significantly raise the cost of returned mail and undermine a business’s ability to rely on address data for crucial decisions.
Most of these problems occur due to user input errors and the lack of proper data controls in place.
For example, some people will write just the ZIP code but not the complete address, some will forget to write the ZIP code, and some will write an incomplete address. Some give a fake address. Whatever the reasons for data errors, one thing is certain: for a business to use its data, the data must be clean and valid.
But structural errors are just one part of the problem with bad address data. Other issues could be:
- An address that was once valid but no longer exists.
- Address that is structurally right but does not belong to the customer.
- Address that does not exist in the USPS database.
When this information isn’t checked at the entry stage, it affects all future correspondence, as well as the relationship with that customer. To rectify this, companies will have to spend time calling up each customer to update the data or have them provide the right information again. The problem is, companies are usually short on resources and this is not a very viable operation mode.
Eventually, it boils down to one thing – poor data is inevitable, but it can be fixed. There are plenty of address standardization tools out there that help companies fix poor data by correcting format issues and cleaning up messy data. The process is less time-consuming but may require a learning curve and a basic understanding of data matching, parsing, and deduplication.
Root Causes of Bad Address Data
Human errors are the main, but not the only cause of poor address quality. The challenges of accurate data capture aside, there are many more root causes such as:
According to the Census Bureau, a typical American will move 11.7 times in their lifetime. As housing becomes more expensive and as Americans try to find suitable areas to live in, this number will get higher. Only 60% of movers actually inform the USPS of their move in a timely manner.
Companies are therefore stuck with address data that is out of date. If they are sending out a million bills or promotional mailers a month, they might receive 90,000 move notices in the same month. Worse, if those 90,000 notices represent only the 60% of movers who notify the USPS, roughly 150,000 customers actually moved, and the other 60,000 of those million customers will not have provided the right information to the USPS on time.
Assuming that the same customers are still with the organization, the company will have to keep updating its database and ensure that it has the most recent address to use.
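The arithmetic behind these figures can be checked with a short sketch. It assumes that the 90,000 notices correspond to the 60% of movers who notify the USPS; the variable names are illustrative only.

```python
# Assumption: the 90,000 move notices come from the 60% of movers
# who inform the USPS in a timely manner.
notices_received = 90_000
notify_percent = 60

# Total movers implied by the notices (integer math avoids float rounding).
total_movers = notices_received * 100 // notify_percent

# Movers who never filed a notice, so their records silently go stale.
unreported_movers = total_movers - notices_received

print(f"Implied movers per month: {total_movers:,}")
print(f"Movers with no notice on file: {unreported_movers:,}")
```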
Poor Data Culture:
It’s only recently that companies have started the conversation on being data-driven, and even that is restricted to executive leadership. Employees at their desks are unaware of the level of data quality problems they are dealing with. Moreover, there are no business rules to adhere to when it comes to data quality. There is no training or education for employees to be data-driven, and there’s absolutely no investment in data management tools like DataMatch Enterprise that can bridge the gap between IT applications and business management of data.
Mergers and Acquisitions:
When companies migrate data during a merger and acquisition, the likelihood of data quality errors increases. These mergers happen fast and problems are sometimes unforeseen. There is mounting pressure for consolidation, but no check and balance for quality – in fact, there is seldom a quality management framework in place.
How do You Standardize Address Data?
The process of defining a standard format and applying it across your address data is called address standardization.
Ok, so with the definition out of the way, how do you actually standardize data?
Well, there are two ways to do it: the easy way and the hard way.
The hard way involves exporting that data to Excel and applying formulas and filters to fix it. Don’t believe tutorials that tell you it’s ‘super-easy,’ because it never is.
Take a look at this article as it teaches you how to fix errors on Excel. See the amount of time, effort, and technical knowledge you’ll have to possess to do basic data fixes? The more complex problems get, the longer it takes. If you have to deal with millions of rows of data, data cleaning might become your permanent job.
The easy way?
Use address standardization software. Before you dismiss the idea, here’s why.
The software will obviously save considerable time and effort, but it does more than that.
Address data errors are rarely simple or isolated. As in the example above, you’ve got thousands of rows that have issues. You need a solution that lets you fix all those issues in one go.
If you’re using a best-in-class solution, you can standardize data by:
Assessing Errors via Data Profiling: Imagine being able to get a consolidated overview of everything that’s wrong with your address data. You can see columns with non-printable characters, columns with negative spaces, or even columns with letters in number fields. Data profiling lets you make informed fixes. Unless you know what’s wrong, you’ll be making fixes in the dark.
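To make the idea of profiling concrete, here is a minimal sketch in Python. It is not how DataMatch Enterprise works internally; the function name and the four checks are assumptions chosen to mirror the problems listed above for a numeric column such as a ZIP code.

```python
import re

def profile_numeric_column(values):
    """Count common quality problems in a column that should hold digits."""
    report = {"empty": 0, "non_printable": 0,
              "edge_whitespace": 0, "letters_in_number": 0}
    for value in values:
        if value is None or value.strip() == "":
            report["empty"] += 1
            continue
        # Tabs, control characters and similar are not printable.
        if any(not ch.isprintable() for ch in value):
            report["non_printable"] += 1
        # Leading or trailing whitespace is invisible but breaks matching.
        if value != value.strip():
            report["edge_whitespace"] += 1
        if re.search(r"[A-Za-z]", value):
            report["letters_in_number"] += 1
    return report

# Hypothetical sample column for illustration.
zip_column = ["90210", " 10001", "1000A", "", "30301\t"]
print(profile_numeric_column(zip_column))
```

A report like this tells you which fixes are worth making before you touch a single row.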
Parsing Addresses to Resolve Specific Issues: Part of address cleanup requires you to parse, or break down, an address into its parts (city, state, ZIP code, etc.) and fix each at its own level. For example, with DataMatch Enterprise, you can specifically fix ZIP codes and ensure that they meet the ZIP+4 or ZIP+6 format.
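As a rough illustration of parsing, the sketch below splits a one-line US address into its parts with a regular expression. It assumes the input already follows a "street, city, ST ZIP" shape; real-world data is messier, and the pattern and function name here are hypothetical rather than anything a specific product uses.

```python
import re

# Assumed input shape: "<street>, <city>, <ST> <ZIP or ZIP+4>"
ADDRESS_RE = re.compile(
    r"^\s*(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5})(?:-(?P<plus4>\d{4}))?\s*$"
)

def parse_address(raw):
    """Split a one-line address into parts, or return None if it does not fit."""
    match = ADDRESS_RE.match(raw)
    if match is None:
        return None
    parts = match.groupdict()
    parts["street"] = parts["street"].strip()
    parts["city"] = parts["city"].strip()
    parts["state"] = parts["state"].upper()
    return parts

print(parse_address("123 Main St, Springfield, il 62704-1234"))
print(parse_address("Springfield 62704"))  # does not fit the assumed shape
```

Once the parts are separated, each one (for example the ZIP code) can be checked and fixed on its own.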
Cleaning Up Messy Data: Clean up formatting issues, remove negative spaces and non-printable characters in one sweep. It’s imperative to clean up your address data and standardize it according to the USPS guidelines (see below) before you can verify it.
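A cleanup pass of this kind can be sketched in a few lines of Python. This is only an illustration of the idea; the function name is made up, and a production tool would apply many more rules.

```python
import re

def clean_field(value):
    """Drop non-printable characters and collapse stray whitespace."""
    # Turn tabs, newlines and other whitespace into plain spaces first.
    value = re.sub(r"\s", " ", value)
    # Remove anything still non-printable (control and zero-width characters).
    value = "".join(ch for ch in value if ch.isprintable())
    # Collapse runs of spaces and trim both ends.
    return re.sub(r" {2,}", " ", value).strip()

print(repr(clean_field("  12\tOak\u200b  Ave\x00 \n")))
```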
Removing Duplicates with Data Matching: Cleaning messy data is just part of the deal – the stressful part is weeding out duplicates. If you’ve got thousands of rows of customer data that hasn’t been sorted in a long time, chances are you’ve got duplicates and they are not always exact in nature.
Take a look at this table:
See how one customer has five different addresses entered in multiple ways? Now this is not something you can sort easily unless you use a powerful data quality tool.
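To show why near-duplicates need more than an exact comparison, here is a small sketch using Python's standard `difflib` module. It is a generic similarity check, not Data Ladder's matching algorithm, and the 0.85 threshold and sample records are assumptions for illustration.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_probable_duplicates(records, threshold=0.85):
    """Return index pairs whose similarity is at or above the threshold."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs

# Hypothetical records: the first two differ only by a comma.
records = [
    "123 Main Street, Springfield",
    "123 Main Street Springfield",
    "9 Elm Road, Portland",
]
print(find_probable_duplicates(records))
```

Comparing every pair is fine for a small list but grows quadratically, which is one reason dedicated tools are used for large datasets.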
Data Survivorship & Export: You should be able to easily create a master record and export it as a final list to your team without having to copy/paste or manually load it into an acceptable format.
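Survivorship and export can be sketched as follows. The rule used here, keeping the longest non-empty value per field, is just one hypothetical survivorship rule chosen for illustration; real tools let you pick from several.

```python
import csv
import io

def build_master(records):
    """Merge duplicate records, keeping the longest non-empty value per field."""
    master = {}
    for record in records:
        for field, value in record.items():
            value = (value or "").strip()
            if len(value) > len(master.get(field, "")):
                master[field] = value
    return master

def to_csv(rows, fieldnames):
    """Serialize rows to CSV text so the list can be handed to another team."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical duplicates of one customer.
duplicates = [
    {"name": "J Smith", "zip": "62704"},
    {"name": "John Smith", "zip": ""},
]
print(to_csv([build_master(duplicates)], ["name", "zip"]))
```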
CASS Standardization: Any address standardization software must have CASS Standardization. DataMatch Enterprise, for example, is a CASS certified address standardization solution with a CASS database that is updated every month.
What is CASS Standardization?
Software that corrects or matches addresses needs to be certified by the USPS. This is done via the Coding Accuracy Support System (CASS), which the USPS uses to verify the accuracy of the software. CASS certification is available to software vendors that want the USPS to evaluate the quality of their address-matching software and to improve the accuracy of ZIP+4 and five-digit coding.
Because the USPS updates its address data regularly, CASS Certified software vendors are required to annually renew their certification with the USPS. All certified CASS products are listed on the USPS website.
What is the USPS Standardization Guideline?
Here are the rules:
- Always put the address and the postage on the same side of your mailpiece.
- On a letter, the address should be parallel to the longest side.
- All capital letters.
- No punctuation.
- At least 10-point type.
- One space between city and state.
- Two spaces between state and ZIP Code.
- Simple type fonts.
- Left justified.
- Black ink on white or light paper.
- No reverse type (white printing on a black background).
- If your address appears inside a window, make sure there is at least 1/8-inch clearance around the address. Sometimes parts of the address slip out of view behind the window and mail processing machines can’t read the address.
- If you are using address labels, make sure you don’t cut off any important information. Also make sure your labels are on straight. Mail processing machines have trouble reading crooked or slanted information.
Address standardization is the prerequisite for effective address validation. You need to ensure your address meets the USPS guidelines before your data can be verified against the USPS database.
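Several of the text rules listed above (all capital letters, no punctuation, one space between city and state, two spaces between state and ZIP Code) can be applied mechanically. The sketch below shows one way to do it; the function names are hypothetical, and keeping the hyphen so that ZIP+4 codes survive is an assumption of this example.

```python
import re

def usps_text(text):
    """Uppercase, strip punctuation (keeping hyphens), and collapse spaces."""
    # Assumption: hyphens are kept so ZIP+4 codes such as 63101-1234 survive.
    text = re.sub(r"[^\w\s-]", "", text.upper())
    return re.sub(r"\s+", " ", text).strip()

def format_last_line(city, state, zip_code):
    """Build the city/state/ZIP line of an address block."""
    # One space between city and state, two spaces between state and ZIP Code.
    return f"{usps_text(city)} {usps_text(state)}  {usps_text(zip_code)}"

print(usps_text("123 N. Main St., Apt. 4"))
print(format_last_line("St. Louis", "mo", "63101-1234"))
```

Rules about ink, fonts, and label placement concern the physical mailpiece and cannot be enforced in the data itself.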
Address Verification or Validation – What is the Difference?
You’ll often see the terms ‘validation’ and ‘verification’ used interchangeably when it comes to address data. The difference is more contextual than lexical. Data Ladder uses the term Address Verification to mean verifying addresses against the USPS database. Other organizations verify addresses against billing records, driver’s licenses, bank statements, etc. That’s a completely different service and one that most companies don’t need.
Other vendors use ‘Address Validation’ for the same matching against the USPS database. In this guide, we’ll stick with address verification.
Address Verification – How to Verify Address Data with the USPS
The address verification process is simple. You match your now standardized data against the government database or any other authoritative standard. If you’re in the US, the USPS database is the only one you should be matching your data against.
If your address data is clean and standardized, this process takes minutes. If you’re using DataMatch Enterprise, you can match the whole address or just parts of it, based on 50 active elements including geocoded locations, meaning you can verify addresses to a T!
Some of the most popular fields against which our clients require verification include:
V Status – Is the record verified (Yes/No)
V Residential Delivery Indicator – Defines if the residential address may receive direct deliveries to the door
V Company Firm
V Primary Address
V Secondary Address
V Zip Code – 5 digits (USA)
V Postal Code (Canada)
V Plus4 – Additional 4 digits associated with the 5 digit Zip Code
There are 54 fields that you can use to validate your address data.
Once you match your address list against these components, you’ll be given a return value such as:
- 10 = Invalid Address
- 11 = Invalid ZIP code
- 12 = Invalid State Code
- 13 = Invalid City
- 21 = Address not found
You’ll also be prompted with warnings such as:
B# City/State Corrected
C# Invalid city/state/zip
D# No ZIP assigned
E# ZIP assigned for multiple response
F# No ZIP available
G# Part of firm moved to address
H# Secondary number missing
I# Insufficient/incorrect data
J# Dual input
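When you post-process verification results, it helps to turn these codes into readable messages. The sketch below copies the codes and descriptions from the lists above; the lookup helper and the way it treats the trailing "#" are assumptions for illustration, not part of any vendor API.

```python
# Return codes and warnings copied from the lists above.
RETURN_CODES = {
    "10": "Invalid Address",
    "11": "Invalid ZIP code",
    "12": "Invalid State Code",
    "13": "Invalid City",
    "21": "Address not found",
}
WARNING_CODES = {
    "B": "City/State Corrected",
    "C": "Invalid city/state/zip",
    "D": "No ZIP assigned",
    "E": "ZIP assigned for multiple response",
    "F": "No ZIP available",
    "G": "Part of firm moved to address",
    "H": "Secondary number missing",
    "I": "Insufficient/incorrect data",
    "J": "Dual input",
}

def describe(code):
    """Hypothetical helper: map a return or warning code to its description."""
    code = code.strip().upper()
    if code in RETURN_CODES:
        return RETURN_CODES[code]
    # Assumption: warnings may arrive with or without the trailing "#".
    return WARNING_CODES.get(code.rstrip("#"), "Unknown code")

for sample in ["21", "H#", "zz"]:
    print(sample, "->", describe(sample))
```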
If you’d like to know more about this, feel free to hit us up for a quick demo!
Ok, so moving on:
Data Matching – The Most Critical Challenge for Address Verification
The customers that come to us always have one complaint: they are never able to get a good match rate. And we agree!
Data matching is still an area for improvement. There are very few vendors who can give a 100% accurate matching rate. You really need that figure or, failing that, at least 95%. The reason is that for the verification to work, your address field must find a match in the USPS database. If most of your matches are missed because the software relies on exact or deterministic matches, then it’s not going to work in your favor.
Therefore, when choosing an address verification & standardization software, you must be able to assess its data matching rate. Of a hundred rows, how many rows did the tool miss, and why? Chances are you’ll see that the software fails to pick up near or close matches and relies solely on exact characters to identify a match.
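One way to run this kind of assessment yourself is to measure the match rate of a sample list twice: once with exact matching and once with a similarity threshold. The sketch below uses Python's standard `difflib` as a stand-in for a fuzzy matcher; the reference list, sample inputs, and 0.8 threshold are all assumptions for illustration.

```python
from difflib import SequenceMatcher

def match_rate(inputs, reference, threshold=1.0):
    """Share of inputs matching some reference entry at the given threshold.

    A threshold of 1.0 means exact (case-insensitive) matching only.
    """
    def matches(value):
        return any(
            SequenceMatcher(None, value.upper(), ref.upper()).ratio() >= threshold
            for ref in reference
        )
    hits = sum(1 for value in inputs if matches(value))
    return hits / len(inputs)

# Hypothetical reference data and raw inputs.
reference = ["123 MAIN ST", "45 OAK AVE"]
inputs = ["123 Main St", "45 Oak Avenue", "45 Oka Ave", "999 Nowhere Blvd"]

print("Exact match rate:", match_rate(inputs, reference))
print("Fuzzy match rate:", match_rate(inputs, reference, threshold=0.8))
```

The gap between the two numbers shows how many near matches an exact-only tool would miss on your own data.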
Data Ladder’s DataMatch Enterprise is primarily a data matching solution that has been used by government institutions and Fortune 500 companies such as HP, Coca Cola, Deloitte and many others. We’re known for matching data at up to a 100% accuracy rate. That’s because Data Ladder uses a combination of fuzzy matching algorithms and its established proprietary algorithms to identify even the most distant probable matches.
P.S – Data matching is resource-intensive. Save your team time and manual effort. Learn how in this detailed blog post.
Here’s a case study revealing just how challenging it is even for a data supplier to ensure accurate data matching.
A Case Study – E-Ideas Limited
We spoke to Artem Axenov, Operations Manager at E-Ideas Limited, a boutique B2B marketing agency based in Wellington. The agency manages a large database of businesses for marketing purposes which means they have to take extra care of address data – a significant challenge that involves lots of manual work on Excel.
1. How does your agency deal with the problem of bad data?
We often deal with clients who already have a list of customers, but the data is badly formatted. There are a few automatic tasks you can do to resolve it but in the end, it’s a manual job. First, you need to decide what format you’ll use. Then the simplest way to fix badly formatted data is to sort it one column at a time and then make the required changes to bring it up to scratch. There are some formulae in Excel that help to split up or combine data – to split you can use MID and LEFT together. And to combine data you can use CONCATENATE.
By sorting data first you’ll group together sets of addresses that have the same formatting issues and this makes it far easier to deal with them at once.
2. How has your experience been with address verification and validation tools?
Our experience with any type of address validation or verification tool has always been a mixed bag. At the end of the day, no tools we’ve used have managed to produce a high match. And this comes down to vastly different ways of storing addresses. They are useful in making a head start on the process but in the end, there is always a significant amount of manual work involved to finish the job.
3. What’s the most troubling data matching problem?
The main issue is that, whatever automatic matching is done, if the data isn’t formatted in the exact way the tool is programmed to identify it, the match doesn’t go through. This could be as small as Street being recorded as St, Avenue as Ave, etc.
4. What kind of manual tasks do you have to do after using an address validation software?
Usually it’s just a matter of looking through the data with a human eye to catch out any inconsistencies and correct them. In NZ for instance, the postal service has a very specific format that addresses need to be kept in to get the bulk mail discount. Nothing is complicated but again small things like Street being recorded as St will be counted against you. Or another example is if you have your PO Box recorded as P.O. Box – it doesn’t recognize this as correctly formatted. Even things like leading or trailing spaces can count against you – and some of those are hard to pick up because when you’re looking at the address you can’t see what’s wrong!
5. How has bad address data affected your business?
We’ve only encountered problems as far as having to put in extra man-hours to get data up to scratch to qualify for the postal discount. There is a test it has to pass called the Statement of Accuracy – which verifies the data automatically to ensure 80% of it is correctly formatted. We’ve had a number of cases where we have ended up spending days longer manually formatting data to ensure it’s correctly formatted.
The practice we have implemented now is to store all our data in the correct format. This took us a long time to get everything to this standard but it now means when we deliver data to our clients it’s NZ Post ready and there is no further work to be done.
This agency’s struggles with bad address data result in extra man-hours that affect operational efficiency. Despite the use of address validation tools, the inability to produce a high match makes it very difficult to validate address data. Hence, it’s necessary to choose a tool that allows the user full-fledged capabilities of data preparation and standardization while also returning a high match. This is only possible with best-in-class data preparation and matching software like DataMatch Enterprise that allows the user to prepare and clean address data while also returning a high match result even with erroneous text.
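The formatting issues raised in the interview (Street recorded as St, P.O. Box versus PO Box, leading or trailing spaces) are the kind of thing a small normalization step can catch before matching. The sketch below assumes a house convention of spelled-out suffixes and "PO Box"; that convention, the suffix table, and the function name are hypothetical and are not the NZ Post specification.

```python
import re

# Hypothetical house convention: spell street suffixes out in full.
SUFFIXES = {"st": "Street", "ave": "Avenue", "rd": "Road"}
SUFFIX_RE = re.compile(r"\b(st|ave|rd)\.?$", re.IGNORECASE)
PO_BOX_RE = re.compile(r"\bP\.?\s*O\.?\s*Box\b", re.IGNORECASE)

def normalize_line(line):
    """Trim spaces, unify PO Box spelling, and expand a trailing suffix."""
    # Leading, trailing and repeated spaces are invisible but break matches.
    line = re.sub(r"\s+", " ", line).strip()
    line = PO_BOX_RE.sub("PO Box", line)
    # Only expand the last word, so "St Johns Rd" keeps its "St".
    return SUFFIX_RE.sub(lambda m: SUFFIXES[m.group(1).lower()], line)

for raw in ["  12 High St ", "p.o. box 123", "45 Queen Ave.", "12 St Johns Rd"]:
    print(repr(normalize_line(raw)))
```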
Business Strategies for Address Data Management
Bad address data is a data quality problem. While you can use tools to make fixes, you’ll still need to implement business strategies to curb bad data affecting operational processes. Some of these strategies can include:
The first step towards quality is training – make sure people who are handling, interacting, using and entering data know the impact they have in the process and on downstream applications. They need to understand the consequences of bad data on the entire organization and not just on one member or customer. Employees practicing data quality rules should be rewarded and appreciated.
Tool List for Data Management:
Having tools around that can help business users and IT professionals alike manage the data is crucial. Identify the tools you need for data cleansing and data management to help both IT and business users have a non-intimidating relationship with data.
Involve Business Users in the Quality Process:
Data is not just an IT problem. Business users are equally responsible for managing data. In fact, they are the sole owners of customer data that is often used for marketing and sales purposes. This is why they need to be involved in the process and also need to be trained to use data management tools.
Set up a data governance team to create a data management plan and ensure that the organization follows it, so that each employee understands the plan, their role within it, and the expectations that come with that role.
Lock Down Data & User Roles:
If anyone in your team can open up the CRM or the data source, muddle around with data and leave no footprints, you are in for serious trouble. It’s necessary to create master data holders who have the rights to access, enter or process critical data. This should come in the data management plan.
You’re not a victim of bad data. If you accept the gravity of the situation, cultivate a data-driven culture & strive to manage the challenges that come with data management, you can very well get data that requires only basic clean up to be put to use.
How Does DataMatch Enterprise Help?
Our product is CASS Certified, meaning we meet and exceed USPS requirements for address quality and accuracy. We also help you with bulk address validation ensuring elements such as Zip codes, town, and city names are verified and validated. The best benefit of using Data Ladder‘s DataMatch Enterprise? The software finds and matches data even if it is incomplete with a 96% accuracy rate. Furthermore, you can use the software to get real-time address verification ensuring you have correct addresses in your database.
Using algorithms that determine a match based on areas of similarity, our platform makes sense of unusable data and derives connections between datasets. Whether it’s spelling errors or incomplete zip codes, abbreviations or typos, we sort through large amounts of data to help you make sense of it.
Bad address data is inevitable, but that doesn’t mean you should let it affect your business performance. Manually fixing address data will cost you more time and effort, and yet you won’t be able to standardize or verify it unless you use a CASS certified solution.
Don’t drown in bad data. We’re here to help.
To see how we can help you with address verification and standardization, get in touch with one of our solution experts today and get address data you can use for its intended purpose.