Every data cleaning situation is unique

A quick post today.

One often overlooked fact is that every data cleansing situation is unique.

Take a simple customer deduplication (removal of duplicates) exercise. At first glance it looks like a very simple problem: identify the duplicates and remove them. However, once you get into the details, you realize there are several items worth considering.

1. How do you identify a duplicate? Is it the company name? Contact name? Address? Maybe you deal with two completely different offices of the same customer (IBM in Australia and IBM in the UK, for instance).

2. Do you want to remove all information about a duplicate contact? There may be important contact information or customer notes associated with the record.

3. Have all affected stakeholders in your organization been made aware that the cleanup is taking place? There may be individuals and departments inside your own organization who should be notified to ensure no unintended consequences occur.

4. Are there any new standards you’d like to apply? For example, capitalizing street suffixes or splitting a full name field into separate first and last name fields.
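To make questions 1, 2 and 4 concrete, here is a rough Python sketch. It is only an illustration, not how any particular tool does it: the field names (company, country, contact, notes), the sample records, and the helper functions dedupe and split_full_name are all made up for this example. It assumes a duplicate means "same company and same country", so the IBM offices in Australia and the UK stay separate, and it merges the notes from duplicates instead of throwing them away.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(text.lower().split())


def dedupe(records, key_fields=("company", "country")):
    """Group records on the chosen key fields and merge each group into one record.

    Which fields define a duplicate is an assumption you have to make
    for your own data; here it defaults to company plus country.
    """
    groups = defaultdict(list)
    for rec in records:
        key = tuple(normalize(rec[field]) for field in key_fields)
        groups[key].append(rec)

    merged = []
    for recs in groups.values():
        survivor = dict(recs[0])  # keep the first record in each group as the base
        # Keep every non-empty note so no customer history is lost.
        notes = [r["notes"] for r in recs if r.get("notes")]
        survivor["notes"] = " | ".join(notes)
        merged.append(survivor)
    return merged


def split_full_name(full_name):
    """Naive split: the first word is the first name, the rest is the last name."""
    first, _, last = full_name.strip().partition(" ")
    return first, last.strip()


# Made-up sample records, for illustration only.
customers = [
    {"company": "IBM", "country": "Australia", "contact": "Ann Lee", "notes": "Prefers email"},
    {"company": "ibm ", "country": "australia", "contact": "Ann Lee", "notes": "Renewal due in May"},
    {"company": "IBM", "country": "UK", "contact": "Raj Patel", "notes": ""},
]

cleaned = dedupe(customers)
```

With these sample records, cleaned holds two customers: one IBM Australia record carrying both notes, and one IBM UK record. Calling dedupe(customers, key_fields=("company",)) instead would collapse all three into a single record, which shows how much the answer to question 1 changes the result.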

Note that Data Ladder is here to walk you through these issues, which is why we give free personalized WebEx demonstrations addressing your specific data cleansing activity.

Any other big questions that I missed? Feel free to comment below. We welcome your input, and thank you for taking part in the conversation.