Merge purge software gives users an efficient way to combine lists and remove unwanted records within large data sets. In a recent independent study, Data Ladder’s merge purge software, DataMatch Enterprise, outperformed products from major companies such as IBM and SAS on both accuracy and speed.
Merge purging is the process of combining two or more lists or files, identifying and/or combining duplicates, and eliminating (purging) unwanted records. Phonetic and fuzzy matching algorithms locate records that differ only through human error, such as misspellings or typos, so that they can be corrected or removed. Whether it involves merging records from multiple lists or purging addresses from others, merge purging cleans the underlying data set in order to achieve several goals, including:
- Improve productivity
- Reduce duplicate mailings
- Increase customer satisfaction
- Drive efficient marketing campaigns
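The merge and purge steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DataMatch's actual algorithm: it uses the standard library's `difflib.SequenceMatcher` as a stand-in fuzzy matcher, and the function names, the sample names, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def merge_purge(lists, threshold=0.85):
    """Merge several lists of names and purge fuzzy duplicates.

    The first occurrence of each name is kept; later names that are at
    least `threshold` similar to a kept name are treated as duplicates.
    The threshold value is an assumption chosen for this example.
    """
    kept, purged = [], []
    for source in lists:
        for name in source:
            if any(similarity(name, k) >= threshold for k in kept):
                purged.append(name)
            else:
                kept.append(name)
    return kept, purged


# Hypothetical sample data: two mailing lists with a typo and a case variant.
list_a = ["Jonathan Smith", "Mary Jones"]
list_b = ["Jonathon Smith", "Robert Brown", "mary jones"]

kept, purged = merge_purge([list_a, list_b])
print(kept)
print(purged)
```

Comparing every incoming record against every kept record is quadratic, so a sketch like this only suits small lists; larger data sets usually need some form of grouping before comparison.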
What Does Data Ladder’s Merge Purge Software Do?
DataMatch lets users take control of their merge parameters and match records by degree of similarity using fuzzy logic algorithms. Whether users need to select records based on the data updated or create their own match codes, DataMatch’s merge purge tool leaves the choice to them.
Merge purging a database can be expensive and time consuming, but data quality tools such as merge purge software make the task much easier.
A few of the features of DataMatch’s merge purge software tool include:
- Create a cross-section of lists and select specific names from a designated group of records or files
- Look at matches and make decisions based on particular needs
- Work with all types of lists
- List suppression and record removal
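List suppression, the last feature above, can be pictured as filtering a mailing list against a second list of entries that must not be contacted. The sketch below is a generic illustration, not DataMatch's interface; the `email` field, the function names, and the sample records are assumptions made for the example.

```python
def normalize_email(email: str) -> str:
    """Lower-case and trim an email address so that trivial variants compare equal."""
    return email.strip().lower()


def suppress(records, suppression_list):
    """Split records into those to keep and those removed by the suppression list."""
    blocked = {normalize_email(e) for e in suppression_list}
    kept, removed = [], []
    for rec in records:
        if normalize_email(rec["email"]) in blocked:
            removed.append(rec)
        else:
            kept.append(rec)
    return kept, removed


# Hypothetical mailing list and do-not-mail list.
mailing_list = [
    {"name": "Mary Jones", "email": "mary@example.com"},
    {"name": "Robert Brown", "email": "Robert@Example.com "},
]
do_not_mail = ["robert@example.com"]

kept, removed = suppress(mailing_list, do_not_mail)
print([r["name"] for r in kept])
print([r["name"] for r in removed])
```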
Merge Purge: Best Practices
Understanding best practices is often the first step toward a successful merge purge. The process typically begins by combining databases from different sources (such as SQL Server, MySQL, Excel, or ODBC connections).
DataMatch imports from, combines, and exports to the most common database formats, and it automatically maps similar fields from different data sources to one another. Here are a few of the best practices for merge purging data:
- Fuzzy logic identification of percent matches between records, with minimum percent match thresholds set by field
- Acronym identification for matching
- Cleaning and standardizing data prior to matching (standardizing street abbreviations, eliminating unnecessary syntax in phone numbers, etc.)
- Applying libraries for standardization, especially for first names (such as Jon, Jonathan, and John)
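Several of these practices (standardizing before matching, applying a first-name library, and setting a minimum percent match per field) can be combined in one short Python sketch. Everything here is illustrative: the tiny lookup tables, the field names, and the threshold values are assumptions, and `difflib.SequenceMatcher` stands in for whatever fuzzy logic a real tool uses.

```python
import re
from difflib import SequenceMatcher

# Hypothetical standardization libraries; real ones would be far larger.
NICKNAMES = {"jon": "jonathan", "bob": "robert"}
STREET_WORDS = {"st": "street", "st.": "street", "ave": "avenue", "ave.": "avenue"}


def clean_first_name(name: str) -> str:
    """Map a nickname to its standard form, if the library knows it."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def clean_address(address: str) -> str:
    """Expand common street abbreviations and normalize case and spacing."""
    words = address.strip().lower().split()
    return " ".join(STREET_WORDS.get(w, w) for w in words)


def clean_phone(phone: str) -> str:
    """Strip every non-digit character (parentheses, dots, dashes, spaces)."""
    return re.sub(r"\D", "", phone)


CLEANERS = {"first_name": clean_first_name, "address": clean_address, "phone": clean_phone}

# Assumed minimum match ratio per field (1.00 means an exact match is required).
THRESHOLDS = {"first_name": 0.80, "address": 0.85, "phone": 1.00}


def is_match(rec_a: dict, rec_b: dict) -> bool:
    """Two records match only if every field meets its own threshold."""
    for field, minimum in THRESHOLDS.items():
        a = CLEANERS[field](rec_a[field])
        b = CLEANERS[field](rec_b[field])
        if SequenceMatcher(None, a, b).ratio() < minimum:
            return False
    return True


r1 = {"first_name": "Jon", "address": "12 Main St.", "phone": "(555) 010-2000"}
r2 = {"first_name": "Jonathan", "address": "12 Main Street", "phone": "555.010.2000"}
r3 = {"first_name": "Joan", "address": "12 Main Street", "phone": "555-010-2999"}

print(is_match(r1, r2))
print(is_match(r1, r3))
```

Cleaning first means the thresholds only have to absorb genuine differences, rather than formatting noise such as punctuation in phone numbers.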
Survivorship is critical to a successful merge purge. When duplicate records are removed, someone must decide which piece of data should stay (survive) and which should be deleted.
In the above example, there are two versions of the same record, with small differences. A single master record must be chosen to maintain data quality. With DataMatch, you can select which field of data survives, which field to merge on (in our example, it would be the customer number), and in which order (ascending or descending).
Standard merge purge software can remove important business data, whereas DataMatch retains every piece of information belonging to the same master record in a new data field.
The final result would look like this:
All of the alternate information is captured in a brand new field, so the user has the benefit of a single master record without data loss. DataMatch never deletes any information from the source files; all information is kept temporarily in memory, so the user can test various merge purge settings.
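The survivorship idea can be sketched as follows: group records by the merge field, rank each group by a chosen field, keep the top-ranked record as the master, and store any differing values from the other records in a new field. This is a hypothetical illustration rather than DataMatch's implementation; the function name, the `alternates` field, and the sample records are invented for the example and do not reproduce the example referred to above.

```python
from collections import defaultdict


def apply_survivorship(records, merge_on, order_by, descending=True):
    """Collapse duplicates into one master record per `merge_on` value.

    Within each group, the record ranked first by `order_by` survives.
    Differing values from the other records are kept in a new
    'alternates' field, so nothing is lost. Source records are not modified.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[merge_on]].append(rec)

    masters = []
    for group in groups.values():
        ranked = sorted(group, key=lambda r: r[order_by], reverse=descending)
        master = dict(ranked[0])  # copy, so the source record stays untouched
        alternates = {}
        for other in ranked[1:]:
            for field, value in other.items():
                seen = alternates.setdefault(field, [])
                if value != master.get(field) and value not in seen:
                    seen.append(value)
        # Drop fields for which no alternate value was found.
        master["alternates"] = {f: v for f, v in alternates.items() if v}
        masters.append(master)
    return masters


# Hypothetical records: two versions of customer 1001, one of customer 1002.
records = [
    {"customer_number": 1001, "name": "Jon Smith", "city": "Boston", "updated": "2019-03-01"},
    {"customer_number": 1001, "name": "Jonathan Smith", "city": "Boston", "updated": "2021-07-15"},
    {"customer_number": 1002, "name": "Mary Jones", "city": "Hartford", "updated": "2020-01-10"},
]

masters = apply_survivorship(records, merge_on="customer_number", order_by="updated")
for m in masters:
    print(m)
```

Here the most recently updated record survives because `descending=True`; passing `descending=False` would keep the oldest one instead.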
DataMatch also offers several output options, including:
- Writing duplicate records to a separate file
- Working with different field sizes and list structures
- Merged mailing lists