DataMatch Enterprise Documents: Data Cleansing and Standardization

You are now ready to start manipulating the data you have imported.

    1. The field names are the column headers from your data.
    2. Field type will automatically standardize the field based on the type you select:
        1. Full Name: This will parse the full name located in the field into Prefix, First Name, Common Name, Last Name, Middle Name, and Suffix.
          1. The common name comes from an internal database that we have integrated with DataMatch Enterprise. This feature will try to resolve nicknames (e.g., Matt, Jimmy, Bill) to their formal names (e.g., Matthew, James, William).
        2. FirstName: Selecting this will generate a new column that relates the first name to a Common Name (e.g., Terry –> Terrance), and will also generate a column for gender.
        3. Address: This works very well when you have multiple columns for addresses. Selecting this will create new columns and distribute the address information among them. The auto-created columns include the following: Recipient, Street Number, Pre-Direction (NE, S, N), Street, Post-Direction (NE, S, N), Street Suffix, PO Box, PO Box Number, Secondary Address Unit (Suite, Office, etc.), Secondary Address Unit Number, Zip Code.
        4. Zip: Selecting this will parse the zip code into two fields, one for the traditional 5-digit zip and one for the additional 4 digits that may exist.
        5. Address and Company Validation: This is a premium feature and does not ship with the standard copy of DataMatch Enterprise. To validate the highest number of addresses, apply as many “V” type addresses as possible. At a minimum, include a VPrimaryAddress and a VZipCode.

    3. If you plan on changing several aspects of a field, you may want to make a copy of the field in order to preserve the original data.
    4. This option will invert the case: lower-case letters become upper case and upper-case letters become lower case.
    5. This will change all letters to upper case.
    6. This will change all letters to lower case.
    7. Proper case will make the first letter of every word upper case and the rest of the word lower case.
    8. This will remove all non-printable characters, such as carriage returns.
    9. If you wish to replace non-printable characters with another character, you can enter the value here.
    10. For empty (NULL) fields, you can add in a more meaningful value.
    11. Remove leading spaces to avoid issues with matching.
    12. Remove trailing spaces to avoid issues with matching.
    13. In this field, you may enter specific characters you wish to remove.
    14. Characters to replace is an enhanced find and replace function that will allow you to create multiple rules.
    15. Remove spaces will remove all spaces before, after, and between the text of a field.
    16. Remove letters will remove any letters that might exist in a field.
    17. Remove numbers will remove any numbers that might exist in a field.
    18. Replace Zeros with O’s will replace the number 0 with the capital letter O.
    19. Selecting this will replace only capital O’s with zeros.
    20. Pattern Builder (Based on Regular Expressions) – This feature is described on the Pattern Builder page.
    21. WordSmith® – This feature is described on the WordSmith® page.
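
To make the behavior of several of the options above more concrete, here is a minimal Python sketch of the same transformations. It is an illustration only, not DataMatch Enterprise's implementation: every function name is hypothetical, and treating Unicode control characters as "non-printable" is an assumption.

```python
import re
import unicodedata


def invert_case(value: str) -> str:
    # Lower-case letters become upper case and vice versa.
    return value.swapcase()


def proper_case(value: str) -> str:
    # First letter of every space-separated word upper case, the rest lower case.
    return " ".join(w[:1].upper() + w[1:].lower() for w in value.split(" "))


def replace_non_printable(value: str, replacement: str = "") -> str:
    # Assumption: "non-printable" means Unicode control characters
    # (category "Cc"), which includes carriage returns, line feeds, and tabs.
    # With the default empty replacement, the characters are simply removed.
    return "".join(
        replacement if unicodedata.category(ch) == "Cc" else ch for ch in value
    )


def fill_empty(value, default: str) -> str:
    # Substitute a more meaningful value for empty (None or "") fields.
    return default if value is None or value == "" else value


def remove_spaces(value: str) -> str:
    # Removes spaces before, after, and between the text of a field.
    return value.replace(" ", "")


def remove_letters(value: str) -> str:
    # Assumption: only ASCII letters are treated as letters here.
    return re.sub(r"[A-Za-z]", "", value)


def remove_numbers(value: str) -> str:
    return re.sub(r"[0-9]", "", value)


def zeros_to_os(value: str) -> str:
    # Replace the number 0 with the capital letter O.
    return value.replace("0", "O")


def os_to_zeros(value: str) -> str:
    # Replace only capital O's with zeros; lower-case o is left alone.
    return value.replace("O", "0")
```

For example, `zeros_to_os("R0BERT")` yields `"ROBERT"` and `os_to_zeros("9O21O")` yields `"90210"`. Leading and trailing spaces can be removed with Python's built-in `str.lstrip()` and `str.rstrip()`.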

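The Zip and Common Name field types described in the list can be pictured with a short Python sketch. This is a hedged illustration under stated assumptions, not the product's actual logic: `parse_zip`, `common_name`, and the tiny `NICKNAMES` table are hypothetical stand-ins (the real common-name data comes from an internal database), and the accepted zip formats are a guess.

```python
import re

# Assumption: a zip is 5 digits, optionally followed by a hyphen or space
# (or nothing) and 4 more digits.
ZIP_PATTERN = re.compile(r"^\s*(\d{5})(?:[-\s]?(\d{4}))?\s*$")


def parse_zip(value: str) -> tuple:
    # Split a zip code into (5-digit zip, additional 4 digits).
    # Unrecognized input yields two empty strings.
    match = ZIP_PATTERN.match(value)
    if match is None:
        return ("", "")
    return (match.group(1), match.group(2) or "")


# Hypothetical stand-in for the internal nickname database,
# using only the examples given in the text.
NICKNAMES = {
    "matt": "Matthew",
    "jimmy": "James",
    "bill": "William",
    "terry": "Terrance",
}


def common_name(first_name: str) -> str:
    # Resolve a nickname to its formal name; unknown names pass through.
    cleaned = first_name.strip()
    return NICKNAMES.get(cleaned.lower(), cleaned)
```

Here `parse_zip("06810-1234")` returns `("06810", "1234")`, `parse_zip("90210")` returns `("90210", "")`, and `common_name("Jimmy")` returns `"James"`.
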
Want to see DME’s Data Cleansing and Standardization in action? Check out this video.
