DataMatch Enterprise Documents: Pattern Builder

 

The Pattern Builder is based on regular expressions (RegEx). It lets users build and search for patterns within a column and parse each match into a new column. The feature uses a graphical interface and requires no detailed knowledge of regular expression syntax: users can create complex patterns simply by dragging and dropping.

For those who are new to regular expressions: in DataMatch Enterprise, a regular expression is a special text string that describes a pattern to search for in a column of data.

There are many system patterns that a user can deploy, organized into categories. The custom category contains patterns that the user has already built. To create a new pattern with the Designer, click New and select Create with Regexp Designer. Users who know RegEx syntax may select the Create Manually option to bypass the graphical interface.

Patterns are created by adding/dragging “blocks” from the left-hand side pane to the middle/lower pane.

  1. This pane contains the different categories of blocks that the user may deploy to create a pattern.
  2. In this window, the user may place the target pattern that he or she wishes to create.
  3. This is the results window, where the output/matched pattern is shown.
  4. This is the construction window for blocks. The pattern is created in this window.
  5. The General options are as follows:
    1. Greedy – When set to true, a block tries to match as many characters as its definition allows. When set to false, the block tries to match as few characters as possible.
    2. Quantifier – Defines how many characters the block may match. In the example above, the leftmost block (Letter) has been set with a quantifier of “One or More.” This means the block must match at least one letter and may match more, with no upper limit.
  6. Named Groups include the following options:
    1. Add Name (True/False) – Setting this to true will allow the user to add a named group
    2. Group name – This is the name of the block, and it is also used as the column header on the export. There are a few caveats: the name must not start with a number, it must not contain spaces, and no two blocks may share the same name.
  7. Range Options covers three cases related to the Quantifier option:
    1. From N to M
    2. At Least N
    3. Exactly N
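
The Greedy, Quantifier, and Range options above correspond to standard regular expression quantifiers. The following is a minimal sketch using Python's `re` module as a stand-in; it is not DataMatch Enterprise's own engine, and the mapping of the Letter block to `[A-Za-z]` and of a digit block to `\d` is an assumption made for illustration.

```python
import re

text = "abc123"

# Letter block, quantifier "One or More", Greedy = true:
# the block takes as many letters as it can.
greedy = re.match(r"[A-Za-z]+", text).group()

# Same block with Greedy = false:
# the block stops at the fewest letters that still match.
lazy = re.match(r"[A-Za-z]+?", text).group()

# Range Options written as quantifiers on a digit block (N = 2, M = 3).
digits = "12345"
from_2_to_3 = re.search(r"\d{2,3}", digits).group()  # From N to M
at_least_2 = re.search(r"\d{2,}", digits).group()    # At Least N
exactly_2 = re.search(r"\d{2}", digits).group()      # Exactly N

print(greedy, lazy, from_2_to_3, at_least_2, exactly_2)
# prints: abc a 123 12345 12
```

Note that a non-greedy block only matches the minimum when nothing after it forces it to take more; in a longer pattern, the blocks that follow decide where it stops.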

The example pattern explained, block by block:

      • Letter (One or More) matches the name Washington (Named Group: LastName)
      • Not Alphanumeric matches the comma (“,”)
      • Whitespace matches the space
      • Letter (One or More) matches the name George (Named Group: FirstName)
      • Whitespace matches the space
      • Not Alphanumeric matches the ampersand (“&”)
      • Whitespace matches the space
      • Letter (One or More) matches the name Martha (Named Group: FirstName2)

Output Result

The output contains three new columns, LastName, FirstName, and FirstName2, with the values Washington, George, and Martha, respectively.
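
The block sequence from the walkthrough can be written out as a single regular expression with named groups. This is a rough sketch in Python's `re` module, not the expression the Pattern Builder itself generates. The input value `"Washington, George & Martha"` is inferred from the walkthrough, and the character classes chosen for each block are assumptions.

```python
import re

# One fragment per block, in the same order as the walkthrough.
pattern = re.compile(
    r"(?P<LastName>[A-Za-z]+)"    # Letter (One or More), named LastName
    r"[^A-Za-z0-9]"               # Not Alphanumeric: the comma
    r"\s"                         # Whitespace
    r"(?P<FirstName>[A-Za-z]+)"   # Letter (One or More), named FirstName
    r"\s"                         # Whitespace
    r"[^A-Za-z0-9]"               # Not Alphanumeric: the ampersand
    r"\s"                         # Whitespace
    r"(?P<FirstName2>[A-Za-z]+)"  # Letter (One or More), named FirstName2
)

row = "Washington, George & Martha"  # assumed input value
match = pattern.search(row)

# Each named group becomes one output column.
columns = match.groupdict()
print(columns)
# prints: {'LastName': 'Washington', 'FirstName': 'George', 'FirstName2': 'Martha'}
```

Python applies naming rules similar to the caveats listed for Group name: `re.compile` raises an error if a group name starts with a digit, contains a space, or is used twice in the same pattern.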

Want to see DME Pattern Builder in action? Check out this video.
