
A guide to master data management: What, why, who, and how

The need for master data management

A recent Deloitte Digital report shows that an average business uses 16 applications to leverage customer data and about 25 different data sources to generate customer insights. As the number of data tools increases, businesses struggle to enable central and efficient data management architectures across their organization.

Having delivered data solutions to Fortune 500 clients for over a decade, we have encountered various data issues that businesses struggle to resolve. For most companies, the biggest and the most common data challenge is the same:

Building a unified view of core data assets.

Simply put, businesses need to find a way to consolidate the disparate records of their main data entities, such as customers, products, and locations. Moreover, they want to set up data systems that can sustain this consolidation accurately and efficiently over time. This process is often termed master data management.

In this guide, you will find answers to all your questions about master data management: what it is, why you need it, how you can manage it, and how it compares to other disciplines. Let’s begin.

What is master data?

The processes or transactions happening in a business always involve a certain set of entities or concepts. Depending on a business’s line of operation, these entities may differ, but generally, they include the following data assets:

  • Customer
  • Product
  • Employee
  • Location
  • Other
    • Vendor
    • Supplier
    • Contact
    • Accounting item / Invoice
    • Policy

These items are usually termed master data. All tasks, processes, or transactions performed in a business involve one or more of these master data objects.

Example of master data objects

As a master data management example, consider this transaction:

Customer A buys product X from location Y.

For this transaction to be processed accurately, a company must have its customer, product, and location information in place, even though this data is probably stored in three different applications or databases.

What is master data management?

The term master data management (MDM) is best described as:

A collection of modern data management practices that:

  1. Support data capture, integration, and sharing between disparate data sources,
  2. Ensure data quality (such as accuracy, validity, and completeness), and,
  3. Implement data governance rules to allow authorized access, information management, and other administration workflows.

This definition highlights the core master data management features:

  1. Data integration (involves data capture, transfer, consolidation, etc.),
  2. Data quality management (ensures data quality metrics, such as data accuracy, completeness, validity, uniqueness, consistency, and timeliness), and,
  3. Data governance (defines data access control, sharing, provenance, etc.).

How does master data management work?

An MDM system enables systematic, centralized data management across an organization using various master data management techniques. It acts as a central, intelligent hub connected to the multiple data sources and applications used by the company. To understand how an enterprise master data management system works (whether strategically or technologically), you need to consider the system design and responsibilities of such an architecture. Let’s take a look at these MDM concepts:

1. Master data management design

With roughly 2.5 quintillion bytes of data being created every day, we definitely need a systematic way of capturing, storing, sharing, and synchronizing data. One of the most common challenges associated with data is maintaining one definition of the same ‘thing’ across all nodes or data sources.

For example, if a company uses a CRM and a separate billing application, a customer’s record will end up in the databases of both applications. The task of maintaining a consistent – or simply, the same – view of customer information across all databases over time is difficult.

a. Avoid complex data topologies

Such consistency requirements can lead us to create direct connections between siloed applications so that every update is synchronized throughout the system. This architecture gives birth to a complex topology: with n applications there can be up to n(n-1)/2 point-to-point connections, so the number of interactions happening between nodes every day grows far faster than the number of applications.

[Figure: point-to-point connections between siloed applications]

b. Enable centralized master data management

This clearly highlights the need for a central, intelligent hub (or an MDM) that models and preserves data objects and serves data retrieval and update requests through a single connection per application, so the number of connections grows linearly and data management becomes easier.

[Figure: centralized data management]
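
As a rough illustration of why a hub eases data management, the following Python sketch counts the links each topology needs. It assumes that, without a hub, every application syncs directly with every other one; the function names are hypothetical.

```python
def point_to_point_links(n_apps: int) -> int:
    """Links needed when every application syncs directly with every other."""
    return n_apps * (n_apps - 1) // 2


def hub_links(n_apps: int) -> int:
    """Links needed when every application talks only to a central hub."""
    return n_apps


# Compare both topologies for a few system sizes.
for n in (4, 16, 25):
    print(n, point_to_point_links(n), hub_links(n))
```

With the 16 applications mentioned at the start of this guide, that is 120 direct links versus 16 hub links.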

2. Responsibilities of a master data management system

The responsibilities of such a system include:

  • Modeling data objects – especially for the main data assets.
  • Maintaining data hierarchies or relationships between data objects.
  • Connecting to all data sources or applications.
  • Tracking changes made to any connected database.
  • Processing changes made to connected databases and intelligently synchronizing updates. This may include validating, formatting, standardizing, or matching the new data before it is replicated in other nodes.
  • Enabling governance rules for automated alerts, moderation workflows, data sharing capabilities, etc.

What are the types of master data management?

Depending on the purpose an MDM serves for an organization, an MDM solution can be implemented in different architectural or hub styles. The most common master data management types are mentioned below:

  1. Registry style: With this style, data is not copied or moved to a central hub; rather, the MDM maintains an index (or a registry) that points to the master records stored across distributed systems.
  2. Consolidated style: With this style, data records are consolidated in MDM but are not synchronized or fed back to source applications; rather, they are sent to downstream apps that use data for reporting or other BI purposes.
  3. Co-existence/Hybrid style: With this style, the master or consolidated data records are kept in the MDM, but they are also fed back to the source applications.
  4. Centralized style: With this style, the master or consolidated data records are kept centrally in the MDM only, and can be accessed by source applications as needed.

In addition to master data management architecture, MDMs are also categorized based on what they are used for in an organization. For example, some MDMs are operational since they are used in routine data operations and are heavily focused on providing a consolidated view of core data assets to everyone who handles data in an organization. Other MDMs are analytical as they are used for analytics or business intelligence purposes in an organization.
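
To make the registry style more tangible, here is a minimal Python sketch in which the hub stores only pointers to records that stay in their source systems. The system names, identifiers, and the resolve function are assumptions made up for illustration, not part of any particular MDM product.

```python
# Hypothetical source systems, each holding its own copy of a customer.
crm = {"C-17": {"name": "Ana Diaz", "email": "ana@example.com"}}
billing = {"B-902": {"name": "Ana Diaz", "balance": 120.0}}
sources = {"crm": crm, "billing": billing}

# Registry style: the hub stores only pointers, not the data itself.
registry = {"CUST-1": [("crm", "C-17"), ("billing", "B-902")]}


def resolve(master_id: str) -> dict:
    """Assemble a view on demand by following the registry's pointers."""
    view = {}
    for system, local_id in registry[master_id]:
        # Later systems overwrite earlier ones when attributes collide.
        view.update(sources[system][local_id])
    return view


customer_view = resolve("CUST-1")
```

In a consolidated or centralized style, the assembled record would instead be stored in the hub itself rather than rebuilt on every request.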

Why is master data management important?

Before we move on to more conceptual and implementation details of master data management, it is first necessary to understand its importance. Since this initiative requires company-wide buy-in, and in some cases, a huge investment – in terms of time, cost, and other resources – you need to onboard important stakeholders to this initiative. Let’s discuss some of the core master data management benefits.

1. Comprehensive view of main data assets

By far, the greatest benefit of master data management is the ability to access a comprehensive and complete view of any data asset at any given time. This can be a complete view of your customer profiles, product lists, employee information, location details, or any other data asset critical to your business.

2. Efficient business operations planning

When information is scattered across multiple sources, it becomes almost impossible to predict and forecast future business needs, especially if data assets such as vendors and suppliers are crucial for your business operation. If important data assets become centralized and aggregated, you can plan business operations effectively and efficiently with the help of a single data store.

3. Increased business agility

Data has always played a key role in finding new growth and expansion opportunities for a business. But if master data assets are not managed properly, it can be nearly impossible to uncover hidden market opportunities. Conversely, if you manage your master data centrally, it gets easier for your team to improve competitiveness and business agility through quick and timely data analysis.

4. Improved operational efficiency and business productivity

Often, when the same data resides at separate locations, team members are required to fetch and gather data from all sources before they can start working on their tasks. At other times, different members end up working on the same task, not knowing that it is already being handled by someone else on the team. Both these problems reduce operational efficiency and business productivity, and master data management can help resolve such issues.

5. Effective decision-making and reporting

When a company’s business intelligence tool outputs inaccurate or biased results, it is usually an issue with data replication or decentralization. To attain clearer and faster results for accurate and timely decision-making, you need to feed high-quality, centralized datasets into your business intelligence system.

6. Timely data compliance and governance

Data quality, governance, and compliance are tightly integrated with each other. You cannot comply with regulatory or organizational data standards (such as GDPR, HIPAA, or CCPA) if your business does not possess well-governed, high-quality data, which is something master data management makes possible.

How to implement a master data management system?

The master data management process can be quite complex and requires involvement of all key stakeholders. Simply put, it consists of the following seven master data management steps:

[Figure: the process of implementing MDM]

1. Planning master data management

While implementing an enterprise-wide initiative like MDM, you need participation from important stakeholders – especially the ones who are hands-on with data at your company.

Before you can deploy MDM practices and tools, you need to collect master data management requirements and build a plan, which involves:

  • Identifying the people who are generators and recipients of master data at your company.
  • Coordinating with stakeholders to understand the current state of data.
  • Constructing a case that justifies the impact of the MDM initiative in support of business objectives.
  • Preparing comprehensive plans for:
    • Master data object models,
    • MDM architectural style,
    • Data integration or migration plan to/from involved databases.
  • Getting proposed plans approved by stakeholders involved.

2. Coordinating with data stakeholders

There are numerous people across an enterprise that are considered to be important stakeholders, and must be involved at this stage. Such people include:

  • Business development executives
  • Senior managers
  • Information architects
  • Data stewards
  • Metadata analysts
  • Data quality practitioners
  • Data governance specialists
  • System developers and architects
  • Application implementation and adaption consultants
  • Data entry operations staff

3. Modeling master data objects

The main step in MDM, after planning and stakeholder involvement, is to build the MDM data model. This step is about knowing:

  • What data assets are core to your business operations?
  • Which information do you really need to preserve about these core data assets?
  • How do these core data assets relate to each other?

Therefore, a data model is simply a graphical or logical representation of all master data objects, their important attributes, and the relationship between them. Preparing such models will support the subsequent steps of data integration, quality, synchronization, and governance.

Example master data object model

[Figure: master data object model]

Let’s go over the main steps involved in data modeling:

a. Identifying master data objects

As mentioned earlier, one of the most significant steps for MDM is identifying master data objects – the data entities that your business operations and transactions usually involve. These normally include (but are not limited to): customers, products, locations, employees, etc.

b. Identifying attributes for master data objects

Once master data objects are identified, you need to select the important attributes for these objects. While making selections, remember to include a uniquely identifying attribute for each data asset. For example, this can be an SKU for products, a unique ID for customers, and so on.

In the absence of uniquely identifying attributes, you may have to include a combination of attributes that can act as a unique identity when put together.
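
One way to combine attributes into a single identity is sketched below in Python. The attribute names, the normalization (trimming and lowercasing), and the 16-character hash length are all assumptions chosen for illustration; which attributes are jointly unique depends on your data.

```python
import hashlib


def composite_key(record: dict, attributes: tuple) -> str:
    """Build a surrogate identifier from several normalized attributes."""
    parts = [str(record.get(attr, "")).strip().lower() for attr in attributes]
    joined = "|".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


# Hypothetical attribute combination for customers without a unique ID.
KEY_ATTRS = ("name", "birth_date", "postcode")

a = {"name": "Ana Diaz ", "birth_date": "1990-04-12", "postcode": "90210"}
b = {"name": "ana diaz", "birth_date": "1990-04-12", "postcode": "90210"}
c = {"name": "Ana Diaz", "birth_date": "1985-01-30", "postcode": "90210"}

# a and b differ only in case and whitespace, so they share a key; c does not.
same_person = composite_key(a, KEY_ATTRS) == composite_key(b, KEY_ATTRS)
```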

c. Identifying relationships between master data objects

Now it’s time to define the hierarchy and relationships between master data objects. Normally, the following types of relationships can be created between data objects, depending on how business transactions are allowed to happen at a company:

  • One to one
    • Example: One Customer has one Location at a time, and that Location is linked to only one Customer
  • One to many
    • Example: One Customer can make many Purchases
  • Many to one
    • Example: Many Customers can be from one Location
  • Many to many
    • Example: Many Customers can buy many Products.
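
The relationship types above can be sketched with plain Python dataclasses. The class and field names are hypothetical, and the sketch assumes the many-to-one variant for customers and locations; it mirrors the earlier transaction in which customer A buys product X from location Y.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    location_id: str


@dataclass
class Product:
    sku: str


@dataclass
class Customer:
    customer_id: str
    location: Location                             # many customers -> one location
    purchases: list = field(default_factory=list)  # one customer -> many purchases


@dataclass
class Purchase:
    customer: Customer
    products: list  # via purchases, many customers link to many products


def buy(customer: Customer, products: list) -> Purchase:
    """Record a purchase and attach it to the customer."""
    purchase = Purchase(customer=customer, products=products)
    customer.purchases.append(purchase)
    return purchase


location_y = Location("LOC-Y")
customer_a = Customer("C-A", location_y)
purchase = buy(customer_a, [Product("X")])
```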

d. Building the model in MDM

Once these tasks are performed, it is time to build the finalized model into the MDM. This ensures that any new data loaded or added into the MDM’s master data repository conforms to the designed data model. This means:

  • Upcoming data records must belong to any of the master data objects modelled.
  • Upcoming values must be valid, standardized, and formatted as defined for each attribute.
  • Upcoming values must conform to the relationships imposed in the designed model.

If these conditions are not met, MDM will throw an error and will not allow data to be stored until it is rectified according to the modelled design.
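
A minimal sketch of this kind of check is shown below in Python. The model, the regular expressions, and the ModelViolation error are assumptions for illustration; the sketch covers only the object-type and attribute-format conditions and leaves relationship checks out.

```python
import re

# Hypothetical model: attribute name -> validation pattern, per master object.
MODEL = {
    "customer": {"customer_id": r"C-\d+", "email": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
    "product": {"sku": r"[A-Z]{3}-\d{4}"},
}


class ModelViolation(ValueError):
    """Raised when a record does not conform to the data model."""


def validate(object_type: str, record: dict) -> dict:
    """Return the record unchanged if it conforms; raise otherwise."""
    if object_type not in MODEL:
        raise ModelViolation(f"unknown master data object: {object_type}")
    for attr, pattern in MODEL[object_type].items():
        value = record.get(attr)
        if value is None or not re.fullmatch(pattern, str(value)):
            raise ModelViolation(
                f"{object_type}.{attr} is missing or invalid: {value!r}"
            )
    return record


accepted = validate("product", {"sku": "ABC-1234"})
```

Calling validate("product", {"sku": "abc"}) raises ModelViolation, so the record is rejected until it is corrected.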

4. Integrating data into master repository

This step resembles the ETL (extract, transform, load) process used for managing data warehouses. In the context of MDM, it involves the following steps:

a. Connecting

This involves connecting the MDM software tool to all sources containing master data (as planned during the initial phase). This may involve connecting to a CRM (for customer information), finance software (for invoices), PCM (for products), HRM (for employees), and so on.

b. Extracting

This involves extracting past records from the connected sources into the MDM, but not loading them into the master data repository just yet; that step comes after consolidation.

Extraction is performed so that the past records can be cleaned and merged before they are loaded into the master data repository. You can also choose to filter the extraction process by specific time periods or any other attribute. For example, you may want to extract data records dating back ten years, or maybe just extract the records that were created by a valid source.
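
A filtered extraction of this kind could look like the following Python sketch. The field names ("source", "created"), the list of trusted sources, and the cut-off date are assumptions for illustration.

```python
from datetime import date

# Hypothetical extracted rows; "source" and "created" are assumed field names.
rows = [
    {"id": 1, "source": "crm", "created": date(2010, 5, 1)},
    {"id": 2, "source": "crm", "created": date(2021, 3, 9)},
    {"id": 3, "source": "legacy_import", "created": date(2022, 1, 15)},
]

VALID_SOURCES = {"crm", "billing"}


def extract(records, since: date, valid_sources=VALID_SOURCES):
    """Keep records created on or after `since` by a trusted source."""
    return [
        r for r in records
        if r["created"] >= since and r["source"] in valid_sources
    ]


# Staged records are held for consolidation, not yet loaded as master data.
staged = extract(rows, since=date(2015, 1, 1))
```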

c. Consolidating

Once you extract the required data records across all connected sources, it is now time to consolidate them (clean, standardize, match, and merge). Make sure that the consolidated records:

  • Represent a single, unified view of master data
  • Conform to the MDM data model designed during the third phase, otherwise you won’t be able to load them into master data repository.

Since most siloed data applications have numerous data quality issues, it is recommended to follow a suitable data quality framework for consolidation of records – we will talk more about this in the next section.
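
The clean, standardize, match, and merge sequence can be sketched in a few lines of Python. This is a deliberately simple version: it assumes exact matching on a standardized email address and a "first non-empty value wins" merge, whereas real consolidation usually relies on fuzzy matching and richer rules. Field names are hypothetical.

```python
from collections import defaultdict


def standardize(record: dict) -> dict:
    """Trim whitespace and normalize case; real rules are usually richer."""
    return {
        "name": " ".join(record.get("name", "").split()).title(),
        "email": record.get("email", "").strip().lower(),
        "phone": "".join(ch for ch in record.get("phone", "") if ch.isdigit()),
    }


def consolidate(records: list) -> list:
    """Group records on exact email match and merge each group into one."""
    groups = defaultdict(list)
    for rec in map(standardize, records):
        # Note: records with an empty email would all land in one group.
        groups[rec["email"]].append(rec)
    merged = []
    for group in groups.values():
        master = {}
        for rec in group:
            for key, value in rec.items():
                # Keep the first non-empty value seen for each attribute.
                if value and not master.get(key):
                    master[key] = value
        merged.append(master)
    return merged


raw = [
    {"name": "ana  diaz", "email": "Ana@Example.com ", "phone": ""},
    {"name": "Ana Diaz", "email": "ana@example.com", "phone": "(555) 010-2000"},
]
unified = consolidate(raw)
```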

d. Loading

When records are extracted and consolidated, they are now ready to be loaded into the master repository. In case the data records do not conform to the designed data model, MDM might throw errors during the loading process.

5. Embedding data quality controls

During the integration (consolidation) process, a number of data quality processes are implemented to standardize data records according to the designed model. Moving forward, whenever a connected database is updated, this new change must be migrated to the MDM data repository.

But before this change can be migrated, the updated data must go through a systematic process to ensure it is fit for use. This is why a continuous data quality process or framework is always made part of the MDM architecture.

This framework usually includes the following steps:

  • Data profiling: Assessing the current state of your data and identifying cleaning opportunities.
  • Data cleansing and standardization: Performing a variety of data cleansing operations to attain a standardized view across all imported data sources.
  • Data match configuration: Configuring and executing proprietary or industry-leading data matching algorithms, and fine-tuning them according to your data requirements to get optimal results.
  • Data match result analysis: Assessing the match results and their match confidence levels to flag false matches and determine the master record. This may require involvement of data stewards or admins to make the final decision.
  • Data merge and survivorship: Designing merge and survivorship rules to overwrite all poor-quality data fields automatically and retrieve the golden record.
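
To illustrate the matching and survivorship steps, here is a small Python sketch. It uses the standard library's difflib.SequenceMatcher as a stand-in for a matching algorithm and assumes a "newest non-empty value wins" survivorship rule; the field names, the review threshold, and the sample records are hypothetical.

```python
from difflib import SequenceMatcher

REVIEW_THRESHOLD = 0.85  # assumed cut-off below which a steward reviews the pair


def match_confidence(a: dict, b: dict) -> float:
    """Average string similarity over name and email, between 0 and 1."""
    fields = ("name", "email")
    scores = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio() for f in fields
    ]
    return sum(scores) / len(scores)


def survive(records: list) -> dict:
    """Survivorship rule: for each field, the newest non-empty value wins."""
    golden = {}
    # ISO-formatted dates sort correctly as strings, oldest first.
    for rec in sorted(records, key=lambda r: r["updated"]):
        for key, value in rec.items():
            if value not in ("", None):
                golden[key] = value
    return golden


crm_rec = {"name": "Jon Smith", "email": "jon.smith@example.com",
           "phone": "5550102000", "updated": "2023-01-10"}
billing_rec = {"name": "John Smith", "email": "jon.smith@example.com",
               "phone": "", "updated": "2024-06-02"}

confidence = match_confidence(crm_rec, billing_rec)
needs_review = confidence < REVIEW_THRESHOLD
golden = survive([crm_rec, billing_rec])
```

Here the newer billing record supplies the name, while the phone number survives from the older CRM record because the newer value is empty.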

6. Enabling linear data synchronization

Data synchronization requirements solely depend on the chosen architectural style of MDM. MDM hub styles such as coexistence usually require complex synchronization techniques to ensure data is kept up-to-date in MDM as well as all connected source applications.

To understand synchronization to its full extent, we will mostly focus on the coexistence hub style in this section.

An essential part of MDM is its ability to act as an active and intelligent hub that:

  • Serves incoming data requests from connected sources.
  • Provides access to master data repository.
  • Monitors changes being made to any record at a connected source.
  • Merges new changes into the master data records, while ensuring data quality.
  • Feeds the updated master data records back to source or other applications.

To ensure smooth data synchronization, an MDM solution must be equipped with the right logic and processing rules, such as:

  • Timeliness: This refers to propagating changes and making updates in a timely manner so that the MDM can be considered as an always-on / always-ready system.
  • Latency: This refers to minimizing the time duration between requesting information at a connected source and when it is finally made available.
  • Consistency: This refers to replicating any/all changes across connected sources. This may depend on your MDM architecture style (whether you keep all connected sources updated or just the MDM).
  • Coherence: This refers to implementing transactions in order of occurrence, such as read/write requests to and from different connected sources.
  • Determinism: This refers to ensuring the same query gives the same results, if executed more than once.
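
A simplified view of coexistence-style synchronization is sketched below in Python. It assumes each change event carries a sequence number that reflects its order of occurrence; the event fields, record IDs, and source names are hypothetical, and data quality checks are omitted for brevity.

```python
# Hypothetical change events captured from connected sources (out of order).
events = [
    {"seq": 2, "source": "billing", "id": "CUST-1",
     "field": "email", "value": "ana@new.example"},
    {"seq": 1, "source": "crm", "id": "CUST-1",
     "field": "email", "value": "ana@old.example"},
]

master = {"CUST-1": {"email": ""}}
replicas = {
    "crm": {"CUST-1": {"email": ""}},
    "billing": {"CUST-1": {"email": ""}},
}


def synchronize(events, master, replicas):
    """Apply events in order of occurrence, then feed the result back."""
    # Coherence: process changes in the order they happened.
    for event in sorted(events, key=lambda e: e["seq"]):
        master[event["id"]][event["field"]] = event["value"]
    # Consistency: replicate the master records to every connected source.
    for source_records in replicas.values():
        for record_id, record in master.items():
            source_records[record_id] = dict(record)
    return master


synchronize(events, master, replicas)
```

Because events are sorted before they are applied, running the same batch again produces the same master record, which is the determinism property described above.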

7. Establishing data governance rules

A final, but just as important, part of MDM is data governance. The term data governance usually refers to a collection of roles, policies, workflows, standards, and metrics that ensure efficient information usage and security, and enable a company to reach its business objectives.

Data governance in MDM is usually seen as the ability to:

  • Create data roles and assign permissions.
  • Design workflows for verifying information updates.
  • Limit data usage and sharing.
  • Collaborate and coordinate in merging multiple data assets.
  • Protect data and conform to compliance standards, such as HIPAA, GDPR, etc.
  • Ensure data is safe from security risks.
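
The first two abilities, roles with permissions and a verification workflow for updates, can be sketched as follows in Python. The role names, permission names, and review queue are assumptions for illustration and do not reflect any specific product.

```python
# Hypothetical roles and the actions each one is permitted to perform.
ROLE_PERMISSIONS = {
    "data_steward": {"read", "update", "approve_merge"},
    "analyst": {"read"},
}


def is_allowed(role: str, action: str) -> bool:
    """Check whether a role has been granted an action."""
    return action in ROLE_PERMISSIONS.get(role, set())


def request_update(role: str, record: dict, changes: dict, review_queue: list) -> dict:
    """Apply a change directly, or queue it for review, based on the role."""
    if is_allowed(role, "update"):
        record.update(changes)
    elif is_allowed(role, "read"):
        # Moderation workflow: a steward verifies the change later.
        review_queue.append(changes)
    else:
        raise PermissionError(f"role {role!r} may not access this record")
    return record


queue = []
customer = {"email": "ana@old.example"}
request_update("analyst", customer, {"email": "ana@new.example"}, queue)
```

After this call the record is unchanged and the proposed change waits in the queue; the same request from a data steward would be applied immediately.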

The need for master data management strategy

After implementing the MDM process, many organizations still struggle to meet their data KPIs. A lack of MDM strategy is often the culprit behind such problems.

Implementing a full-fledged MDM can be quite complex as it requires a lot of planning and coordinating between teams and stakeholders. To successfully achieve your MDM goals, your plan must be backed up with a strong strategy, or it will get more difficult to sustain over time.

Here, we discuss what an MDM strategy is, why it is important, and which key areas you need to strategize before starting the MDM process.

What is master data management strategy?

A master data management strategy can be defined as:

A collection of best practices that must be integrated into the MDM process to help achieve the desired state of data and sustain it over time to meet long-term data goals.

Where the process focuses on implementing MDM functions (such as data modeling and data governance rules), an MDM strategy is more business-focused and identifies the effort required to bridge the gap between the current state of data and the state it must reach in the near future.

Almost every enterprise adopts an MDM solution hoping to make their enterprise data accurate, consistent, and complete. But you must measure how well the data outcomes meet the defined KPIs, and which strategic practices can help you get there sooner.

Why is MDM strategy important?

Designing an MDM strategy is just as important as implementing the process. Otherwise, you might feel like your business is going somewhere with its MDM efforts, but not necessarily know where. An MDM strategy will help you to holistically understand how disparate MDM components are working together to achieve the desired outcome. This, in turn, sets a long-term direction in your mind, enabling you to decide how you can attain your future data goals.

Thus, an MDM strategy allows you to not only implement, but also consistently monitor and pivot your MDM functions whenever the expected results do not meet the mark.

How to plan your MDM strategy?

All of this is easier said than done. You might still be wondering what exactly an MDM strategy looks like and what some master data management strategy examples are. Here, we will look at the most common disciplines that every MDM strategy must incorporate to ensure maximum ROI from MDM efforts.

[Figure: master data management strategy]

1. Focus on long-term data quality issues

Enterprise data is prone to have different types of data quality issues, such as invalid fields, inaccurate information, duplicate records, and inconsistent views. It is important that your data cleansing and standardization strategy does not only focus on current issues, but has a strategic forward-looking approach, where data issues that can possibly occur in the near future are also taken care of.

Many leaders just want to be done with MDM projects that have stretched out for too long. They end up fixing issues in an ad hoc manner, without really understanding the core of these issues, what brought them into the system, and what other issues may result in the long term. Master data management best practices do not only deal with what is at hand; they also plan for rectifying what may come up in the future.

2. Don’t underestimate leadership buy-in

An MDM plan is incomplete without stakeholder and leadership involvement. One of the reasons why MDM projects are considered to be long-running is that the executive or managerial board is not really convinced about the project value. You may face delays or clarification requests while trying to get approvals on certain matters.

A lack of leadership involvement may also result in backlash from various business units, making MDM execution and maintenance impossible over time.

3. Treat MDM as more than just technology

Usually, MDM is considered to be a technology or a software tool. But it should really be treated as a technical concept that is controlled and strategized by business professionals, with the help of software tools.

The software tool must support MDM operations, such as data modeling, integration, profiling, data quality management, data governance, etc. But it is the responsibility of business professionals at a company to architect the right data solution – not just technically, but strategically – that facilitates the goals and objectives of the business.

For that reason, if you wish to add an MDM to your company’s data infrastructure, you must treat it as a discipline and not just a technology. Meaning, in addition to a full-fledged MDM installation, you must also reevaluate and restructure existing processes that handle and control data at your company. Such an initiative can require quite a lot of planning, coordination, and back and forth between multiple teams. But once you get it right, your business can reap the benefits for years.

4. Make people responsible and accountable

Since the MDM process requires involvement of many people appointed at various roles in an organization, it can get quite overwhelming. When multiple roles are involved in achieving a common outcome, it is always crucial to identify the level of contribution that each role has. Many managers prefer to build data management teams that take care of MDM execution.

This is where a RACI model can be very useful. A RACI model or matrix identifies whether a role is Responsible, Accountable, Consulted, or Informed about the tasks necessary for successful completion of a goal. When it comes to managing data quality, you need to identify the roles that are:

  • Responsible for completing the task.
  • Accountable for delivering the outcomes of the task.
  • Consulted to gain opinions on task completion.
  • Informed about the task’s progress.
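
A RACI matrix can also be kept as a simple data structure and checked automatically. The Python sketch below uses made-up tasks and roles, and it assumes the commonly used convention that each task should have exactly one Accountable role.

```python
# Hypothetical tasks and role assignments, for illustration only.
RACI = {
    "define match rules": {
        "R": ["data steward"],
        "A": ["data governance specialist"],
        "C": ["system architect"],
        "I": ["senior manager"],
    },
    "approve merges": {
        "R": ["data steward"],
        "A": [],
        "C": [],
        "I": ["data entry operations staff"],
    },
}


def tasks_without_single_owner(matrix: dict) -> list:
    """Return tasks that do not have exactly one Accountable role."""
    return [task for task, roles in matrix.items() if len(roles["A"]) != 1]


gaps = tasks_without_single_owner(RACI)
```

In this example, "approve merges" is flagged because nobody is accountable for it yet.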

5. Think about scalability

Try not to design an MDM model that only works with the current set of data sources, types and formats, and master data assets. Scalability should be a key concern while implementing MDM solutions because you want something that not only works today, but will also work reliably in the upcoming years. Companies that think about scalability, in terms of MDM design and architecture, have a greater chance of achieving their goals consistently over time.

6. Keep data governance at the heart of MDM

We discussed establishing data governance rules in the MDM process, but it is important to mention here that data governance is the glue that holds MDM together. Data governance defines how various data assets should be controlled and authorized. Your organization should have data governance policies and standards in place regardless of whether it has an MDM yet. Its significance is clear from the fact that all MDM components require data governance for optimal execution.

7. Define and measure metrics for effectiveness

Another important aspect of MDM strategy is measuring the effectiveness of the process. This helps you to understand how well the designed process and its components are performing. One way to measure the process effectiveness is to screen the cleaned and consolidated data for errors. Depending on what data quality means for your business, you can choose to measure the data characteristics that indicate acceptable data quality levels.

A list of common data dimensions is given below:

  • Accuracy: Are data values in MDM accurate?
  • Lineage: Were the data values updated by authorized sources?
  • Semantic: Are data values true to their meaning?
  • Structure: Do data values exist in the correct pattern and/or format?
  • Completeness: Is there any crucial data attribute missing?
  • Consistency: Does the MDM consistently produce the same results for the same query?
  • Currency: Does the MDM produce data that is acceptably up to date?
  • Timeliness: How quickly does the MDM serve the requested data?
  • Reasonableness: Do data values have the correct data type and size?
  • Identifiability: Does every record represent a unique identity rather than a duplicate?
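
Some of these dimensions are easy to turn into numbers. The Python sketch below computes two of them, completeness and identifiability, over a small sample; the formulas are simple ratios chosen for illustration, and the field names are hypothetical.

```python
def completeness(records: list, required: tuple) -> float:
    """Share of required attribute slots that hold a non-empty value."""
    total = len(records) * len(required)
    filled = sum(
        1 for r in records for f in required if r.get(f) not in ("", None)
    )
    return filled / total if total else 1.0


def identifiability(records: list, key: str) -> float:
    """Ratio of distinct key values to the number of records."""
    if not records:
        return 1.0
    return len({r.get(key) for r in records}) / len(records)


sample = [
    {"id": "C-1", "email": "ana@example.com", "phone": "5550102000"},
    {"id": "C-2", "email": "", "phone": "5550103000"},
    {"id": "C-2", "email": "li@example.com", "phone": None},
    {"id": "C-3", "email": "sam@example.com", "phone": "5550104000"},
]

completeness_score = completeness(sample, ("email", "phone"))
identifiability_score = identifiability(sample, "id")
```

Tracking such scores over time shows whether the MDM process is actually moving the data toward the levels your business considers acceptable.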

What is the difference between data quality and master data management?

The terms data quality and master data management are closely related to each other. In fact, data quality is considered to be the main driver as well as the by-product of MDM solutions. When these solutions are packaged and sold as tools, it becomes imperative to know the overlapping areas and capabilities, so that you can choose the right solution for your business.

It is important to understand that data quality and master data management are not opposites of each other; rather, they are complements. MDM solutions contain some extra capabilities in addition to data quality management features.

This definitely makes MDM a more complex and resource-intensive solution to implement – something to consider while choosing between the two approaches.

We have an in-depth guide that discusses their differences in a lot of detail: Data quality versus master data management: Which one do you need?

Closing remarks

Here, we come to an end. We started off by looking at the rising need for systematic and centralized data management solutions, then went on to conceptually and technologically study MDM solutions, and finally ended with a comparison between MDM and DQM approaches. This journey in itself has given you enough information to make the right decision for your business.

It is clear that implementing an MDM is quite a complex process, as it requires in-depth pre-planning and analysis, as well as involvement of key stakeholders. Do keep in mind the budget and time your business can afford and is willing to invest in this process – as compared to return on investment. For example, you may not need a complete and independent system for master data management, and a simple stand-alone data quality tool can fulfill your data consolidation needs.

But if you want, our solution experts can definitely help you answer any question that you may still have. Don’t hesitate to download our free trial or book a demo today for a one-on-one call with our experts.
