Chapter 17

Understanding Master Data Management (MDM) Concepts

This chapter covers Objective 5.3 (Explain master data management [MDM] concepts) of the CompTIA Data+ exam and includes the following topics:

  • Processes

  • Circumstances for MDM

For more information on the official CompTIA Data+ exam topics, see the Introduction.

This chapter focuses on master data management (MDM) concepts. Think of master data as the single source of truth across an organization. Everything depends on how it is managed—from data quality to time-bound decision making to everything else that depends on data. This chapter covers specifics related to MDM from a process perspective and explores where it makes sense for an organization to invest time and effort into MDM.

Note

Use of the terms master and slave is ONLY in association with the official terminology used in industry specifications and standards and in no way diminishes Pearson’s commitment to promoting diversity, equity, and inclusion and challenging, countering, and/or combating bias and stereotyping in the global population of the learners we serve.

Processes

Master data is a dependable set of identifiers that are core to the business operations of an organization. These identifiers describe business data such as:

  • Customer details

  • Addresses

  • Email addresses

  • Phone numbers

  • Site or location details

Say that an organization is trying to interact with its customers but has incorrect information about who’s in charge of purchasing; imagine the sort of impression that would leave with a customer. If the organization keeps its master data updated, all other dependent information and records are updated as well, and the organization comes across as a well-informed and customer-focused entity. This is an example of the real power and benefit of keeping master data updated; the way to do this is through master data management (MDM).

ExamAlert

MDM is an important topic, and you should expect a number of questions related directly or indirectly to it on the CompTIA Data+ exam.

There are many ways to look at what MDM does and how it does it. Basically, MDM enables management, organization, categorization, synchronization, and localization of all organizational data. It enables informed, effective, and efficient decision making for various business units, functional areas, and business processes.

Of course, MDM cannot be built in a bubble. A set of processes, technologies, and supporting functions is needed for consistency, quality, precision, and timely decision-making capability for the data being leveraged across multiple organizational applications and databases.

Note

MDM is most effective in organizing and categorizing information so that stakeholders know where the data is when they need it and know that it is being stored and managed in the most efficient fashion.

Figure 17.1 summarizes how MDM encompasses master data and requires interaction across business supporting processes, technology (IT), people, and governance in order to function properly. Further, Figure 17.1 illustrates how MDM impacts various business functions, such as sales, customer service, operations, and marketing.

Figure 17.1 MDM Interaction Setup with Business Functions

Let’s consider an example to highlight the importance of MDM and why it makes sense for organizations to invest in it. Figure 17.2 shows three different records from three business units in one organization, all showing information on one customer.

Figure 17.2 Different Aspects of the Same Customer Across Different Records

If we go across the customer records from the three databases, we can infer that:

  • The customer is being identified differently depending on the department (that is, by SFDC ID, customer ID, and service ID).

  • The first, middle, and last name fields are similar, although they are not exactly the same across the three databases. The middle name is missing in the marketing and customer service databases.

  • The address field is incomplete in the customer service database.

  • The email address in the customer service database is different from the email address in the other databases.

  • The products are different across the three databases, with the marketing database showing products grouped together as multiple (multi) products and the customer service database showing no entitlement for supporting antivirus.

Based on the same customer records being referenced in different ways, a number of business-focused questions need to be answered:

  • Which of these records should be the single source of truth?

  • What decisions can be driven by the variety of data available to engage the customer further?

  • What type of marketing campaigns would be of interest to this customer?

  • What services or product upsell would appeal to the customer?

  • Which records should be updated if the address or any other information changes?

Wouldn’t it be nice if a customer could contact your support or sales department and get a representative who knows all there is to know about that customer’s history with the organization at a glance and without looking at multiple systems? MDM makes this scenario possible.

Now, the million-dollar question: Does every organization require MDM? Not really. Setting up and maintaining MDM involves time, effort, and cost; an MDM solution may cost a couple hundred thousand dollars. The size and complexity of an organization, as well as the amount of data the organization consumes, might make MDM essential. However, if the data interactions in an organization do not call for it, MDM isn’t really necessary. Think about an organization that has more than 100,000 employees and serves a large number of customers; in this case, MDM would be much more beneficial than it would be for an organization with around 100 employees and a handful of customers.

MDMs can be deployed on-premises or can be cloud based and can use a pay-as-you-go (PAYG) structure.

Consolidation of Multiple Data Fields

MDM can help eliminate duplicate records by merging them together into a single, consolidated record. This would be great for the scenario shown in Figure 17.2 as the customer records across the three different databases could potentially be consolidated into one database that contains true information about the customer and the relationship of the organization with that customer.

During the consolidation process, MDM leverages a central repository known as a hub (or consolidation hub). A hub is a single source of truth where all the master data from multiple data sources is consolidated. Once the data is in a hub, MDM leverages algorithms for cleansing, matching, and merging to come up with a complete single record—the golden record—that is stored in the hub. After this consolidation, all systems and applications will leverage the golden record instead of pulling master data from various systems.

Building on our example of customer records across multiple databases, Figure 17.3 shows how data is consolidated in MDM (into the hub) and what the golden record looks like.

The golden records may be used for analytics, report generation, or customer interaction.

Figure 17.3 Golden Record Creation in an MDM Hub

Note

Various MDM suites, such as Oracle, Informatica, TIBCO, and SAP, all have some sort of MDM consolidation hub.

Data stewards play an important role in MDM because they can further refine the output of the algorithms and ensure that the golden records are accurate and complete. (Data stewards were introduced in Chapter 15, “Data Governance Concepts: Ensuring a Baseline.”) They also ensure that proper governance is adhered to when it comes to golden records and look after aspects of life cycle management.

Now, you might ask: How does the field matching and merging process work? The matching process uses matching rules (or matching groups) to identify similar records across columns and shortlist them for merging. Matching can be exact, where values must be identical, or fuzzy, where similar or nearly similar values also count as a match. Merging is driven by the MDM post-matching process, which tags fields with identical data so that values can be merged to arrive at golden records.
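To make the difference between exact and fuzzy matching concrete, here is a minimal Python sketch. It is not how any particular MDM suite implements matching; the function names, the normalization step, and the 0.85 similarity threshold are all assumptions chosen for illustration. It relies only on the standard library's difflib module.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(value.lower().split())


def exact_match(a: str, b: str) -> bool:
    """Exact matching: values must be identical after normalization."""
    return normalize(a) == normalize(b)


def fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy matching: values must be similar enough (the threshold is an assumed cutoff)."""
    similarity = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return similarity >= threshold
```

With these rules, "Jonathan Smith" and "Jonathon Smith" fail the exact rule but pass the fuzzy rule, so a matching process that uses fuzzy rules would shortlist them as candidates for merging. In practice the threshold is a tuning decision: set too low, it merges different customers; set too high, it misses real duplicates.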

At a high level, MDM data consolidation occurs in the following stages:

  1. Data loading and initial check: In this stage, the data is loaded from the master database, and the data is checked against a set of predefined rules to ensure that it is clean.

  2. Data matching: This stage involves matching the data based on matching rules or matching groups to find duplicates.

  3. Finding best records: Best records are created based on matching rules or matching groups.

  4. Merging: Data is merged based on matched field values.

  5. Validation and promotion: Once the newly created records are validated, they are promoted to become golden records.
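The five stages above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of a real MDM product: the sample records, the required-field rule, the email-based matching rule, and the "longest value wins" rule for picking the best field value are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records for one customer held by three departments.
SOURCE_RECORDS = [
    {"source": "sales", "name": "Jane Q Doe", "email": "jane.doe@example.com",
     "address": "1 Main St, Springfield"},
    {"source": "marketing", "name": "Jane Doe", "email": "jane.doe@example.com",
     "address": "1 Main St, Springfield"},
    {"source": "service", "name": "Jane Doe", "email": "JANE.DOE@example.com ",
     "address": ""},
]

REQUIRED_FIELDS = ("name", "email")  # assumed predefined rule for the initial check


def load_and_check(records):
    """Stage 1: trim values and keep only records with the required fields present."""
    cleaned = []
    for rec in records:
        rec = {key: value.strip() for key, value in rec.items()}
        if all(rec.get(field) for field in REQUIRED_FIELDS):
            cleaned.append(rec)
    return cleaned


def match(records):
    """Stage 2: group likely duplicates using one matching rule (case-insensitive email)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["email"].lower()].append(rec)
    return list(groups.values())


def best_and_merge(group):
    """Stages 3 and 4: per field, keep the most complete (longest) value in the group."""
    fields = [field for field in group[0] if field != "source"]
    merged = {field: max((rec[field] for rec in group), key=len) for field in fields}
    merged["email"] = merged["email"].lower()
    return merged


def validate_and_promote(record):
    """Stage 5: promote a record only if every field is populated."""
    return record if all(record.values()) else None


def consolidate(records):
    """Run all stages and return the list of golden records."""
    golden = []
    for group in match(load_and_check(records)):
        candidate = validate_and_promote(best_and_merge(group))
        if candidate is not None:
            golden.append(candidate)
    return golden


golden_records = consolidate(SOURCE_RECORDS)
```

Here the three department records collapse into a single golden record that carries the fuller name from sales and the complete address that the service record lacked. A real deployment would use several matching rules and would route uncertain merges to a data steward rather than promoting them automatically.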

Finally, after golden records are created, as mentioned earlier, they are used for analysis and business intelligence reporting. Figure 17.4 illustrates this process.

Figure 17.4 Consolidation Process: Golden Record Generation

Standardization of Data Field Names

Organizations typically have standards in place to manage things (including data) in a certain way. An organization that is standardizing data is trying to achieve a consistent and well-defined data format across all business functions. Converting multiple datasets into a common data format helps everyone understand and transact in the same way; however, in the real world, this doesn’t often happen, and many types of data formats coexist. For example, one software vendor might structure its datasets in a unique way; as another example, the fields that different datasets cover may vary.

Note

Data standardization is an ideal state that every organization should strive to achieve. Realistically, however, it takes a lot of time and effort to go through the iterations and set up the process to ensure that all data being used is homogeneous and standardized.

Let’s consider an example of how data standardization can bring harmony to the way data is managed in an organization. Consider a sales customer database that captures customer details such as:

  • Name

  • Address

  • Email

  • Phone

  • Unique identifier

  • Dates when transactions happened or will happen (new sale, renewal, and so on)

These fields are visually depicted in Figure 17.5.

Figure 17.5 Sales Customer Database Fields and Format

The organization’s marketing business unit has the database fields and formats shown in Figure 17.6 (which compares the marketing data fields and formats with the sales data fields and formats).

Figure 17.6 Sales vs. Marketing Customer Database Fields and Formats

As you can see, the two databases have fields that differ not only in their names but also in their formats and expected information. This is a classic case of an organization maintaining multiple databases without any uniformity. This lack of uniformity leads to formatting issues and errors due to overlapping or absent data. Data standardization has a number of benefits:

  • Creating a single organizationwide view of all data fields

  • Enhancing productivity and reducing costs due to data overlapping or errors

  • Enabling the organization to maintain a clean and trusted master database that can be governed

  • Enabling the same data to be leveraged across the organization

Data standardization is commonly done when onboarding datasets from internal or external databases that are based on varying definitions for fields and/or formats and transforming them into a trustworthy central dataset with common fields and a common data format.

Data standardization can be performed based on predefined business standards using rules or by leveraging third-party tools. While some data standardization might be trivial (such as capitalization of all characters, removing punctuation, or reordering date or time units), some datasets might require creation of rules or algorithms offered by third-party solutions, such as DataLadder, Datamation, and Experian.
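The rule-based approach can be illustrated with a short Python sketch. Everything in it is an assumption made for the example: the department-specific field names, the choice of standard names, and the list of date layouts are hypothetical and would come from an organization's own business standards.

```python
import re
from datetime import datetime

# Hypothetical mapping from department-specific field names to one standard name.
FIELD_NAME_MAP = {
    "cust_name": "name",
    "CustomerName": "name",
    "e-mail": "email",
    "Email_ID": "email",
    "phone_no": "phone",
    "Telephone": "phone",
    "sale_dt": "transaction_date",
    "RenewalDate": "transaction_date",
}

# Assumed date layouts found in the sources; each is rewritten as YYYY-MM-DD.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def standardize_date(value: str) -> str:
    """Reorder date units into a single format, trying each known layout in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")


def standardize_phone(value: str) -> str:
    """Remove punctuation and spaces, keeping digits only."""
    return re.sub(r"\D", "", value)


def standardize_record(record: dict) -> dict:
    """Rename fields to the standard names and apply the per-field formatting rules."""
    out = {}
    for field, value in record.items():
        name = FIELD_NAME_MAP.get(field, field)
        if name == "transaction_date":
            value = standardize_date(value)
        elif name == "phone":
            value = standardize_phone(value)
        elif name == "name":
            value = value.strip().upper()
        elif name == "email":
            value = value.strip().lower()
        out[name] = value
    return out
```

For instance, a marketing record with the fields CustomerName, Telephone, and RenewalDate comes out with the same field names and formats as a sales record that used cust_name, phone_no, and sale_dt. Note that a rule list like DATE_FORMATS cannot resolve truly ambiguous values such as 03/04/2024; deciding which layout a source uses is part of the standardization work itself.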

Data Dictionary

A data dictionary is a centralized store for metadata (which, you’ll recall, is data about the data). It is not unusual to have very complex database structures and multiple fields across these databases. In such cases, the data dictionary is important because it can explain and expand on information such as the following:

  • The names of fields in the databases

  • The data types that are stored in those fields

  • The roles that have access to the data in these databases

  • The relationships between fields across the databases

  • The security constraints that are applicable

Hence, data dictionaries can make it simpler to navigate through the tons of data that get processed from multiple sources in MDM.
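As a rough illustration of the kind of metadata listed above, the following Python sketch models a data dictionary as a small set of entries and uses it to check a record and to look up access by role. The entry layout, field names, and roles are hypothetical; real data dictionaries are usually kept in a database catalog or a dedicated tool rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldEntry:
    """One data dictionary entry: metadata about a field, not the data itself."""
    name: str
    data_type: type
    description: str
    allowed_roles: tuple
    related_to: str = ""  # related field in another database; empty when there is none


# Hypothetical entries for a sales customer table.
SALES_DICTIONARY = {
    "customer_id": FieldEntry("customer_id", int, "Unique customer identifier",
                              ("sales", "dba"), "marketing.customer_id"),
    "email": FieldEntry("email", str, "Primary contact email",
                        ("sales", "marketing", "dba")),
}


def check_record(record: dict, dictionary: dict) -> list:
    """Return the problems found when a record is compared with the dictionary."""
    problems = []
    for field, value in record.items():
        entry = dictionary.get(field)
        if entry is None:
            problems.append(f"{field}: not defined in the data dictionary")
        elif not isinstance(value, entry.data_type):
            problems.append(f"{field}: expected {entry.data_type.__name__}")
    return problems


def can_access(role: str, field: str, dictionary: dict) -> bool:
    """Look up whether a role may read a field, using the access metadata."""
    entry = dictionary.get(field)
    return entry is not None and role in entry.allowed_roles
```

A record that stores customer_id as text, or that carries a field the dictionary does not define, is flagged by check_record, which is one way a dictionary supports consistency across sources.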

Figure 17.7 shows a data dictionary developed for sales, marketing, and other sales-focused databases in an organization that sells products and associated services.

Figure 17.7 Data Dictionary Overview

Figure 17.8 shows a database that is built leveraging this data dictionary.

Figure 17.8 Database Built Leveraging Data Dictionary

ExamAlert

Data dictionaries are an important topic. You should expect to see a few questions related to them on the CompTIA Data+ exam.

There are two ways to build data dictionaries, and each method builds a different type of data dictionary:

  • Active data dictionary: A data dictionary can be built automatically by a database (or databases) such that the data dictionary and the database(s) remain in sync. This is a major advantage as there is no need to keep the data dictionary updated manually.

  • Passive data dictionary: A passive data dictionary is manually built and maintained. It is not tied to any specific database and contains only reference information across one or more databases. Compared to an active data dictionary, a passive data dictionary requires more maintenance.
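The idea behind an active data dictionary can be approximated with SQLite, which keeps its own catalog of tables and columns. The sketch below, which uses Python's built-in sqlite3 module, reads that catalog on demand; the table and column names are hypothetical, and the build_dictionary helper is an illustration rather than a feature of any MDM product.

```python
import sqlite3


def build_dictionary(conn: sqlite3.Connection) -> dict:
    """Read table and column metadata from SQLite's own catalog."""
    dictionary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        dictionary[table] = {
            col[1]: {"type": col[2], "required": bool(col[3]), "primary_key": bool(col[5])}
            for col in columns
        }
    return dictionary


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
before = build_dictionary(conn)

# A schema change shows up the next time the dictionary is read; nothing is edited by hand.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
after = build_dictionary(conn)
```

Because the metadata comes from the database itself, the new phone column appears in the second reading without any manual update. A passive dictionary, by contrast, would be a separately maintained document that someone has to remember to edit after the ALTER TABLE.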

Note

Data dictionaries are not directly used by end users; rather, they are used by database administrators.

The key advantages of developing and maintaining data dictionaries are as follows:

  • An expanded data dictionary allows for enhanced decision making based on analysis of better-understood information.

  • A data dictionary promotes standardization of data and consistency across multiple domains in an organization.

  • A data dictionary provides better documentation about data aspects (metadata).

Circumstances for MDM

In the previous section you learned the basics of MDM and multiple aspects of how MDM helps collate information in an organization—working with data coming from multiple sources, working across multiple domains, and benefitting multiple business functions. This section focuses on some of the key scenarios and use cases where MDM deployment can make data management much more effective.

Mergers and Acquisitions

This section sheds light on the topic of mergers and acquisitions and how MDM helps with governance of data and bringing together disparate data sources.

Organizations aim to grow their business reach and their customer base, and this growth can be organic growth (that is, growth via innovation or developing new products or services) or can occur via mergers and acquisitions (M&A), which pertains to merging with or acquiring another organization that can complement the product or service capabilities and therefore help increase the market size. In the case of M&A, an organization may have the right processes to ensure that there’s still a single source of truth after M&A is complete, or it may have to go through the rather painstaking process of finding the right datasets.

ExamAlert

Mergers and acquisitions are very common in the real world, and data collection and processing can be cumbersome if not managed properly. Be sure to read and understand this topic thoroughly.

Consider an example. Say that Organization A develops CRM software and has a good market share. Customers are now expecting new features and integrated analytics functions, so Organization A looks at acquiring a well-known analytics platform provider, Organization B. This acquisition will give Organization A access to a new client base, which is a significant gain. However, after the acquisition, Organization A struggles to formulate a strategy to combine the customer data from Organization B and loses steam.

In this example, MDM would mitigate the problem by integrating with new data sources and creating a single master source that provides a single source of truth for all enterprise-related data, resulting in minimal errors and minimal redundancy in business processes and giving the outcomes expected. The key aspects from an M&A perspective are timely integration and utilization of data. In our example, if Organization A deployed and prepared the MDM solution before the acquisition, it would take a much shorter time for Organization A to realize the benefits and reach the outcomes.

MDM helps integrate data with varying attributes and formats from disparate sources of data; it offers a single unified representation of all information rather than leaving the information in silos. This integration can be achieved by merging data from multiple systems and managing the master data via MDM to improve not just the data usability but also data governance and streamlined data access, as discussed in the following sections.

Compliance with Policies and Regulations

MDM provides unification of data, which directly relates to how an organization can direct its people, processes, and technology to comply with organization policies as well as local and other regulations—with ease. The key aspect is that properly managed and unified data is reliable and trustworthy; therefore, the compliance with regulations is more seamless.

Where there is a unified view of all the organization’s information, regulated data can be viewed the same way across the whole organization. Thanks to MDM, the organization can more easily determine which data can or cannot be disclosed, which data requires extra security, which data needs access restrictions, and which data can or cannot be shared with any external parties.

Further, MDM platforms make it easier to adhere to regulation frameworks (for example, HIPAA), which means an organization doesn’t have to work it out on its own and go about doing many tasks to secure data. For example, a financial organization could leverage MDM to record, store, and submit know your customer (KYC) data to regulators as required.

Streamline Data Access

MDM brings master data to the right audience by streamlining data access. It streamlines access to data insights from MDM golden records as well as access to compliance data for regulators.

Following are some of the key ways that MDM impacts customer interactions:

  • MDM offers a cohesive view of customers across the organization, which enables sales, services, marketing, and so on to approach customers effectively.

  • MDM improves customer engagement as everyone has the same level of visibility into customer accounts.

  • MDM reduces costs for reengaging with customers as a single campaign covers the desired customers without wasting money on duplicate efforts.

  • MDM offers access to information across applications and enables automation for repetitive processes.

  • MDM offers compliance insights, which would be difficult to obtain on a system-by-system basis.

  • MDM improves decision making as updated data is available to drive insights.

What Next?

If you want more practice on this chapter’s exam objective before you move on, remember that you can access all of the Cram Quiz questions on the Pearson Test Prep software online. You can also create a custom exam by objective with the Online Practice Test. Note any objective you struggle with and go to that objective’s material in this chapter.
