MDM: The what, why, and how.
Master data management, or MDM, is the practice of managing master data: the objects that provide the foundation for most business transactions. The data in an MDM stays fairly static, with incremental changes over time. Data in your MDM should not be transactional but should instead describe the entities involved in the transaction.
Gartner defines master data as the consistent and uniform set of identifiers and extended attributes that describes the core entities of the enterprise including customers, prospects, citizens, suppliers, sites, hierarchies and chart of accounts.
It sometimes helps to think of master data in terms of the nouns that describe your business processes. For example, Customers, Products, and Cost Centers are all master data objects. While this may seem like a simple rule, it can sometimes be challenging to decide what should be treated as master data.
Bad data is…well…BAD!
There is no denying that data is the lifeblood of every business: customers, sales, website traffic, opportunities, leads, and the list goes on. We live in a world with an ever-increasing reliance on technology, and this technology only generates more data. In a new era ushered in by the COVID-19 pandemic, many businesses have moved to a work-from-home culture. This has raised the need for more technology, more data in, and more data out.
With all of that taken into consideration, it is no exaggeration to say that bad data results in bad business. Bad data can have far-reaching ramifications, including skewing the KPIs that measure business growth, which can translate into bad decision-making for your business.
Why do I need an MDM strategy?
Much of the data that you might put into an MDM sets the foundation for your business success. When you have multiple versions of data across multiple systems, all supporting multiple business functions, you lose the ability to have a single source of truth. When this happens you lose your ability to build a solid foundation for downstream business success.
Your sales team may see the impacts of having bad data when it comes to placing new orders. An Order means nothing if it doesn’t have a valid customer, billing information, a product with correct pricing, and so forth.
What about your marketing team? On the surface you might wonder how not having a single source of truth could impact marketing. After all, all they do is send emails and make things look pretty, right? This couldn’t be further from the truth. Marketing is critical to growing your business, and they can’t do that with bad data. How can they send that newsletter with the holiday weekend sale if there are five different email addresses for a single customer? Which system is the source of truth for marketing, and which email address is valid?
My First MDM Experience
Feel free to skip past this section. I want to share with you my first experience with MDM and how we came to the realization that we needed a single source of truth.
I was working in healthcare, and the company I was with had built its business model around health coaching; let us coach you through giving up tobacco, or losing weight, or managing your diabetes, and other similar programs. The coaching not only generated a lot of data, but also relied heavily on existing data coming in. Data was not in short supply for what we did. Our parent company had a large network of insurance and Medicare providers providing a constant feed of data into multiple systems. Companies not part of the network, but wanting to offer the programs to their employees, could also feed data directly into systems each month.
It isn’t an overstatement to say we were dealing with big data. Every hospital visit, every interaction with a doctor, every prescription filled, and anything else that touched their insurance was fed into our systems. A lot of data to crunch through and analyze so that we could provide the best possible coaching experience.
In theory, this all sounds transactional. Nothing here is a likely candidate for master data. Except, what happens when a person switches jobs? Every time I’ve switched jobs, I got a new group number, a new member number, and even had to create a new login for the online portal. That was even when the next company used the same insurance provider as the previous one.
In the healthcare and insurance world, this is a new record of data. I’m unsure what it is that keeps, say, Blue Cross from seeing that the person from Company A is the same person who now has insurance under Company B (or private insurance). Let’s just say, they don’t do it and it creates problems for downstream consumers.
For our example specifically, we will use the patient who privately pays for coaching. At sign-up, they had Insurance A, but then they switched companies and had Insurance B. Instantly, a disconnect happened in the data. The system couldn’t tie the transactions from Insurance A to the person who was now listed on Insurance B. Our coaching team could only go back as far as the start date of Insurance B and no further. They had to rely on good note-taking and the organization of our coaching data.
Enter the need for MDM. How can we take every single patient who comes into the system and create a single patient record for them, all while tying their transactional data back to this one record? Simple enough…we needed a single source of truth. We needed to make the patient a part of master data.
To summarize, for anyone wondering, we created a scoring system to uniquely identify each person walking the earth. Matches on names and birthdays gave a higher score, while matches on things like address or phone number gave lower scores. If the score was over 85%, we’d create a single master record and create references to tie that record back to the individual records that made up that master. This allowed us to daisy-chain transactional data up to the master. For anything lower than the acceptable score, a person manually went in to compare. A positive match created one record and a negative match created two.
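To make the scoring idea concrete, here is a minimal sketch in Python. It is not the system we actually built: the field names, the weights, and the exact-match comparison are all assumptions made up for illustration. Only the 85% cutoff and the idea that names and birthdays count for more than address or phone come from the description above.

```python
from dataclasses import dataclass

# Hypothetical weights in percentage points (they sum to 100). Names and
# birth dates count for more than address or phone, as described above.
WEIGHTS = {
    "first_name": 25,
    "last_name": 25,
    "birth_date": 30,
    "address": 10,
    "phone": 10,
}
AUTO_MERGE_THRESHOLD = 85  # the 85% cutoff


@dataclass(frozen=True)
class PatientRecord:
    first_name: str
    last_name: str
    birth_date: str  # ISO date string, e.g. "1980-04-12"
    address: str
    phone: str


def match_score(a: PatientRecord, b: PatientRecord) -> int:
    """Add up the weight of every non-empty field on which both records agree."""
    score = 0
    for field_name, weight in WEIGHTS.items():
        left = getattr(a, field_name).strip().lower()
        right = getattr(b, field_name).strip().lower()
        if left and left == right:
            score += weight
    return score


def decide(a: PatientRecord, b: PatientRecord) -> str:
    """Merge automatically above the threshold; otherwise send to a person."""
    if match_score(a, b) > AUTO_MERGE_THRESHOLD:
        return "merge"
    return "manual_review"
```

A real matcher would likely use fuzzy comparison (nicknames, typos, reformatted addresses) rather than exact equality, but the shape is the same: weighted field agreement, a threshold, and a manual review queue for everything below it.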
In this way, we were able to create one patient record for each patient for whom we had information. Using the reference table, we could cross-walk from a transaction (medicine refill) to the insurance, from the insurance to the individual patient record, from the patient to the reference table, and finally hit the master patient record.
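The cross-walk can be pictured as a chain of lookups. The sketch below uses plain dictionaries as stand-ins for the tables; every identifier and table name in it is hypothetical and exists only to show the hops from a transaction up to the master record.

```python
# Hypothetical in-memory stand-ins for the tables in the cross-walk.
refill_to_insurance = {"rx-1001": "ins-A-77", "rx-2002": "ins-B-12"}
insurance_to_patient = {"ins-A-77": "pat-001", "ins-B-12": "pat-002"}
# The reference table ties each individual patient record to its master.
patient_to_master = {"pat-001": "master-42", "pat-002": "master-42"}


def master_for_refill(refill_id: str) -> str:
    """Follow transaction -> insurance -> patient record -> master record."""
    insurance_id = refill_to_insurance[refill_id]
    patient_id = insurance_to_patient[insurance_id]
    return patient_to_master[patient_id]
```

In this made-up data, a refill filed under Insurance A and another filed under Insurance B both resolve to the same master record, which is exactly the link our coaching team was missing before.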
Getting started with MDM.
The first step in establishing a solid MDM system is to start small. You shouldn’t try to boil the ocean, because each data object that you decide to bring into your MDM will have unique complexities. These complexities grow when an object has more records or more data sources. Moving Product data with fewer than 2,000 records is going to be much easier than moving 1 million Customer records.
Defining your strategy.
You should begin by defining a strategy to move data into your MDM environment. Your strategy checklist will ensure you’ve thought about every aspect of your data object before you jump into the integration layer of your project. Below are a few items you should consider for your strategy:
- Identify the owner(s) of the data object. If we look at Customer data as our example, this may be owned by Marketing, Sales, and Customer Service. I like to start by identifying the various systems that hold data related to the object, and then work my way backward to the owners.
- Create a governance committee for the data object. When creating your governance committee, you want to ensure you include your data stewards. The heads of Marketing, Sales, and Customer Service would be a starting point in the earlier Customer data example. The governance committee is going to be responsible for determining how data flows in and out of the MDM environment.
- Develop a model for the data. This step sits primarily with your Data Architect, but they should be seeking input from the governance committee for the data object to ensure that all data elements are captured correctly.
While not comprehensive, this is a good starting point for building a strategy for each data object. Defining the strategy for each object does take a long time, but it is essential to implementing the data object correctly. Rushing through this process could lead to dirty data making it into your master data set and ruining your desired outcome.
Implementing a data object into your master data set.
Once you have your MDM environment set up, you’ll need to identify a good ETL (extract, transform, load) tool. The tool should also be able to handle automated integration and data quality. Good examples of this might be Talend or Dell Boomi.
As you extract data, you should give consideration to what needs to be normalized to fit the master data model. Some examples of data you might want to scrub and normalize would be:
- Phone numbers
- Other data points which could vary across regions, data sources, or business units
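As a rough illustration of what scrubbing a phone number can look like, here is a small Python sketch using only the standard library. It assumes North American 10-digit numbers and a default country code of 1; the function name and those rules are assumptions for the example, and an ETL tool or a dedicated phone-number library would handle international formats far better.

```python
import re
from typing import Optional


def normalize_phone(raw: str, default_country_code: str = "1") -> Optional[str]:
    """Return the number as +<country><digits>, or None if it cannot be normalized.

    Assumes 10-digit national numbers (a North American convention).
    """
    digits = re.sub(r"\D", "", raw)  # drop spaces, dots, dashes, parentheses
    if len(digits) == 10:
        return f"+{default_country_code}{digits}"
    if len(digits) == 11 and digits.startswith(default_country_code):
        return f"+{digits}"
    return None  # leave unrecognized values for a data steward to review
```

The point is that "(555) 010-0123", "555.010.0123", and "1-555-010-0123" all come out as the same string, so they can be compared directly against the master data set.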
After you have scrubbed your data, you’ll need to think through deduplication. If you follow the same transformation rules across all of your data sources, this becomes a much easier task. Each entry should come out as a normalized record that can be easily matched against your master data set. When you find a match, you’ll want a strategy for updating that data. Otherwise, for new data, you’ll want to simply commit it to the master data set.
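Here is one way the match-then-update-or-commit step might look in Python. Treat it as a sketch under assumptions: the match key (normalized email plus phone), the record fields, and the "only fill in blank fields" update strategy are all hypothetical choices, and your governance committee may well decide on different ones.

```python
from typing import Dict, Tuple

Record = Dict[str, str]
MatchKey = Tuple[str, str]


def match_key(record: Record) -> MatchKey:
    """Build a key from already-normalized fields (hypothetical choice of fields)."""
    return (record["email"].strip().lower(), record["phone"])


def upsert(master: Dict[MatchKey, Record], incoming: Record) -> str:
    """Update the matching master record, or commit the record as new."""
    key = match_key(incoming)
    if key in master:
        existing = master[key]
        # Example update strategy: only fill in fields the master lacks.
        for field_name, value in incoming.items():
            if value and not existing.get(field_name):
                existing[field_name] = value
        return "updated"
    master[key] = dict(incoming)
    return "inserted"
```

Because every source goes through the same normalization before `match_key` is computed, two records for the same customer from different systems land on the same key instead of creating a duplicate.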
We have covered some of the most basic considerations for a successful MDM solution. There are still many aspects of a successful MDM strategy and solution that we did not cover. This is a topic that many data professionals spend years studying and it would take multiple posts to cover completely.
Interested in implementing a solid MDM solution but don’t have the time or resources needed to do so? Our sales team at QuadraByte is standing by to help answer your questions on leveraging our services to do the heavy lifting for you. Contact us today to learn more and get a personalized quote.