How to Implement Provider MDM in 12 Weeks

The Health Insurance industry is faced with regulatory, economic, social and political pressures to control the high cost of care and is looking at various approaches to address these challenges. One of the needs at the center of these transformations is to understand trends and measure performance of their provider network.

Companies often find though that provider information is captured independently in multiple systems with the focus on addressing the needs of a specific system. Key provider attributes such as specialty, qualifications and location addresses are captured differently. There is no overall coordination between teams capturing or consuming provider information across the organization, resulting in duplicative effort and inconsistent/inaccurate information across systems. The organization suffers from lack of holistic understanding of their Provider network and inability to answer critical questions such as which providers are highly utilized vs. ones that are not and what differentiates the highest performing providers.

Provider MDM (Master Data Management) is an ideal foundational solution to enable the organization to address these challenges. Because provider directory information is typically a payer organization’s primary touch point with the member, provider data quality directly impacts customer experience. However, companies implementing MDM often find that they need to allocate significant budgets, resources and timelines toward the goal of achieving a data domain golden record. Whether it’s a business transformation program, or a point release along an existing MDM journey, these projects can drag on for years and cost millions of dollars. The average cost of an MDM implementation is about $7 million. These projects may involve as many as eight full-time associates and last as long as three years.

However, companies investigating MDM should not be intimidated by these statistics. A streamlined, focused initiative can dramatically reduce these costs and result in far faster “time to value.”

Based on our recent experience, Provider MDM can be achieved at a significantly lower cost and duration than the industry average, without sacrificing the business benefits that come with a well-executed MDM system and process.


This paper describes a recent initiative undertaken by a large health insurance payer seeking to merge provider data from multiple sources into one centralized repository to manage and govern the data for analytical needs. The mastered provider data was then exported to a Data Warehouse to enable enhanced analytics across a variety of business functions.

By having cleansed, matched and merged provider data, the company expected the following outcomes:

  • The ability to measure member health status, health improvement and intervention management effectiveness.
  • Establishment of a foundational environment that meets immediate, analytic/reporting requirements and can scale to accommodate future population health reporting and analytic requirements.

Our client company’s board of directors had identified these capabilities as strategic imperatives necessary for completion within a short timeframe. The company partnered with Clarity Solution Group to address these needs through a data transformation program.

After completing detailed planning and analysis, Clarity successfully implemented an analytical style Provider MDM solution in just 12 weeks. The Clarity roadmap to accelerated MDM is described in greater detail below.

The timelines in our quick-turnaround MDM roadmap rest on the following assumptions:

  • Duration estimates assume activities within phases occur simultaneously
  • Development, testing and project environments are pre-established and tested
  • MDM software, which is well integrated and easily configurable, has been installed and tested
  • Other key software components are already installed and tested
    • ETL tools
    • Database

  • High-level requirements are finalized and a well-defined manageable subset of providers, such as individual practitioners, is chosen for a first iteration
  • No real-time operational integration is required
  • Low to medium complexity data model (less than 50 attributes other than address data)
  • Minimal complexity for leveraging business rules to create a ‘golden record’

With these prerequisites in place, most enterprises should be able to accelerate their “time to value” with a streamlined, focused MDM implementation that follows a roadmap similar to the one below. While we discuss a single project experience below, the methods are repeatable across MDM domains in the healthcare industry.


2-3 weeks


Achieving a quick-turnaround MDM implementation starts with significant planning. A typical MDM program may include two to three months of planning efforts. This time can be compressed to two to three weeks by staying focused on a distinct data domain and by pre-selecting and setting up a simple MDM tool, such as the Profisee Maestro solution. Before consultants and/or resources are assigned, the following pre-planning tasks should be completed:

  • Schedule team kick-off meeting and send pre-read materials
  • Confirm training logistics with relevant software providers and complete pre-requisites
  • Create a high-level internal plan for work stream and deliverable ownership
  • Finalize third-party agreements as needed
  • Gather list of requested documentation for team review
  • Create project logistics document for team members
  • Complete software training
  • Create preliminary project plan
  • Schedule project plan review meeting
  • Establish project team communication tool(s)
  • Schedule source system specification review meeting(s)
  • Schedule environment specification review meeting(s)

Once the pre-planning effort is complete and the project is officially kicked off, discovery of data requirements can begin in earnest. The team should focus on understanding every nuance of the data from the source system(s) along with what will be needed for the conceptual data model. With just two weeks allocated to this activity, the team must be focused and pointed in the same direction. In over-reaching MDM programs, the effort to document and understand requirements either extends beyond the planned time frame or is not done comprehensively.

With a quick-turnaround MDM implementation, we recommend keeping focused on a distinct problem area, or sub-domain. This allows the team to spend significant effort on data profiling during the planning stage. Having a quantitative understanding of the data characteristics can dramatically alter design approaches and/or priority of requirements. With these firmly defined in the planning stages, the next phases can proceed more quickly than normal.
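As a rough illustration of the data profiling described above, the sketch below summarizes how well a single attribute is populated and how many distinct codings it carries. The field names (`source_id`, `specialty`, `provider_type`) and the sample rows are hypothetical stand-ins for a source-system extract, not the client's data.

```python
from collections import Counter

def profile_attribute(records, attribute):
    """Summarize how well one attribute is populated across source records."""
    values = [r.get(attribute) for r in records]
    # Treat None and blank strings as missing.
    populated = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(populated)
    total = len(values)
    missing_pct = round(100 * (total - len(populated)) / total, 1) if total else 0.0
    return {
        "attribute": attribute,
        "total": total,
        "missing_pct": missing_pct,
        "distinct": len(counts),
        "top_values": counts.most_common(3),
    }

# Hypothetical rows standing in for a source-system extract.
sample = [
    {"source_id": "A1", "specialty": "CARD", "provider_type": "MD"},
    {"source_id": "A2", "specialty": "Cardiology", "provider_type": ""},
    {"source_id": "A3", "specialty": "CARD", "provider_type": None},
    {"source_id": "A4", "specialty": "C-01", "provider_type": "DO"},
]

for attr in ("specialty", "provider_type"):
    print(profile_attribute(sample, attr))
```

A high missing percentage or an unexpectedly large number of distinct values for an attribute is the kind of quantitative finding that can change match-rule design or requirement priorities early.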


2-3 weeks

Iterative Matching

The matching phase in a typical MDM project can consume more than a month, while the team sorts out key dependencies. Even without all the dependencies defined, we recommend starting the entity matching process early. Once data is extracted from the source system(s) and loaded into the MDM tool, the matching process can begin, using a variety of different scenarios. For the provider data, in the case of the client mentioned earlier, several matching rules were created to identify providers that were the same or those that were unique. Complexities were discovered when key attributes, such as Social Security Number or Provider Type, were missing data or contained unexpected values. Key attributes such as primary specialty were found to be coded differently. Once these complexities were identified, the team was able to quickly and proactively make changes to the design or discuss alternative approaches with the data stewards.

There were approximately nine different matching iterations reviewed by project stakeholders for this engagement. The matching process started by the third week of the project and was completed within three weeks. To stick to an aggressive timeline for the matching phase, we recommend using an MDM tool with configurable matching capabilities that does not require custom scripts or coding.
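The idea of configurable matching rules can be sketched in a few lines. This is an illustrative assumption about how such rules might be expressed, not the configuration of any particular MDM tool; the rule names, the attribute names (including `birth_date`) and the rule ordering are hypothetical. The sketch shows one way to keep a missing Social Security Number from producing a match, falling back to a second rule instead.

```python
def normalize(value):
    """Lowercase and trim a value; return None when it is missing or blank."""
    if value is None:
        return None
    text = str(value).strip().lower()
    return text or None

# Each rule lists attributes that must all be populated and equal.
# Rules are ordered from most to least trusted (hypothetical choices).
MATCH_RULES = [
    ("ssn_and_last_name", ["ssn", "last_name"]),
    ("name_dob_specialty", ["first_name", "last_name", "birth_date", "specialty"]),
]

def match_rule(a, b, rules=MATCH_RULES):
    """Return the name of the first rule under which a and b match, else None."""
    for name, attributes in rules:
        pairs = [(normalize(a.get(k)), normalize(b.get(k))) for k in attributes]
        # A missing value on either side disqualifies the rule rather than matching.
        if all(x is not None and x == y for x, y in pairs):
            return name
    return None

# Hypothetical records; the SSN value is deliberately fake.
rec_a = {"ssn": "", "first_name": "Ana", "last_name": "Rivera",
         "birth_date": "1970-01-01", "specialty": "cardiology"}
rec_b = {"ssn": "000-00-0001", "first_name": " ana ", "last_name": "RIVERA",
         "birth_date": "1970-01-01", "specialty": "Cardiology"}
rec_c = dict(rec_b, specialty="CARD")  # same provider, specialty coded differently

print(match_rule(rec_a, rec_b))  # matches on the fallback rule
print(match_rule(rec_a, rec_c))  # no rule matches because of the coding difference
```

Because the rules are data rather than code, each matching iteration reviewed with stakeholders can be a change to the rule list instead of a new script.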


5-6 weeks

Design and Build

Another key work stream for the MDM implementation is the effort to extract and load data from source system(s) to MDM. During this activity, skilled resources familiar with ETL technologies and principles design a solution that optimizes the data load process based on existing IT standards. For MDM, this is a critical component. Careful consideration needs to be applied to several areas, including:

  • Data Transformation: Are data validation rules created and maintained in MDM or ETL?
  • Change Data Capture: When data changes in the source system, how are the changes integrated with MDM?
  • Attribute Reference Values: How are attribute code lists managed, maintained and synchronized between source systems and MDM?
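One common way to approach the change data capture question is to compare a hash of the tracked attributes against the hash stored at the previous load. The sketch below assumes a batch-style comparison (consistent with the no-real-time assumption stated earlier); the key name `source_id`, the tracked attribute list and the sample rows are hypothetical.

```python
import hashlib

# Hypothetical attributes whose changes should flow into MDM.
TRACKED = ("last_name", "specialty", "address")

def row_hash(record, attributes=TRACKED):
    """Hash the tracked attributes after trimming and upper-casing them."""
    joined = "|".join(str(record.get(a) or "").strip().upper() for a in attributes)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def detect_changes(previous_hashes, current_records, key="source_id"):
    """Classify source keys as inserts, updates or deletes since the last load."""
    inserts, updates, seen = [], [], set()
    for rec in current_records:
        k = rec[key]
        seen.add(k)
        if k not in previous_hashes:
            inserts.append(k)
        elif previous_hashes[k] != row_hash(rec):
            updates.append(k)
    deletes = sorted(set(previous_hashes) - seen)
    return {"insert": inserts, "update": updates, "delete": deletes}

last_load = [
    {"source_id": "A", "last_name": "Rivera", "specialty": "CARD", "address": "1 Main St"},
    {"source_id": "B", "last_name": "Chen", "specialty": "DERM", "address": "5 Oak Ave"},
    {"source_id": "C", "last_name": "Okafor", "specialty": "PED", "address": "7 Pine Ln"},
]
previous_hashes = {r["source_id"]: row_hash(r) for r in last_load}

current = [
    # Formatting-only difference: not reported as a change.
    {"source_id": "A", "last_name": "rivera ", "specialty": "CARD", "address": "1 Main St"},
    {"source_id": "B", "last_name": "Chen", "specialty": "DERM", "address": "9 Elm Rd"},
    {"source_id": "D", "last_name": "Singh", "specialty": "CARD", "address": "2 Bay Ct"},
]

changes = detect_changes(previous_hashes, current)
print(changes)
```

Whether this comparison lives in the ETL layer or inside the MDM tool is exactly the kind of design decision the questions above are meant to surface.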

Other key work streams during this phase are configuring MDM to support the data validations, data standardization (names and addresses), survivorship, roles and security, workflow and reporting specifications. The obvious goal here is to avoid software customization to optimize supportability and reduce development and testing time.
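Survivorship, the set of rules that decides which source value wins in the golden record, is usually configured in the MDM tool rather than coded. To make the concept concrete, here is a minimal sketch under assumed rules: prefer the most trusted source, then the most recently updated record. The source names, their ranking and the sample values are hypothetical.

```python
from datetime import date

# Hypothetical source ranking: a lower number means a more trusted source.
SOURCE_PRIORITY = {"credentialing": 1, "claims": 2, "directory": 3}

def survive(records, attributes):
    """Build a golden record one attribute at a time.

    For each attribute, take the populated value from the most trusted
    source, breaking ties by the most recent update date.
    """
    golden = {}
    for attr in attributes:
        candidates = [r for r in records if r.get(attr) not in (None, "")]
        if not candidates:
            golden[attr] = None
            continue
        best = min(
            candidates,
            key=lambda r: (SOURCE_PRIORITY.get(r["source"], 99),
                           -r["updated"].toordinal()),
        )
        golden[attr] = best[attr]
    return golden

matched_group = [
    {"source": "claims", "updated": date(2015, 3, 1),
     "specialty": "Cardiology", "phone": "555-0100"},
    {"source": "credentialing", "updated": date(2014, 11, 15),
     "specialty": "Cardiovascular Disease", "phone": ""},
    {"source": "directory", "updated": date(2015, 6, 30),
     "specialty": "", "phone": "555-0199"},
]

golden = survive(matched_group, ["specialty", "phone", "fax"])
print(golden)
```

Keeping the rules this simple is in line with the "minimal complexity" assumption listed for the accelerated roadmap; attribute-specific exceptions add design and testing time.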

This phase normally takes three to six months. With a simplified data set, design can also be simplified. When you are able to finalize model decisions earlier in an MDM program (and, again, when you work with a toolset that eliminates complexity and emphasizes speed), you can squeeze months from the normal design/build phase.

Unit testing can be accomplished during this phase, along with additional iterative matching and results review(s). The test plan should be documented and reviewed by the project team to ensure that the testing effort is organized and that the roles and responsibilities are established.


3-4 weeks

Test and Deploy

Testing should involve a comprehensive validation of both system and end user interactions. System test cases should be based on ETL best practices, including end-to-end environment validation, initial and incremental load testing, traceability documentation and more.

End-user testing is based primarily on business users’ data review and validation. The end-users, usually the data stewards, will check for the following:

  • False positives: Were any records automatically matched that should have been unique?
  • Missed matches: Should any records have been automatically matched based on the data populated?
  • Records for review: Is the match engine threshold set at the right level to accurately identify unique and matched records?
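The three steward checks above can be tallied against a reviewed sample of candidate pairs. The sketch below assumes the match engine produces a score between 0 and 1 and uses two thresholds (auto-match and review); the threshold values and the sample scores are hypothetical, and a real tool may expose its scoring differently.

```python
def classify(score, auto_match=0.90, review=0.70):
    """Bucket a match score into auto-match, steward review, or unique."""
    if score >= auto_match:
        return "match"
    if score >= review:
        return "review"
    return "unique"

def evaluate(pairs, auto_match=0.90, review=0.70):
    """Tally outcomes for scored pairs that a data steward has judged.

    Each pair is (score, is_same_provider), where the boolean is the
    steward's verdict on whether the two records are the same provider.
    """
    result = {"false_positive": 0, "missed_match": 0, "for_review": 0, "correct": 0}
    for score, is_same in pairs:
        bucket = classify(score, auto_match, review)
        if bucket == "review":
            result["for_review"] += 1
        elif bucket == "match" and not is_same:
            result["false_positive"] += 1
        elif bucket == "unique" and is_same:
            result["missed_match"] += 1
        else:
            result["correct"] += 1
    return result

# Hypothetical steward-reviewed sample.
reviewed = [(0.97, True), (0.93, False), (0.82, True),
            (0.75, False), (0.55, True), (0.30, False)]

print(evaluate(reviewed))
print(evaluate(reviewed, auto_match=0.95, review=0.50))
```

Comparing the two runs shows the trade-off stewards are validating: widening the review band removes automatic errors but increases the manual review queue.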

Testing and deployment can take as long as six months in a traditional MDM implementation. In this case, the planning for testing has been addressed in each earlier phase. When an accelerated MDM program can launch after the infrastructure and environments are in place, the program can deploy and achieve value in short order.


Companies that have implemented Provider Master Data Management continue to realize significant benefits. Most medium-sized and large enterprises are somewhere on the MDM maturity curve. Those considering the launch of a new phase of MDM have many considerations to weigh, including IT architecture, business process impact and alignment to enterprise strategy. Primary among these considerations is often cost – both dollars and other resources.

“Clarity helped us deliver value to the business with a streamlined, consolidated process for validating and governing our critical provider data,” said (name, title, and firm). With quick turnaround MDM for this firm’s provider data, we were able to deliver reliable and actionable information to the business in record time. Benefits include speedier analytics and higher quality data that help reduce claims errors and inefficiencies.

Clarity has demonstrated a proven roadmap to lower-cost and faster time-to-value for MDM programs. New implementation approaches and improved software are lowering the entry barriers to MDM. The next generation of MDM is here, and companies that are ready can improve their competitive advantage.

4 key steps to success with master data management

Master data management is at the heart of business intelligence, big data and information governance. Healthcare organizations that don’t realize the significance of this truth risk expending a lot of effort without fixing root causes of problems.

Effective MDM provides an integrated, 360-degree view of the business by combining processes and tools to create an authoritative “master” source of core information that can be shared across an organization as a strategic asset.

Most organizations have launched one or more MDM initiatives. But research by The Data Warehousing Institute shows that 44 percent of those surveyed approach MDM in silos. This defeats the main purpose of MDM: to break out of a siloed view by integrating and sharing data across the business. For this and other reasons, many MDM efforts fail to provide the integrated view so key to business intelligence. The good news is that the untapped potential of MDM is well within reach.

When an MDM effort falters, it takes a clear-eyed investigation to discover what’s impeding success, which tends to be the result of two issues:

  • Confusion over who owns MDM, which by default becomes the responsibility of IT. Vendors are eager to sell a software solution as the answer. Avoiding the perception of MDM as merely a technology solution is a key factor, and business ownership is essential.
  • Lack of knowledge about what it takes to succeed. Conventional wisdom holds that MDM tools and techniques are mature, and that any company can apply them successfully. However, gathering the tools is the easy part; building effective processes and creating a common understanding is 80 to 90 percent of the effort.

Know your MDM status
Determining where an MDM effort has stalled is the first step in moving forward. Was it during the development of the strategy, business case or roadmap? Was it further on, during process or organizational design? Or did things look good until the actual implementation? A few examples of MDM gone awry:

A U.S. retailer’s third attempt to launch an MDM initiative was mired in debate about which data to include, how to coordinate process and system changes—even how to make decisions. In this case, the technology got ahead of the business processes and organization needed to support it.

At a $25 billion global financial company, where several false starts had generated little but PowerPoint proposals and consulting fees, the internal champions still couldn’t get funding for an MDM initiative that all agreed was “the right thing to do.” No compelling business case had been built to justify the effort.

Name the problem
Once it’s clear where an MDM effort has stalled, take a close look at what’s gone wrong. Some of the most common problems manifest as declining IT delivery performance or stalled initiatives; inconsistent execution and workarounds; ineffective governance; declining data quality; or falling back into old habits.

Find the root cause
What’s causing the problem? Internal cross-functional client teams of business, IT, data management and governance stakeholders need to “peel back the layers” and identify the root cause. Some of the most common causes include:

  • Lack of a clear MDM strategy and roadmap
  • An inadequate or nonexistent business case
  • Differing definitions of core concepts and key terms
  • Imbalance between IT and business ownership
  • Misalignment of incentives and goals
  • Insufficient time or resources
  • Wrong skills or personalities
  • Excessive complexity or bureaucracy
  • Too gradual a change

In the case of the retailer who had treated MDM as an IT initiative, it found that it needed to involve business stakeholders in decision making. Its early efforts had no standard for evaluating existing data or common understanding of how to define core master data. Even the decision-making process and timeline were poorly defined.

Remediate to get results
Root cause analysis should provide the insight to create and execute an MDM remediation plan that works. Most successful MDM plans:

  • Promote business ownership
  • Run shorter projects to get value and learn quickly
  • Balance objectives with the resources available
  • Devote time to the foundation: definitions and analysis
  • Establish clear responsibilities and metrics
  • Ensure that repetition and reinforcement form habits
  • Prioritize an effective change process

The global financial company with the weak MDM proposal decided to focus initially on a tactical solution that would enhance the benefits of other high-priority initiatives. The increased benefits produced a strong ROI and business case for establishing a dedicated MDM team. Once underway, the company identified other opportunities and MDM became a fundamental part of how the business operates.

Master data management is the foundation for new insight and opportunities through big data, advanced analytics and information governance. But it takes more than technology to succeed. Just as most MDM problems are about people and processes, so are most MDM solutions.

Will Bryant

Will Bryant is a healthcare data management expert for Point B Management Consultants.

Kevin Mackey

Kevin Mackey is Point B’s National Practice Director for Business Technology, Data Analytics and Digital Services.

Agile Master Data Management (MDM)

The primary goals of Master Data Management (MDM) are to promote a shared foundation of common data definitions within your organization, to reduce data inconsistency within your organization, and to improve overall return on your IT investment. MDM, when it is done effectively, is an important supporting activity for service oriented architecture (SOA) at the enterprise level, for enterprise architecture in general, for data warehouse (DW)/business intelligence (BI) efforts, and for software development projects in general. Traditional approaches to data management (DM), particularly those based on extensive modeling and a serial approach to performing the work, have a poor track record in practice. MDM is likely to struggle if you do not move away from traditional DM strategies. In this article I show that agile software development strategies, which are based on evolutionary development, collaborative approaches to working, and a focus on providing concrete value to the business, offer significant value for MDM efforts.

Agile software development (ASD) is an evolutionary approach which is collaborative and self-organizing in nature, producing high-quality systems that meet the changing needs of stakeholders in a cost effective and timely manner. MDM and ASD are clearly different things, although they are clearly compatible. An agile approach to MDM:

  1. Addresses basic MDM activities
  2. Is collaborative
  3. Is embedded within the development process
  4. Is an enterprise activity
  5. Is evolutionary
  6. Is usage-driven
  7. Produces measurable results
  8. Delivers quality through testing
  9. Adopts a lean governance approach
  10. Requires a cultural shift

  1. Agile MDM Addresses Basic MDM Activities

The main differences between “Agile MDM” and “traditional MDM” are centered on the approach to doing the work, not the fundamental work itself. In other words, when you do the work, how you do it, and who you do it with are the critical issues. An agile approach to MDM achieves the goals of MDM (promoting common data definitions, reducing data inconsistency, and improving IT ROI) by embedding MDM activities into the overall software process in a manner which reflects the environment of modern IT departments. The following basic MDM activities are still performed (if and when they make sense) with an agile approach, but as you’ll see they’re accomplished in a more effective and efficient manner:

  • Classify data elements (data classification)
  • Consider data access (data security)
  • Identify pertinent master data elements (MDEs) such as entity types, data elements, associations, and so on.
  • Define and manage metadata pertaining to MDEs, including:
    • Primary source(s) of record for MDEs
    • How systems access MDEs (identifying producers and consumers)
    • Volatility of MDEs
    • Lifecycles of MDEs
    • Value to your organization of individual MDEs
    • Owners and/or data stewards of MDEs
  • Adopt tools, including modeling tools and repositories, to manage MDM metadata
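The metadata items listed above can be captured in a very lightweight structure before any dedicated repository is adopted. The sketch below is one possible shape, not a prescribed schema; the class name, field names, placeholder defaults and example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MasterDataElement:
    """Lightweight metadata record for one master data element (MDE)."""
    name: str
    classification: str                      # data classification
    sources_of_record: List[str]             # primary source(s) of record
    producers: List[str] = field(default_factory=list)
    consumers: List[str] = field(default_factory=list)
    volatility: str = "unknown"
    lifecycle: str = "unknown"
    business_value: str = "unknown"
    steward: str = "unassigned"

    def gaps(self) -> List[str]:
        """List metadata fields that are still empty or at placeholder defaults."""
        missing = []
        for field_name, value in vars(self).items():
            if value in ("unknown", "unassigned") or value == []:
                missing.append(field_name)
        return missing

# Hypothetical example: only what is known today is filled in.
specialty = MasterDataElement(
    name="provider_specialty",
    classification="internal",
    sources_of_record=["credentialing"],
    consumers=["data_warehouse"],
    steward="provider_data_team",
)

print(specialty.gaps())
```

Recording only what is known and listing the gaps fits the evolutionary approach argued for in this article: the remaining fields are filled in when a development team actually needs them.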

Any agilist reading the above list is likely reeling from the potential for out-of-control bureaucracy surrounding MDM. Considering the past track record of most data management efforts (more on this in a bit), this is a significant concern. As you’ll learn in the rest of this article, it is in fact possible to streamline MDM efforts so that the value is achieved without the pain of needless bureaucracy, although as you would imagine this will require significant culture shifts in some organizations.

  2. Agile MDM is Collaborative

The best way to deliver value is to work closely with development teams and their stakeholders to ensure that the MDM effort is focused on supporting the creation of business functionality that stakeholders actually need now, not at some undefined point in the future. Traditional, documentation-heavy, command-and-control approaches to MDM are often doomed to failure because the MDM program is too tedious for teams to follow. With a collaborative approach to MDM:

  • The enterprise administrators and enterprise architects are actively involved with working with the teams to support and enhance the MDM efforts.
  • They make it as easy as possible for the development teams to do the right thing by collaborating with them to do so.
  • They do a lot of the “MDM grunt work” which the teams would have otherwise avoided.
  • You work together in face-to-face collaborative working sessions. These prove to be far more effective than traditional approaches such as formalized meetings, reviews, or functionally distributed teams (where the data specialists work on their own in parallel to the development teams).

It is very easy to claim that you intend to take a collaborative approach to MDM, but a lot harder to actually do so. Traditional data management has a poor track record of working together closely and effectively with development teams, as you see in Figure 1. This chart summarizes the results of two questions asked in Dr. Dobb’s Journal (DDJ)’s 2006 State of Data Management Survey: the first question asked whether development teams find the need to go around their organization’s data group (the majority did) and the second asked why they did so. Interestingly, 25% of the problem was around simple education issues with developers (they need to know who to work with and when to do so) and 75% of the problem rested on the shoulders of the data group (people found them too difficult to work with, too slow, or simply not offering sufficient value). The point is that if your development teams are currently frustrated with the level of service provided by your organization’s data group, then it will be that much more difficult for the data group to make inroads into the teams to support any sort of MDM effort.

Figure 1. Reasons why development teams go around data groups.

  3. Agile MDM is Embedded Within the Development Process

If the MDM activities, particularly the ones involving work to identify and capture metadata, are separate from day-to-day development activities then there is very little chance of your MDM program succeeding. The easiest way to embed MDM activities into your development process is to educate team members on the importance of MDM and to ensure that one or more people have the appropriate skills to collaborate with the enterprise administrator(s) and enterprise architect(s) responsible for MDM efforts. If your team has one or more agile DBAs then MDM activities should be part of their daily jobs, and ideally they will have tools which automate as much of this work as possible.

The challenge is that development teams in general, and in particular agile teams with their focus on high-value activities, will be reluctant to do this sort of data-oriented work if they perceive it as extraneous. Worse yet, few development methods explicitly include these sorts of activities, in part because the people behind the methods often lack experience in such activities but mostly because the data community struggles to make their techniques relevant to modern-day development.

  4. Agile MDM is an Enterprise Activity

MDM by definition must have an organization/enterprise-level view, and an agile approach to MDM is no exception. However, that doesn’t mean that MDM has to be an onerous, command-and-control activity which does little more than justify the existence of your data management group for the year or two that they’re able to milk MDM before it fails due to not producing measurable value. Instead, with a collaborative and lean approach your enterprise administrator(s) and enterprise architect(s) can achieve the stated goals of MDM in a sustainable way.  Agile MDM is both a project-level and an enterprise-level activity, and the needs of these two levels will need to be balanced in a manner which reflects your unique situation.

  5. Agile MDM is Evolutionary, Not Serial

The evidence that evolutionary (iterative and incremental) approaches to software development are superior to serial approaches has been mounting for years. This is true of data-oriented activities too, as this site clearly shows. Technically it is quite easy to take an evolutionary approach to IT activities, including data activities, but the true challenges often prove to be cultural.

Not only is it possible to analyze legacy data sources, to collect metadata, and then support development teams in an evolutionary manner, you really have no choice in the matter. This is obvious for several reasons:

  1. For all but the smallest organizations you simply can’t do all of the requisite legacy analysis and metadata collection all up front without it changing underneath you before you can make it available. 
  2. The business environment is going to change anyway so you’re going to have to evolve your data definitions over time, like it or not. 
  3. You’re only human and as a result you’re going to make mistakes. You have to assume that your understanding of various data elements will change over time regardless of how much time you actually put into the initial definition efforts. 
  4. The needs and priorities of development teams will change throughout the lifetime of a release of a system, let alone the lifetime of the system itself. This will affect how you prioritize your MDM activities.
  5. If your organization chooses to grow through acquisition or partnership then the new firms that you acquire and/or work with will likely have different viewpoints which will motivate you to evolve your existing perceptions. 

With an evolutionary approach to MDM you want to work in priority order. This order should be set by the business, not by the IT department. A common Agile strategy, exemplified in development methods such as Open Unified Process, Extreme Programming (XP), and Microsoft Solution Framework (MSF) for Agile, is to have the stakeholders prioritize the work to be done, not the IT professionals. This strategy is depicted in Figure 2 and described in detail in Agile Requirements Change Management. This enables you to maximize return on investment (ROI) because you’re always working on the most important functionality required by your stakeholders. Yes, your enterprise architecture and enterprise business modeling efforts will still guide your work, but this guidance will be reflected in the overall prioritization of the work.

Figure 2. Agile requirements change management process.

  6. Agile MDM is Usage-Driven, Not Data-Driven

This is probably the most radical advice which I present in this article: data is a secondary concern for MDM, not a primary one. An IBM study into CRM showed that the primary success factors for CRM were business-oriented and cultural in nature and not technical. Considering that MDM is arguably CRM applied to all major business concepts and not just customers, we should really take heed of these findings. In other words, you must focus on usage, not on data.

With a usage-driven approach your major requirements artifacts explain how people will work with, or interact with, the system. Examples of such artifacts include use cases, user stories, and usage scenarios, which are primary artifacts of OpenUP, XP, and MSF for Agile respectively. Business process models could also arguably be used here, but none of the major agile development methodologies use them as a primary artifact, although Agile Modeling includes them as potential models which you should apply where appropriate. When these artifacts are created rigorously they often refer to other types of requirements, such as business rules and report specifications. However, these sorts of details are often explored on a just-in-time (JIT) model storming basis during the project, so many agile teams won’t invest in rigorously documenting them because the useful lifetime of such documentation is very short.

The value in usage models, in particular use cases and usage scenarios, is that they focus on the business objectives which end users are trying to accomplish by using your system(s). If your stakeholders are able to prioritize the various usages, then development teams are in a position not only to deliver something of concrete value (the implementation of the various usages) but also, by implementing them in priority order, to maximize stakeholders’ return on investment (ROI) in IT.

A common mistake which often leads to failure is to let technology decisions drive your prioritization strategies. For example, a favorite IT strategy is to work on one legacy system at a time, analyzing and then cataloging the metadata for the entire system. This sort of initial, detailed cataloging effort can take years to accomplish and will more than likely run out of steam long before any concrete results are produced. Another ill-fated strategy is to focus on specific data entities one at a time. Although this approach has more merit than the previous one, you may find that you need to do this for a large number of entities before you can start providing real business value from your efforts. The fundamental problem is that technical prioritization strategies do not reflect the priorities of the business which you are trying to support, putting any IT effort, including MDM efforts, at risk because your stakeholders aren’t receiving concrete value in a timely manner. When stakeholders don’t perceive the value that they’re getting for their IT investment they quickly start to rethink such investment.

Worse yet, some MDM efforts run aground on the “one truth” shoals – they strive to develop one definition for each data entity within an organization.  In theory this is a laudable goal but in practice it’s virtually impossible because few organizations can actually come to an agreement on the definitions of major concepts. Furthermore, it’s often a competitive advantage for your organization to treat various concepts differently at times based on the given context. A wonderful example of this is HSBC’s series of billboard and airport advertisements around the world showing two different pictures with captions, then showing the same two pictures with the captions swapped.  Figure 3 is a picture that I took in a hallway in London’s Heathrow airport. In short, efforts to try to identify the “one truth” are likely misguided and unlikely to actually produce value. My advice is to worry less about gathering perfect metadata and instead focus on delivering valuable business functionality.

Figure 3. Questioning the “One Truth” philosophy.

  1. Agile MDM Produces Measurable Results

Many traditional IT efforts find themselves in trouble when they take a document-based approach to reporting progress. For example, in earned value management (EVM) you claim progress against your plan when you achieve various milestones called out in those plans. On traditional software development projects these milestones are typically based on delivery of key documentation such as requirements specifications, design specifications, test plans, and eventually the working system. Traditional MDM efforts may choose to measure earned value in terms of the metadata collected, such as the number of entity types or entity attributes defined. The challenge with a document-based approach to measuring earned value is that there is only a tenuous relationship between documentation and the delivery of working functionality which provides real value to business stakeholders. When you think about it, you’re doing little more than justifying bureaucracy with document-based EVM.

Agile teams measure “earned value” in the form of a working solution, which for a software development project is the delivery of working software and for a DW/BI project is the delivery of analytic data and supporting reports. Therefore, with an agile approach to MDM your focus shouldn’t be on collecting metadata (although you will still do that) but instead should be on:

  1. Supporting project teams to deliver high-quality working software which meets the changing needs of their stakeholders
  2. Supporting business stakeholders to access and manipulate data, typically via a DW/BI solution

In other words, don’t do MDM for the sake of doing MDM; do it to streamline stakeholder-facing, data-oriented activities. The only valid measure of your MDM efforts is not the number of data elements collected but the number of “data conformant” reports, data conformant web services, or data conformant components delivered by project teams.

Agile software development teams work in priority order, as you saw in Figure 2, and thereby they maximize stakeholder return on investment (ROI) by focusing on delivering the highest value functionality at any given time. If all of your development teams work in this manner, and because agile MDM work is embedded in the development process, you similarly will maximize the ROI on your MDM efforts.

This differs from traditional MDM efforts which try to capture the required metadata in a “big modeling up front (BMUF)” style effort. This is often in the form of a multi-month if not multi-year effort run by a DM project team in parallel to actual software development projects. There are several problems with the traditional approach to MDM:

  1. It can be months, if not years, before tangible results are produced.  Although many organizations believe that they can succeed at long-term efforts such as this, few actually can in practice. Larissa T. Moss points out in Critical Success Factors for MDM that in the past the data community had a very poor track record with similar metadata schemes which had long-term paybacks. 
  2. Immediate efficiencies are forgone. Although the MDM effort may eventually produce a comprehensive repository of metadata, it misses immediate opportunities to provide actual value to the business. If the MDM effort does eventually achieve a positive ROI it will be lower as a result.
  3. Needless work will occur. People are not good at judging up front what they want. We’ve found that when you define detailed requirements specifications early in the development lifecycle, nearly half of the identified functionality is never used by end users. Therefore a traditional approach to MDM, where you try to comprehensively define the required metadata up front, is equally likely to result in significant wastage.

  1. Agile MDM is Test-Driven

Agile software developers typically take a test-first approach to development, also called test-driven development (TDD) or behavior driven development (BDD), and this is not only possible for data professionals it is highly desirable. With a test-driven approach you write a single test before doing the work to fulfill that test, in effect creating a detailed specification for that functionality before implementing it. Better still, you can run the tests on a regular basis and thereby validate your work in progress. A test-first approach, in combination with other agile testing activities, greatly increases the quality of the work delivered. This shouldn’t come as a surprise – testing as early as you possibly can, and fixing the defects that you do find, and doing so more often, leads to improved quality. 
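As a minimal sketch of what test-first can look like for master data, assume a hypothetical `standardize_specialty` function that maps the free-text specialty values found in source systems to one conformed value. The function name, the synonym table, and the sample values are illustrative assumptions, not part of any particular MDM product. The tests are written first and act as the specification; the function body is then written to satisfy them.

```python
import unittest

# Hypothetical conformed values for a "specialty" master data attribute.
SPECIALTY_SYNONYMS = {
    "cardio": "Cardiology",
    "cardiology": "Cardiology",
    "peds": "Pediatrics",
    "pediatrics": "Pediatrics",
}


def standardize_specialty(raw_value):
    """Return the conformed specialty, or None when the value is unknown."""
    if raw_value is None:
        return None
    key = raw_value.strip().lower()
    return SPECIALTY_SYNONYMS.get(key)


class StandardizeSpecialtyTest(unittest.TestCase):
    # Each test is a small, executable specification of one behavior,
    # written before the function body above was filled in.

    def test_known_synonym_maps_to_conformed_value(self):
        self.assertEqual(standardize_specialty("Cardio"), "Cardiology")

    def test_whitespace_and_case_are_ignored(self):
        self.assertEqual(standardize_specialty("  PEDS "), "Pediatrics")

    def test_unknown_value_is_flagged_not_guessed(self):
        self.assertIsNone(standardize_specialty("podiatry?"))
```

Saved to a file, the tests can be run on every change with `python -m unittest`, which is what turns them into the regression suite the paragraph above describes.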

Traditional teams often take a review-based approach to development, particularly early in the lifecycle when they have no software to work with. Although better than doing nothing at all, reviews prove ineffective in practice when compared with regression testing when it comes to quality. Reviews have a very long feedback cycle, often weeks if not months, and as a result the costs of addressing defects are much higher than with techniques (such as TDD) that have shorter feedback cycles. If someone can offer actual value in a review, why not have them involved with the actual work to begin with? In short, reviews often seem to be a stop-gap measure that compensates for poor collaboration or lack of quality focus earlier in the lifecycle. It is far better to address the real problem, hopefully with Agile strategies, than to simply put a band-aid over it and hope for the best. And the numbers clearly show that traditional approaches to data quality are failing in practice: The Data Warehousing Institute (TDWI) reports that data quality problems result in a loss of over $600 billion annually in the United States.

  1. Agile MDM Adopts a Lean Approach to Governance

Traditional governance often focuses on command-and-control strategies which strive to manage and direct development project teams in an explicit manner. This approach is akin to herding cats because you’ll put a lot of work into the governance effort but achieve very little in practice. Agile/lean data governance focuses on collaborative strategies that strive to enable and motivate team members implicitly. This is akin to leading cats – if you grab a piece of raw fish, cats will follow you wherever you want to go.

An important component of data management is governance of the MDM metadata and of the source data which it represents. My experience is that a traditional, command-and-control approach where the DM group “owns” the data assets within your organization and has a “death-lock” on your databases proves dysfunctional in practice. At best it results in the DM group becoming a bottleneck within your IT department and at worst it results in the development teams going around the DM group in order to get their work done, effectively negating your data governance efforts (some alarming statistics on this in a minute). A better approach is to:

  1. Include data professionals as active participants on development teams.  When your DM group is external to project teams it can foster a “them vs. us” mentality within your IT organization if you’re not very careful. You don’t need to have an external group to run your data governance activities, instead individual data professionals can do so as part of their responsibilities on development teams in a collaborative and timely manner.  This is one of the fundamental concepts of the Agile Data method.
  2. Streamline data standards and supporting activities. When data standards, including master data definitions, are sensible, easy to understand, and easy to access then there is a significantly greater chance that people will actually follow the standards in practice. When you force people to conform to standards, and when you make it onerous for them to do so, you reduce the chance that they will actually do so. Your data administration efforts need to be based on collaboration and enablement, not command-and-control.
  3. Educate developers. Developers need to understand why your MDM efforts are important, what the benefits are, and how to work together with your DM team. When they know why something needs to be done, and how to do it effectively, chances are much better that they’ll actually do it.

  1. Agile MDM Requires a Cultural Shift

The real challenges with MDM have nothing to do with technology but instead with people. In many organizations there is a significant cultural impedance mismatch that you need to overcome between the data management group and the development teams. This will take time. This mismatch was revealed in the results of the IBM survey into CRM as well as a data management survey performed by Dr. Dobb’s Journal in the Fall of 2006. The survey found that 66% of respondents indicated the need to go around their data groups at times, and that of those people 75% indicated that they did so because the data groups were too slow to respond to their requests, provided too little real value to the development teams, or were simply too difficult to work with.

The data community must recognize that we can do better than the traditional strategy for MDM, and for data management in general. Although many data professionals prefer traditional, documentation-heavy approaches, they must recognize that the rest of the IT community has moved on and has adopted more effective ways of working. An Agile approach to MDM is more effective than a traditional approach, for several reasons:

  1. The traditional data management (DM) track record is poor. If you apply traditional DM strategies to MDM then it is fair to assume that you will experience the same levels of success achieved with Customer Relationship Management (CRM) and metadata repositories in the past. Sadly, an IBM Global CRM Survey of over 370 companies worldwide found that in America, Europe and Asia, 85 percent of companies did not feel that their CRM efforts were fully successful. To be fair, perhaps organizations’ expectations weren’t realistic, but if your DM group is making similar promises about MDM that were made about CRM a few years ago then you have cause for concern. Furthermore, as Larissa T. Moss points out in Critical Success Factors for MDM, the data community has clearly struggled in the past with similar metadata schemes. We need to get off the traditional treadmill and start adopting strategies which have a chance of succeeding in practice.
  2. The Agile track record is better. Dr. Dobb’s Journal (DDJ)’s 2007 Project Success Survey showed that agile project teams have a 71.5% success rate compared with 62.8% for traditional teams. Agile enjoys a higher success rate due to its greater focus on return on investment (ROI), its increased ability to meet the actual needs of business stakeholders, and its greater focus on quality.
  3. The Agile community leads in DM thinking. The agile data community represents the leading edge of data-oriented techniques. This community has led the way in evolutionary/agile data modeling, database refactoring, database testing, database integration, and even agile administration techniques. We’ve addressed many of the issues which have thwarted the traditional community for years, particularly when it comes to data quality.

Master Data Management (MDM), when implemented correctly, can provide significant value to your organization. Unfortunately, our track record with similar efforts in the past, in particular Customer Relationship Management (CRM) and metadata repositories before that, was less than ideal. I believe that you will greatly increase your chance of success by applying agile techniques such as working in an evolutionary manner, taking a usage-driven approach, focusing on measurable results, working collaboratively, delivering quality through testing, and adopting a lean approach to data governance.

Master Data Management (MDM) Is Better in the Cloud

Talend Team | Last Updated: March 29th, 2018

Master data management (MDM) is the process of making sure an organization is always working with, and making decisions based on, one version of current, ‘true’ data—often referred to as a “golden record.”

Sounds simple, but in modern business environments, awash with constant streams of data, master data management may be one of the most complex business challenges. Ingesting data from diverse sources and presenting it as one constant, reliable source for verified, real-time information takes a combination of know-how, tools, and often a strategic partnership.

6 Benefits of Master Data Management

Our increasingly digital world and the explosion of cloud technologies have moved data management to the forefront of the CIO’s potential headache list. Since so much of what happens during any given transaction is unseen, tracking, sorting, and verifying background data can be a daunting task. But, effectively managed, here are six real impacts MDM can have on any organization.

  1. Lower total cost of operation

Consider all the aspects of your business that produce and consume data:

  • All applications and their dependencies
  • Employee operations, from production to human resource events
  • Data stores, including hot (working) and cold (archival) information
  • Inventory schedules and ordering protocols
  • And more

A variation in the version or veracity of data from any of these sources can have a chain reaction impact on all associated info, quickly impact operating expenses, and jeopardize the organization’s business. And it’s usually more of a problem than it initially appears. In one study:

  • Only 3% of the data quality scores could be rated “acceptable” using the loosest-possible standard.
  • On average, 47% of newly-created data records have at least one critical (e.g., work-impacting) error.

Trusted, current data has the inverse impact.

  1. Lower architectural bloat through eliminated redundancies

Decreasing lost business isn’t the only way MDM impacts the bottom line. The cost of running and supporting network architecture—whether onsite, hybrid, or cloud-based—is directly impacted by the amount of resources used. This includes storage space, processing time, and network throughput.

By coalescing the data picture into one trusted repository, the need for individual sources to maintain their own resources is eliminated, and IT operational costs can be cut significantly.

  1. Faster deliveries

MDM is a core consideration for modern development approaches like continuous delivery, DevOps, rugged DevOps, and other design architectures that require shared and reliable data.

With a trusted MDM data reservoir feeding development teams, apps and improvements speed through the delivery pipeline far faster. This means MDM discoveries unearthed today can potentially be put to work in software today, rather than after some extended review and recode process.

  1. Simplified compliance

A major challenge in the modern digital business world is compliance, with regulations like HIPAA, PCI, CIPA, GDPR, and more regulatory frameworks rapidly changing required compliance measures. Compliance alone can be (and is, in larger organizations) a full-time pursuit.

MDM will take the grind out of performing mandatory compliance reports and audits by meeting all standards for verifiable, secure data integration.

  1. Improved customer service

As the saying goes, time is money. In a digital world that moves at the speed of modern business this has never been more true, especially when it comes to your customers’ time. By eliminating inconsistencies and errors that impact product delivery, MDM provides a previously unavailable opportunity to interact with your customers at every step of the transaction process, from first app interaction through shipping, delivery, and feedback, and to improve your performance based on real-time feedback.

  1. 360-degree view

A modern, cloud-based MDM process creates a complete, real-time view of each customer. MDM creates a “golden record” that gives marketers up-to-date and accurate information for web personalization, or for prompts about items that other people bought along with a product the customer is considering.
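One common way to assemble a golden record is a survivorship rule that decides, attribute by attribute, which source value wins. The sketch below assumes a simple "most recently updated non-empty value wins" rule. The function name, field names, and sample records are hypothetical, and real MDM tools typically offer richer, configurable rules.

```python
from datetime import date


def build_golden_record(records):
    """Merge source records for one customer into a single golden record.

    Assumed survivorship rule: for each attribute, keep the non-empty
    value from the most recently updated source record.
    """
    golden = {}
    # Process oldest first so that newer non-empty values overwrite older ones.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if field in ("source", "updated"):
                continue  # bookkeeping fields are not master attributes
            if value not in (None, ""):
                golden[field] = value
    return golden


source_records = [
    {"source": "crm", "updated": date(2018, 1, 5),
     "name": "Ann Lee", "email": "ann@example.com", "phone": ""},
    {"source": "billing", "updated": date(2018, 3, 1),
     "name": "Ann M. Lee", "email": "", "phone": "555-0100"},
]

golden = build_golden_record(source_records)
```

Here the newer billing record supplies the name and phone, while the email survives from the older CRM record because the billing value is empty.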

  1. Actionable business intelligence

Developing a clear and current picture of all business operations means decision makers can zoom in minutely on problem points, or pull back to a satellite view to see where national or global trends are impacting your business.

Since data is the foundation and life support of digital environments, the implications for MDM in any environment are as limitless as the data itself. If there is one modern technology that proves this daily it’s how organizations are attacking the challenge of MDM in the cloud.

Master Data Management in the Cloud: 4 Key Challenges

With the cloud, and the myriad opportunities it presents, comes a correspondingly large number of pitfalls that can occur with master data management in a public or hybrid cloud environment. Here are four critical challenge areas to address early, remembering that failure to plan is almost planning for failure:

  1. Account for wildly disparate data types. With all the devices, virtual and physical, involved with keeping customers engaged, no one data storage type will be sufficient for MDM. Structured and unstructured data will flow to and through an organization’s management tools, which must be flexible enough to accommodate it.
  2. Security! First, foremost, and always in modern digital environments, security must be the prime directive. If the advantages of MDM stem from a central source of truth from which an organization can operate, then inbound risks and threats targeting that source have the ability to stop operations dead in their tracks. Hacks, malware, and even ransomware attacks can and will be the result of MDM solutions that don’t keep security first.
  3. Governance. If MDM unleashes great potential power, the responsibility for managing it is equally monumental. Though most of the interactions used to present an MDM solution occur automatically in the background, it’s up to business leaders to decide what data is heavily weighted and how to interpret business intelligence. The right governance approach codifies not just the scope of the data but also who keeps/interprets it. This is the difference between just having a central source of truth, and putting it to powerful use.
  4. Expertise. Finding the right mix of experience and eagerness to learn quickly is probably the biggest MDM challenge many organizations face. Because cloud MDM is such an emergent field, most SMBs lack the internal personnel for crafting a holistic solution that’s customized to meet their needs. Training and development versus outsourcing is a decision to address early.

With these challenges in mind, consider which of the most frequently used designs best fits your needs. And just as importantly, your budget.

Four Master Data Architecture Styles

No single MDM strategy fits every need. The advantage of the practice is its flexible, customizable approach to managing and governing your master data repository. That said, there are four general architectures into which initial MDM designs fall.

  1. Registry Style MDM

In this approach, MDM works with abbreviated records, or “stubs,” that detail the data’s source, current location, and more. Registry is the fastest and least expensive architecture to deploy because it minimizes the amount of data actually moving through MDM tools, instead consolidating stubs into a working repository.

The disadvantages of registry include higher latency inherent in gathering and comparing master records with remote device information. Additionally, registry is a one-way collection, and changes made at the master level do not propagate to remote sources, resulting in inconsistencies between master and remote.
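To make the registry idea concrete, here is a minimal sketch in which the registry stores only stubs (a master ID mapped to source-system keys) and assembles a view on demand. The system names, keys, and the "first source listed wins" rule are assumptions for illustration only.

```python
# Hypothetical source systems, each keeping its own full records.
SOURCES = {
    "crm": {"C-17": {"name": "Ann Lee", "email": "ann@example.com"}},
    "billing": {"B-902": {"name": "Ann M. Lee", "phone": "555-0100"}},
}

# The registry holds only stubs: a master ID mapped to (system, local key).
REGISTRY = {
    "M-1": [("crm", "C-17"), ("billing", "B-902")],
}


def resolve(master_id):
    """Assemble a read-only view by fetching each stub's record on demand."""
    view = {}
    for system, local_key in REGISTRY[master_id]:
        # In practice each lookup is a remote call, hence the added latency.
        for field, value in SOURCES[system][local_key].items():
            view.setdefault(field, value)  # first source listed wins
    return view


view = resolve("M-1")
view["name"] = "Changed in the master view"  # never reaches any source
```

Because the view is built fresh on each call, the edit on the last line is lost on the next `resolve`, which mirrors the one-way nature of the registry style.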

  2. Consolidated Style MDM

A consolidated architecture is similar to a registry, but actually moves data from sources to the master repository.

This approach is popular in environments where latency is expected, and consolidation generally takes place during scheduled batch process windows. However, as with the registry style, data in the master repository is not synchronized with downstream sources.

  3. Coexistent Style MDM

This architectural approach takes consolidated MDM a step further and adds the critical step of synchronizing master data back down to the sources, creating a master record that ‘coexists’ in both the prime repository and at the individual system level.

This is a more complex approach and also comes with high latency, as data needs to be collected and disseminated back downstream via separate batch processes. This architecture is common with small and mid-sized companies that can afford to synchronize master data multiple times per defined period.
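The difference between the consolidated and coexistent styles can be sketched as two batch steps. The example assumes that records have already been matched and share a master ID, and that the first non-empty value seen wins. Both are simplifying assumptions, and the function names are hypothetical.

```python
def consolidate(sources):
    """Batch step 1: copy source records into the master repository."""
    master = {}
    for records in sources.values():
        for master_id, record in records.items():
            merged = master.setdefault(master_id, {})
            for field, value in record.items():
                if value:  # skip empty values, keep the first non-empty one
                    merged.setdefault(field, value)
    return master


def synchronize(master, sources):
    """Batch step 2 (coexistent style only): push master values back."""
    for records in sources.values():
        for master_id, record in records.items():
            record.update(master[master_id])


sources = {
    "crm": {"M-1": {"name": "Ann Lee", "phone": ""}},
    "billing": {"M-1": {"name": "Ann Lee", "phone": "555-0100"}},
}

master = consolidate(sources)   # the consolidated style stops here
synchronize(master, sources)    # the coexistent style adds this pass
```

After the second pass the CRM record also carries the phone number that originally existed only in billing.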

  4. Transactional Style MDM

The most complete architectural approach, transactional style MDM, is also the most costly in terms of overhead. Master data is migrated from the sources to the master repository, where it is processed, cleaned, standardized, then returned to the sources.

This style reduces latency by direct coordination between master and source, and comes with the advantage of enforcing data governance rules across the enterprise. However, it requires a high level of expertise and the right tools for custom coding to ensure proper flow and prevent flawed data from propagating across the environment.
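A rough sketch of the transactional flow follows: every change passes through the hub, is standardized and validated there, and is then returned to the sources immediately. The class name, the validation rules, and the use of plain dictionaries as stand-ins for source systems are all assumptions made for illustration.

```python
class TransactionalHub:
    """Sketch of a hub where every change is validated, then pushed out."""

    def __init__(self, subscribers):
        self.master = {}
        self.subscribers = subscribers  # dicts standing in for source systems

    def upsert(self, master_id, record):
        cleaned = self._standardize(record)
        self._validate(cleaned)          # governance rules enforced centrally
        self.master[master_id] = cleaned
        for system in self.subscribers:  # propagate right away, not in batch
            system[master_id] = dict(cleaned)
        return cleaned

    @staticmethod
    def _standardize(record):
        # Assumes all attribute values are strings.
        return {field: value.strip() for field, value in record.items()}

    @staticmethod
    def _validate(record):
        if not record.get("name"):
            raise ValueError("name is required")
        if "@" not in record.get("email", ""):
            raise ValueError("email must contain '@'")


crm, billing = {}, {}
hub = TransactionalHub([crm, billing])
hub.upsert("M-1", {"name": " Ann Lee ", "email": "ann@example.com"})
```

Because validation happens before anything is stored or propagated, a record that breaks a rule raises an error and never reaches the source systems.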

It’s not uncommon for organizations to begin with one MDM architecture then evolve into another. The measure of a successful MDM build is the efficiency, speed, and consistency with which master data is moved and stored.

Master Data Management and Service-Oriented Architecture

MDM takes on a new significance, and power, in the cloud through its interoperation with service-oriented architecture (SOA). When almost everything, including infrastructure, is virtualized, the costs of inconsistent or corrupt data can be crippling. MDM enables SOAs, including Infrastructure as a Service (IaaS), to work from one source of truth, making enterprise-wide change consistency achievable in near real-time.

A core challenge of MDM in SOA is a data governance approach that standardizes data structure and rules between the repository and the host of remote systems, services, and software. Coordinating a working protocol for exchanging and overwriting data between different systems can be a daunting challenge for existing IT staff. That’s where partnering with a trusted expert simplifies the MDM picture.

Taking the Next Steps with MDM

Every day, MDM in the cloud is changing the speed and reach of business. Achieving a master data management solution allows organizations to close the space between delivered products and users to near real-time, turning data environments into almost living organisms that react and respond to the modern business world.


Mastering data architecture to enable digital transformation

Whether focused on finance, commodities or real estate, organizations are seeking to transform their infrastructures, processes, reporting and customer interactions through digital technologies.

Risk mitigation, the impact of hiring millennials, internal forecasting and reporting, investor and client demands, along with the desire to capture efficiencies while minimizing headcount, are crucial factors in driving adoption and testing of digital technologies. However, with challenges on several levels, many of these digital endeavors fail to get off the ground or achieve what they set out to accomplish.

A big driver of this phenomenon is the ability, or lack thereof, to become a master of data management. Organizations are finding it a challenge to define or develop information architecture (centralized or distributed) that allows for more effective management, storage, reporting and reconciliation of data without the proper information models in place. What may be even more problematic is articulating an enterprise-level plan for achieving related progress.

Unlocking the Power of Information

The information model is not simply data management. It is the gas powering the operating model’s engine, enabling organizations to more effectively communicate and reach their specific goals.

When an information model is properly established, it provides organizations with two distinct benefits. For one, it defines how, where, in what format and for what usage an organization manages and stores data from a technical and systemic point of view, as well as how that translates into personal understanding, utilization and comprehension. That means it is not just solving how internal computers store data, but how internal staff and end-consumers are accessing, visualizing and comprehending data.

Historically, organizations have relied on static information models that tell the enterprise where the information lies, how it is structured and where it lives (i.e., a system of record). Today, a dynamic information model is more desired because it offers greater business value.

A dynamic information model shows where the organization generates data in and out of the company, how data gets through integration layers, and where data is disseminated across the enterprise. If a trade is initiated with basic information, different parts of the organization need to understand who initiated the trade, where it is going, if there were any updates on calculations to ensure reporting is accurate, and whether or not the information is easily accessible for future auditing.

Five Steps to Create Your Target Information Model (TIM)

  1. Granularity

TIM elements should start with high-level information objects. Ask yourself what information is vital to your business. Prioritize them, then iterate for a second and third time. Consider if you need to reprioritize the top three categories. You should end up with items such as customer, product, account, order, invoice and so on.

Mismanaged or poorly implemented information models not only fail to provide value, they also slow down the overall system or process in place. Latent information can be almost as risky as inaccurate or missing information. The right information must be available, unnecessary data must be purged, and cross-system taxonomy translations must be actively maintained to eliminate duplicate information created in parallel. With standards in place, organizations can achieve cross-firm optimization, while provisioning existing data as a value-add to business areas at a lower cost.

The second benefit of an information model relates to permissions to view, overwrite and read the information. The information model should determine who has to provide the information to whom and when, and whether someone else’s access to or use of the information depends on it. This process includes performing verifications such as technical reconciliations, which ensure that certain batch processes are completed, and operational reconciliations, which confirm the accuracy of data elements.
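The two kinds of reconciliation mentioned above can be sketched as two small checks. The job names, the status values, and the ledger and warehouse sample rows are hypothetical, and a production check would read them from a scheduler and from the actual data stores.

```python
def technical_reconciliation(batch_status, required_batches):
    """Return the required batch jobs that have not completed."""
    return [job for job in required_batches
            if batch_status.get(job) != "completed"]


def operational_reconciliation(source_rows, target_rows, fields):
    """Return (key, field, source_value, target_value) for each mismatch."""
    mismatches = []
    for key, source in source_rows.items():
        target = target_rows.get(key, {})
        for field in fields:
            if source.get(field) != target.get(field):
                mismatches.append(
                    (key, field, source.get(field), target.get(field)))
    return mismatches


batch_status = {"load_trades": "completed", "load_prices": "running"}
pending = technical_reconciliation(
    batch_status, ["load_trades", "load_prices"])

ledger = {"T1": {"notional": 100, "currency": "USD"}}
warehouse = {"T1": {"notional": 100, "currency": "EUR"}}
breaks = operational_reconciliation(
    ledger, warehouse, ["notional", "currency"])
```

The first check answers "did the data arrive?", the second "is what arrived correct?". Keeping them separate makes it clearer who needs to act on a failure.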

The information model further informs what each area does and how they all relate to each other. Without it, organizations can run into several issues, including simple awareness about whether a specific piece of data is captured or available, if the person accessing the information can understand it, if people know where to access the data, or if people accessing the information should actually have permissions to do so. Inefficient data access becomes a major issue for organizations that are overhauling their data taxonomies and information model to meet new regulations (e.g., MiFID II or FRTB) or those relying on real-time sensor data to augment control center operations such as power plants, pipelines or air traffic.

Aligning with a Business Strategy Framework

A target information model (TIM) defines an organization’s data architecture by considering it from different viewpoints, such as:

  • Operations—How, where and when data is used and propagated and by which staff members
  • Governance—Which staff members should have access to specific data and who is responsible for input, verification and oversight
  • Content—Data that must be included, tested, checked, scrubbed and transformed
  • Quality—Accuracy and minimum viability for usage and requirements for validity

  2. Breadth and Depth

At this stage, there is no need to go any further than determining common identifiers. Stick with information objects that are “shared” across two or more business functions.

Don’t believe it when you hear that a “product” for one part of the business is completely different than another part. Although they may not have the same form or identity, it is essential to identify the semantic attributes in the TIM. Your analytics and AI efforts would not be the same without this step. It will require deep business understanding, change management skills and patience.

To address the underlying data management, storage and communication issues, it is important to embrace an overall business strategy framework as a first step. The framework is a set of defined architectural principles that enable change, transformation and growth. These principles include the target operating model (TOM), which illustrates how a business is organized and performs tasks; the target architecture model (TAM), or the systemic architecture that underlies the technology and systems a firm uses to complete tasks; and the TIM, which shows how a firm stores, manages, comprehends and views the flows of data across the organization.

Not meant to exist in isolation, the TIM prescribes guidelines and principles for an overarching data architecture, and is interconnected with the TOM and TAM. Without defining these principles, it is difficult to create the information model unless there is a clear line of sight into the current state operating model. Without a TIM, an organization cannot define its TAM.

A joint process across business lines can create an information hub and eliminate fragmented systems and processes. The more organizations can master these principles and models, the more they can optimize the speed and sharing of information without sacrificing security, privacy or controls.

  3. Links to Target Architecture Model (TAM)

For each information object, one or more applications or data stores will be identified as its system of record (SOR). This provides an essential originating node for data lineage, enabling visibility to data access by leveraging the TAM-prescribed integration methods (real-time, near real-time or batch).
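One lightweight way to record this link is a small catalog that maps each information object to its system(s) of record and the integration method used to reach them. The sketch below is an assumption about how such a catalog might be modeled. The application names and the `lineage_origins` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Integration(Enum):
    REAL_TIME = "real-time"
    NEAR_REAL_TIME = "near real-time"
    BATCH = "batch"


@dataclass(frozen=True)
class SystemOfRecord:
    information_object: str
    application: str
    integration: Integration


# Hypothetical catalog: one or more SORs per information object.
CATALOG = [
    SystemOfRecord("customer", "crm", Integration.REAL_TIME),
    SystemOfRecord("invoice", "billing", Integration.BATCH),
    SystemOfRecord("invoice", "erp", Integration.NEAR_REAL_TIME),
]


def lineage_origins(information_object):
    """Return the originating nodes for an object's data lineage."""
    return [(sor.application, sor.integration.value)
            for sor in CATALOG
            if sor.information_object == information_object]


origins = lineage_origins("invoice")
```

Keeping the catalog as data rather than as prose makes it possible to query which objects still depend on batch integration.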

Avoid philosophical debates about a single, central SOR versus multiple, distributed SORs. Stay at the conceptual level and remember that the “T” in TIM stands for “target”. Define your desired state and trust that your road mapping exercise will give you a path.

Outside of the internal reporting and external regulatory pressures, there are other challenges affecting data flows.

One big pain point for financial institutions and commodities companies is their clients wanting to go digital. Clients want instant access to information, portfolio performance or home energy consumption on their smart phones, on a website and in an email. They want to understand what the information means to them both now and in the future, making the manner in which this information is displayed a vital component. But how can a company deliver information that is both timely and personally meaningful if many of the current processes for distributed insights are batch-driven? How can an institution deliver information that is predictive when they are unsure of the state of the data they are relying on?

Another key challenge is simply the sheer breadth of operations of banks, asset managers, and energy and commodities companies. Most of these organizations span many different lines of business often with multiple subsidiaries that carry out similar work across geographies. In order to support that structure, organizations typically have canonical data models backing each business line or subsidiary. For example, if an organization has one ledger or system, they may have different views of that system to support different areas of the company (e.g., institutional, retail, mortgage lending) across geographies.

Similarly, a commodities trading firm may use multiple trade capture systems for various commodities, such as power and gas in one system, crude oil in another, and financial interest rates in a third system. Or it may have different trading systems for North American and European power, yet still require a comprehensive view of global exposure. How can an organization compile that information for internal analytics (e.g., assets, liabilities, P&L), compliance and auditing if there is not a unified or common view? Without one, the work becomes an exercise in translation and concatenation, which can lead to a misreading of the company’s overall risk.
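
To make the translation problem concrete, here is a minimal sketch of mapping records from two trade capture systems into one common view before aggregating exposure. The field names, sample values and translator functions are invented for illustration; real systems will differ, and currency conversion is deliberately left out.

```python
from collections import defaultdict

# Hypothetical raw records from two capture systems with different field names.
POWER_GAS_TRADES = [
    {"cmdty": "power", "region": "NA", "mtm_usd": 120.0},
    {"cmdty": "gas", "region": "EU", "mtm_usd": -45.5},
]
CRUDE_TRADES = [
    {"product": "crude", "geo": "EU", "exposure": 80.0, "ccy": "USD"},
]


def from_power_gas(rec):
    """Translate a power/gas record into the common view."""
    return {"commodity": rec["cmdty"], "region": rec["region"],
            "exposure_usd": rec["mtm_usd"]}


def from_crude(rec):
    """Translate a crude record into the common view (USD only in this sketch)."""
    if rec["ccy"] != "USD":
        raise ValueError("currency conversion is out of scope for this sketch")
    return {"commodity": rec["product"], "region": rec["geo"],
            "exposure_usd": rec["exposure"]}


def global_exposure(*sources):
    """Sum exposure by region across (records, translator) pairs."""
    totals = defaultdict(float)
    for records, translate in sources:
        for rec in records:
            common = translate(rec)
            totals[common["region"]] += common["exposure_usd"]
    return dict(totals)


exposure = global_exposure(
    (POWER_GAS_TRADES, from_power_gas),
    (CRUDE_TRADES, from_crude),
)
```

The point of the design is that each source system gets exactly one translator into the common view, so adding a third system means adding one function rather than reworking every downstream report.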

  1. Links to Target Operating Model (TOM)

For each information object, one or more individuals will be identified as the “data stewards” providing an essential link for data governance activities like standards and quality metrics. From the process perspective, business processes use information objects as inputs, as elements of calculation or synthesis and as outputs, creating information flows.

Be pragmatic and you will find that, most likely, there are business people already behaving as data owners even though they may not be officially called “data stewards”. Creating the TIM is often a good catalyst to establish solid and sustainable data governance.

Furthermore, in areas like over-the-counter (OTC) derivatives, securities clearing and collateral, industry bodies and standards such as SWIFT and FpML are introducing new data taxonomies, languages and standards. This requires organizations not only to normalize their current proprietary information models but also to adhere to industry-wide standards. It also requires that organizations define how, what and when information should be shared, to avoid sharing the wrong data with the wrong person at the wrong time.

Defining a TIM requires an understanding of the organization’s specific business needs and challenges. Whether data within the organization changes frequently or is trapped within multiple underlying vendor systems or off-the-shelf solutions, there are options that make sense depending on the organization’s specific structure and business requirements. These factors will ultimately determine how to implement the information model, from both a technical and a business perspective, to establish a common view of information across the enterprise.

  1. Dynamic View

Combine elements from the TAM (applications and data stores) and the TOM (users and business functions) to create an information flow. This will provide a visual representation of how information flows throughout the enterprise (i.e., where data originates, how it is moved and transformed, and where it is consumed). A common challenge is that shortcomings in previous steps, such as cutting corners on information links, resurface here.
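
One way to picture the dynamic view is as a directed graph whose nodes are TAM applications and TOM business functions and whose edges carry information objects. The sketch below is illustrative only: the node names and flows are hypothetical, and a real TIM would typically be maintained in a modeling tool rather than in code.

```python
from collections import deque

# Hypothetical flows: (source, target, information object carried).
FLOWS = [
    ("TradeCapture", "RiskEngine", "Trade"),
    ("RiskEngine", "DataWarehouse", "Exposure"),
    ("DataWarehouse", "Risk Reporting", "Exposure"),
    ("CRM", "DataWarehouse", "Client"),
]


def upstream(node, flows):
    """Return every node that feeds `node`, directly or indirectly."""
    parents = {}
    for src, dst, _ in flows:
        parents.setdefault(dst, set()).add(src)
    seen, queue = set(), deque([node])
    while queue:
        for parent in parents.get(queue.popleft(), ()):
            if parent not in seen:  # guards against cycles in the flow
                seen.add(parent)
                queue.append(parent)
    return seen


def origins(node, flows):
    """Return the upstream nodes that receive no flow themselves."""
    targets = {dst for _, dst, _ in flows}
    return sorted(n for n in upstream(node, flows) if n not in targets)
```

Tracing `origins("Risk Reporting", FLOWS)` back to the originating applications is a quick check that every consumer can be linked to a system of record; a consumer with no origins usually points to an information link that was skipped in an earlier step.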

Joshua Satten

Joshua Satten is a director of the Fintech practice at Sapient Global Markets, a business management consulting firm in Boston.

Nicolas Papadakos

Nicolas Papadakos is a director at Sapient Global Markets, a business and technology consulting firm.