In an earlier post, I discussed how to identify the right sources of master data during an MDM implementation. I argued that this step is a critical factor in realizing MDM benefits early on.

A related challenge is the time we need to spend understanding these sources of master data. Among the many factors that affect your project schedule, understanding complex master data sources can be a daunting task. Some of the challenges we face are:

  • Making sense of the data residing in the sources (data model, relationships between tables, codes used to represent attributes, date formats, etc.)
  • Identifying the master data entities
  • Identifying the master data attributes which constitute an entity
  • Getting a grasp on the quality of the data
  • Interpreting the relationships between different entities


These challenges often prove to be a bigger threat to the project timeline than one would imagine.
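
A quick profiling pass over a source column can give an early read on several of these points: how complete the data is, which codes are actually in use, and which date formats appear. The following is only a minimal sketch using the Python standard library. The function name `profile_column`, the list of candidate date formats, and the sample values are all assumptions made for illustration, not part of any MDM product.

```python
from collections import Counter
from datetime import datetime

# Hypothetical list of date formats to test; extend it for your own sources.
CANDIDATE_DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y"]


def profile_column(values):
    """Summarize one source column: completeness, distinct codes, date formats seen."""
    # Treat None and whitespace-only strings as missing.
    non_empty = [v.strip() for v in values if v is not None and v.strip()]

    # Count how many values parse under each candidate date format.
    format_hits = Counter()
    for v in non_empty:
        for fmt in CANDIDATE_DATE_FORMATS:
            try:
                datetime.strptime(v, fmt)
            except ValueError:
                continue
            format_hits[fmt] += 1

    return {
        "total": len(values),
        "missing": len(values) - len(non_empty),
        "distinct": len(set(non_empty)),
        "top_values": Counter(non_empty).most_common(5),
        "date_formats": dict(format_hits),
    }


# Illustrative sample data, not taken from a real system.
gender_codes = ["M", "F", "m", "", None, "U", "F"]
print(profile_column(gender_codes))

birth_dates = ["2020-01-31", "31/01/2020", "01-31-2020", "N/A"]
print(profile_column(birth_dates)["date_formats"])
```

Mixed-case codes such as `M` and `m`, or several date formats in one column, are the kind of findings worth raising with source system experts early.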

Over years of implementing MDM, I have learned a few steps that prove to be of great help:

  1. Involve source system experts (subject matter experts, business analysts and technical specialists) from the beginning of the project. These people bring in-depth knowledge of the source systems that is crucial to the project.
  2. Make sure your statement of work clearly identifies the involvement of these experts so their availability to the MDM project is well planned.
  3. Carefully analyze the master data attributes and create a detailed source-to-target mapping document. This document should cover:
    • Name of the source table and column
    • Name of the target table and column
    • List of possible values in case of lookup or reference attributes
    • Transformation logic involved during the load process (both at ETL level and at the service layer)
  4. Find a mechanism for pulling master data from the source systems efficiently, using tools such as Change Data Capture (CDC). These tools can replicate data in near real time with little impact on source system performance.
  5. Relationships between different master data entities can be complex. Make sure the cardinality and optionality are well understood and documented. This helps in creating a suitable MDM data model and avoids rework later.
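
To make steps 3 and 5 concrete, here is one way a single row of a source-to-target mapping and a basic cardinality/optionality check could be expressed in code. This is a hedged sketch rather than a prescribed format: `MappingRule`, `relationship_profile`, and the table and column names (`CRM_CUSTOMER`, `GENDER_CD`, `MDM_PARTY`, `GENDER`) are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class MappingRule:
    """One row of a source-to-target mapping document (hypothetical structure)."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    # Possible values for lookup/reference attributes: source code -> target code.
    allowed_values: Optional[Dict[str, str]] = None
    # Transformation applied during load; defaults to trimming whitespace.
    transform: Optional[Callable[[str], str]] = None

    def apply(self, raw: str) -> str:
        value = raw.strip() if self.transform is None else self.transform(raw)
        if self.allowed_values is None:
            return value
        if value not in self.allowed_values:
            raise ValueError(
                f"{self.source_table}.{self.source_column}: unmapped code {value!r}"
            )
        return self.allowed_values[value]


def relationship_profile(parent_keys, pairs):
    """Return (min, max) distinct children per parent.

    A minimum of 0 suggests the relationship is optional;
    a maximum above 1 suggests one-to-many.
    """
    counts = {p: 0 for p in parent_keys}
    seen = set()
    for parent, child in pairs:
        if parent in counts and (parent, child) not in seen:
            seen.add((parent, child))
            counts[parent] += 1
    return min(counts.values(), default=0), max(counts.values(), default=0)


# Illustrative usage with made-up names and data.
rule = MappingRule(
    "CRM_CUSTOMER", "GENDER_CD", "MDM_PARTY", "GENDER",
    allowed_values={"M": "MALE", "F": "FEMALE"},
    transform=lambda s: s.strip().upper(),
)
print(rule.apply(" f "))

customers = ["C1", "C2", "C3"]
customer_addresses = [("C1", "A1"), ("C1", "A2"), ("C2", "A3"), ("C1", "A1")]
print(relationship_profile(customers, customer_addresses))
```

Keeping the mapping in a structured form like this lets unmapped codes surface as explicit errors during a trial load instead of slipping silently into the MDM hub.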

Often, we deal with master data sources that are legacy in nature. These systems have been built and maintained over a long period of time, lack documentation, and include procedures and terminology that are no longer relevant in the current context. To keep these factors from hindering the progress of your project, make good use of the time that source system experts give you. Capture their knowledge in comprehensive documents that your team can reference while building ETL and MDM artifacts.

I am sure many of you have a treasure trove of experiences, observations, and knowledge from dealing with complex sources of master data. Please share your thoughts in the comments.