Lately, I have been working with a few clients who have been implementing reference data management (RDM) systems. I will be writing a series of blog posts sharing my experience working with reference data, including the challenges involved in implementing RDM solutions, architectural best practices and reference data integration considerations.

This post covers some of the basics.

What is Reference Data?
Reference data is the data used to categorize other data within applications and databases. We usually refer to this data as look-ups, code tables or domain values. These code tables are usually characterized by code and value pairs. Think of a code table as the set of allowed values for a given field.

Some of the best examples are state codes, country codes, gender codes and marital status type codes.
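
To make the idea of code and value pairs concrete, here is a minimal sketch of a code table and the "allowed values" check it enables. The marital status codes and the function names are made up for illustration; a real code set would come from your own applications or an industry standard.

```python
# Hypothetical code table: marital status codes mapped to display values.
MARITAL_STATUS = {
    "S": "Single",
    "M": "Married",
    "D": "Divorced",
    "W": "Widowed",
}


def is_valid_code(code_table, code):
    """Return True if the code is one of the allowed values in the table."""
    return code in code_table


def describe(code_table, code):
    """Look up the display value for a code, failing loudly on unknown codes."""
    if code not in code_table:
        raise ValueError(f"Unknown code: {code!r}")
    return code_table[code]


if __name__ == "__main__":
    print(is_valid_code(MARITAL_STATUS, "M"))
    print(describe(MARITAL_STATUS, "W"))
```

A field that stores marital status would then accept only the keys of this table, and reports would show the corresponding values.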

NAICS codes are a great example of an industry-standard reference data set used for classifying business establishments. For example, under NAICS, which is used by federal statistical agencies in North America, the code for pizza delivery shops is 722513. Similarly, every type of business is assigned a code, and this standardized coding helps in collecting, analyzing and publishing statistical data related to the US economy.

Why is it important to manage reference data?
Typically, each application in an enterprise has its own representation of code sets defining the same thing. During integration of master data (or any data) across applications, it’s necessary to translate between the different code table representations in order to categorize data in a consistent way.
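
As a rough sketch of what this translation looks like, assume two hypothetical applications (a CRM and a billing system) that each use their own gender codes, plus a canonical code set that both are mapped to. All codes and names below are invented for illustration and do not represent any particular product.

```python
# Hypothetical mappings from each application's local codes to a canonical set.
CRM_TO_CANONICAL = {"M": "MALE", "F": "FEMALE", "U": "UNKNOWN"}
BILLING_TO_CANONICAL = {"1": "MALE", "2": "FEMALE", "9": "UNKNOWN"}


def invert(mapping):
    """Build a canonical-to-local mapping, rejecting ambiguous (many-to-one) maps."""
    inverted = {}
    for local_code, canonical_code in mapping.items():
        if canonical_code in inverted:
            raise ValueError(f"Ambiguous mapping for {canonical_code!r}")
        inverted[canonical_code] = local_code
    return inverted


def translate(code, source_to_canonical, target_to_canonical):
    """Translate a code from the source application to the target application."""
    if code not in source_to_canonical:
        raise KeyError(f"Source code {code!r} has no canonical mapping")
    canonical_code = source_to_canonical[code]
    canonical_to_target = invert(target_to_canonical)
    if canonical_code not in canonical_to_target:
        raise KeyError(f"Target has no code for {canonical_code!r}")
    return canonical_to_target[canonical_code]


if __name__ == "__main__":
    # A CRM record coded "F" becomes "2" in the billing system.
    print(translate("F", CRM_TO_CANONICAL, BILLING_TO_CANONICAL))
```

Mapping every application to one canonical set, rather than to every other application directly, keeps the number of mappings to maintain proportional to the number of applications.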

Mapping between the different representations and keeping track of changes across all the different code table variations on an ongoing basis can be a major challenge. Many enterprises struggle with this challenge and often use error-prone manual processes to record and manage changes to reference data sets.

Errors in reference data can have a major business impact. Quality issues in reference data have a ripple effect and can cause major problems in downstream applications. The integrity of the reports you generate depends directly on the quality of your reference data. Above all, poor-quality reference data is one of the most common sources of system integration failure.

Key features of Reference Data Management Systems
Earlier, I wrote about the key functionalities of a Master Data Management system. Here are some of the key features of a Reference Data Management system:

  1. Robust interface controlled by role-based security to support collaborative authoring of reference data.
  2. Ability to manage and map relationships between different reference data sets which exist in an enterprise.
  3. Versioning and auditing capability
  4. Hierarchy management
  5. Provisioning of reference data via web services to support an SOA framework
  6. Ability to publish reference data
  7. Efficient load/extract functionality
  8. Good error-tolerant search capability
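
To illustrate the last feature in the list, here is a minimal sketch of error-tolerant search over a small code table, using fuzzy matching from Python's standard library. The country table and the search_values function are assumptions made for this example; an actual RDM product would provide its own search implementation.

```python
import difflib

# Hypothetical code table of country codes and names.
COUNTRIES = {
    "US": "United States",
    "CA": "Canada",
    "MX": "Mexico",
    "DE": "Germany",
}


def search_values(code_table, query, max_results=3, cutoff=0.6):
    """Return (code, value) pairs whose value approximately matches the query."""
    # Compare in lowercase so the search is not case-sensitive.
    by_lowered_value = {value.lower(): code for code, value in code_table.items()}
    matches = difflib.get_close_matches(
        query.lower(), list(by_lowered_value), n=max_results, cutoff=cutoff
    )
    return [(by_lowered_value[m], code_table[by_lowered_value[m]]) for m in matches]


if __name__ == "__main__":
    # A misspelled query still finds the intended entry.
    print(search_values(COUNTRIES, "Germny"))
```

The cutoff controls how tolerant the search is: a lower value accepts more distant spellings at the cost of more false matches.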

In the next post, I will discuss the challenges organizations face with reference data, and how an RDM system can help resolve them.

Please share your views and experience working with Reference Data via comments. What are the driving forces for your RDM implementation?