Over the last few months, there has been increased discussion around bringing Big Data and MDM together. The likes of EMC, Oracle, IBM, Informatica and Talend have spoken about this topic.

A common theme in these discussions is how the two technologies can augment each other by enriching the information each handles on its own. Clearly, companies are trying to align their products, strategy and marketing teams to meet the foreseeable demand for an integrated solution.

MDM is often referred to as the single source of truth for everything related to core customer information. But will we ever know ‘everything’ about our customers, given that so much of the data about them today is high-volume and unstructured? A human can understand this data to an extent and draw meaningful insights from it, but there are far too few people to do this at scale. We need technology that can analyze all sources of data, including social media, channel interactions and sensor data, and find cues. This is where Big Data and Hadoop come in to help!

One of the V’s that is not talked about very often in Big Data discussions is Veracity. Veracity is about the trustworthiness of your data: the degree to which the data can be trusted. As the Variety, Velocity and Volume of data increase, it becomes harder to judge how much of it represents the truth. For example, you may have tons of information coming from your company’s social media feeds, but not everything in it can be trusted.

One of the best examples I came across in a recent webinar was about analyzing the streaming information generated by rental car sensors to find out a simple truth: is this customer a safe driver? Based on an individual’s driving pattern over the rental period (sensor data analysis in this case), the rental company can rate the customer on a scale of 1 to 10 and store the rating in MDM.
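To make the rental car idea a little more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the `SensorReading` fields, the thresholds for hard braking and speeding, the linear mapping onto the 1 to 10 scale, and the `driver_safety_score` attribute name. A real solution would run on a streaming platform and use a properly validated scoring model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SensorReading:
    """One hypothetical telemetry sample from a rental car."""
    speed_kmh: float
    speed_limit_kmh: float
    decel_ms2: float  # braking deceleration, positive when slowing down


def safety_score(
    readings: Iterable[SensorReading],
    hard_brake_ms2: float = 4.0,        # assumed threshold for a hard brake
    speeding_margin_kmh: float = 10.0,  # assumed tolerance over the limit
) -> Optional[int]:
    """Rate a rental on a 1 (risky) to 10 (safe) scale.

    The score falls linearly with the fraction of samples that show
    hard braking or speeding. Returns None when there is no data.
    """
    readings = list(readings)
    if not readings:
        return None
    risky = sum(
        1
        for r in readings
        if r.decel_ms2 >= hard_brake_ms2
        or r.speed_kmh > r.speed_limit_kmh + speeding_margin_kmh
    )
    risky_fraction = risky / len(readings)
    return max(1, min(10, round(10 - 9 * risky_fraction)))


def store_score(master_record: dict, score: Optional[int]) -> dict:
    """Return a copy of the master record with the score attached."""
    updated = dict(master_record)
    if score is not None:
        updated["driver_safety_score"] = score  # hypothetical attribute name
    return updated
```

The point is that only the small, trusted result (a single rating) lands in MDM, while the raw sensor stream stays on the Big Data side.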

So what we end up doing is managing master data in MDM (the way we do it today) and then enhancing it with useful, trusted information drawn from Big Data sources. The challenge is identifying the trusted information in this huge amount of data. We need intelligent technologies such as text analytics, sentiment analysis and streaming data analytics to distil the useful pieces of information that can enhance the existing master information.
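As a rough illustration of the "enhance, but only with trusted information" idea, the Python sketch below merges insights derived from Big Data into a master record. The record layout, the `confidence` value attached to each candidate and the 0.8 cut-off are all assumptions of mine, not features of any particular MDM product. The one rule it shows is that attributes already governed in MDM are never overwritten, and low-confidence insights are dropped.

```python
def enrich_master_record(master, candidates, min_confidence=0.8):
    """Return a copy of `master` extended with trusted derived attributes.

    `candidates` is a list of dicts with the hypothetical keys
    "attribute", "value" and "confidence" (0.0 to 1.0).
    """
    best = {}
    for c in candidates:
        name = c["attribute"]
        # Never overwrite governed master data, and skip untrusted insights.
        if name in master or c["confidence"] < min_confidence:
            continue
        # Keep the most trusted candidate when several target one attribute.
        if name not in best or c["confidence"] > best[name]["confidence"]:
            best[name] = c
    enriched = dict(master)
    for name, c in best.items():
        enriched[name] = c["value"]
    return enriched


master = {"customer_id": "C1", "name": "Jane Doe"}
candidates = [
    {"attribute": "name", "value": "J. Doe", "confidence": 0.99},
    {"attribute": "brand_sentiment", "value": "positive", "confidence": 0.90},
    {"attribute": "brand_sentiment", "value": "neutral", "confidence": 0.85},
    {"attribute": "preferred_channel", "value": "mobile", "confidence": 0.50},
]
enriched = enrich_master_record(master, candidates)
```

Here the name stays as MDM has it, the higher-confidence sentiment is added, and the weakly supported channel preference is left out.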

By the way, like everyone else, I have jumped on the Big Data bandwagon. I took a deep dive last week by attending a Big Data boot camp. Looking at it purely from an MDM point of view, I am learning a lot about Hadoop, Text Analytics, Streaming Data etc., and looking for ways to integrate these two technologies in a meaningful way.

More on this topic as I progress. If you are interested, join me in this endeavor by reaching out via the contact page, and do share your thoughts in the comments.