In response to the question from David Jackson (great question, by the way) on federated MDM, I have several thoughts on this kind of approach. The first question is how these non-overlapping views are managed at different levels. In other words, are they virtual views rendered on demand, for example by an EII tool, or are they straightforward relational DBMS views on a persistent master data store (sometimes referred to as a hub) that has been established in the enterprise? The second question is whether the views use the same data names as the underlying master data, i.e. the common data names that would be associated with master data.
Looking at the first case, if these non-overlapping MDM views are virtual and created ‘on-the-fly’ from disparate data sources by federated query EII tools, then the EII technology would need to support global IDs and be capable of mapping the disparate IDs associated with the master data in disparate systems to these global IDs. I am somewhat sceptical of this approach, mainly because of the limitations that some EII products place on enterprises. Staying with an EII federated query approach, the next question is what happens if you want to update these non-overlapping virtual views of disparate master data that have been rendered by EII. In that case, the EII product has to support heterogeneous distributed transaction processing across DBMS and non-DBMS sources. This is supported by some EII tools but certainly not all of them. EII is still primarily used in a read-only capacity. I would be very interested in any experiences of companies using EII on its own to manage master data. Please share with us what you are doing out there! Given this, the registry approach MDM products (e.g. Purisma) may be a more robust way of dynamically assembling data from disparate systems on demand, but again the question is how the data is maintained, i.e. what is the system of entry (SOE)? Is the MDM system the SOE as well as a system of record, or are the line of business operational systems still SOEs?
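To make the global ID point a little more concrete, here is a minimal sketch in Python of the kind of cross-reference that a federated or registry-style approach would need in order to map local IDs to a global ID. The class name, the system names ("crm", "billing") and the ID values are all hypothetical and chosen purely for illustration; this is not how any particular EII or MDM product implements it.

```python
from typing import Dict, List, Optional, Tuple


class MasterIdRegistry:
    """Hypothetical cross-reference between local system IDs and a global master ID."""

    def __init__(self) -> None:
        # (system, local_id) -> global_id
        self._global_by_local: Dict[Tuple[str, str], str] = {}
        # global_id -> list of (system, local_id)
        self._locals_by_global: Dict[str, List[Tuple[str, str]]] = {}

    def link(self, global_id: str, system: str, local_id: str) -> None:
        """Record that a local ID in a source system refers to the given global ID."""
        key = (system, local_id)
        existing = self._global_by_local.get(key)
        if existing is not None and existing != global_id:
            # A local ID must not point at two different master records.
            raise ValueError(f"{key} is already linked to {existing}")
        if existing is None:
            self._global_by_local[key] = global_id
            self._locals_by_global.setdefault(global_id, []).append(key)

    def resolve(self, system: str, local_id: str) -> Optional[str]:
        """Return the global ID for a local ID, or None if it is not registered."""
        return self._global_by_local.get((system, local_id))

    def sources_for(self, global_id: str) -> List[Tuple[str, str]]:
        """List every (system, local_id) pair that must be queried to assemble a view."""
        return list(self._locals_by_global.get(global_id, []))


# Illustrative usage with made-up identifiers.
registry = MasterIdRegistry()
registry.link("CUST-0001", "crm", "C-17")
registry.link("CUST-0001", "billing", "88213")
print(registry.resolve("billing", "88213"))
print(registry.sources_for("CUST-0001"))
```

Note that the sketch only covers the read side. It says nothing about which system is the SOE or how an update would be propagated back to the sources, which is exactly where the harder questions lie.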
If the master data is already integrated and persisted in an MDM data hub, then providing views of it is potentially achievable via straightforward relational views. Again, however, the question of update comes to mind, this time in the form of view updateability. This is a very well documented topic that stretches way back to the ’80s, with writings from leading relational authorities such as Dr E.F. Codd and Chris Date. So the processes around maintaining non-overlapping views, which system or systems remain master data SOEs, and the technical approach taken to implement this all need to be considered.
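As a small illustration of the view updateability issue, the following Python sketch uses the standard library sqlite3 module. The table, view and column names are hypothetical. SQLite is used only because it is self-contained, and it happens to treat all views as read-only unless an INSTEAD OF trigger is defined. Other DBMSs apply their own rules and many allow updates through simple single-table views, so treat this as one example of the problem rather than a general statement.

```python
import sqlite3


def build_hub() -> sqlite3.Connection:
    """Create a tiny in-memory 'hub' with one master table and one regional view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer_master (
            customer_id   INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL,
            region        TEXT NOT NULL
        );
        INSERT INTO customer_master VALUES (1, 'Acme Ltd', 'EMEA');
        INSERT INTO customer_master VALUES (2, 'Globex Inc', 'AMER');
        CREATE VIEW emea_customer AS
            SELECT customer_id, customer_name
            FROM customer_master
            WHERE region = 'EMEA';
    """)
    return conn


def try_update_view(conn: sqlite3.Connection, customer_id: int, new_name: str) -> bool:
    """Attempt an update through the view; return False if the DBMS rejects it."""
    try:
        conn.execute(
            "UPDATE emea_customer SET customer_name = ? WHERE customer_id = ?",
            (new_name, customer_id),
        )
        return True
    except sqlite3.OperationalError:
        # SQLite refuses to modify a view that has no INSTEAD OF trigger.
        return False


def make_view_updatable(conn: sqlite3.Connection) -> None:
    """Route updates on the view to the underlying master table."""
    conn.executescript("""
        CREATE TRIGGER emea_customer_update
        INSTEAD OF UPDATE ON emea_customer
        BEGIN
            UPDATE customer_master
            SET customer_name = NEW.customer_name
            WHERE customer_id = OLD.customer_id AND region = 'EMEA';
        END;
    """)


conn = build_hub()
print(try_update_view(conn, 1, "Acme Limited"))  # rejected: plain view
make_view_updatable(conn)
print(try_update_view(conn, 1, "Acme Limited"))  # accepted via the trigger
```

The trigger makes an explicit decision about what an update through the view means for the master table. That decision is really a process question (who is allowed to change master data, and where), not just a technical one.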
For me the bigger question is data names in these MDM views. You could argue that views (virtual or otherwise) allow you to render master data using different data names in every view. To be fair, David’s question stated non-overlapping views. In my opinion the data names in these views should remain the common enterprise-wide data names and data definitions associated with the master data. After all, this is MASTER data, which should retain common data names and definitions if at all possible. I accept that when subsets of master data are pushed out to disparate applications, the subset of data, once consumed by the receiving application, may end up being described using application-specific data definitions (because that is how data in the application-specific data model is defined). But if we are to create views of master data at different levels of the enterprise, in my opinion we should insist on common enterprise-wide data names in these non-overlapping MDM views to uphold consistency and common understanding. Any portals that present this data, or new applications and processes that consume it, should if at all possible retain those common definitions. Again, in my opinion, common understanding and enterprise-wide data definitions (i.e. master metadata) are king.
In fact, I would argue that without common data definitions (sometimes referred to as a shared business vocabulary) a federated MDM approach using non-overlapping views would fail, because it is the common metadata definitions that hold the whole thing together. This brings up another point, which is that master data should be marked up using common data names wherever it goes, and that metadata management is just as fundamental to the success of any MDM strategy as the data content itself. A shared business vocabulary (SBV), and an understanding of the mappings between the SBV common definitions for master data and the disparate definitions for it in disparate systems, is absolutely key.
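As a rough sketch of what those mappings might look like in practice, the Python below translates records from two source systems into shared business vocabulary names. The system names, field names and values are all hypothetical; a real deployment would hold these mappings in a metadata repository, not in code.

```python
from typing import Any, Dict

# Hypothetical mappings from system-specific field names to SBV common names.
SBV_MAPPINGS: Dict[str, Dict[str, str]] = {
    "crm": {"cust_no": "customer_id", "cust_nm": "customer_name"},
    "billing": {"ACCT_ID": "customer_id", "ACCT_HOLDER": "customer_name"},
}


def to_shared_vocabulary(system: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename the fields of a source record to their common SBV names."""
    mapping = SBV_MAPPINGS[system]
    unmapped = sorted(set(record) - set(mapping))
    if unmapped:
        # Refuse to pass through fields that have no agreed common definition.
        raise KeyError(f"No SBV mapping for {system} fields: {unmapped}")
    return {mapping[name]: value for name, value in record.items()}


# Two differently named source records end up with the same common data names.
print(to_shared_vocabulary("crm", {"cust_no": "C-17", "cust_nm": "Acme Ltd"}))
print(to_shared_vocabulary("billing", {"ACCT_ID": "88213", "ACCT_HOLDER": "Acme Ltd"}))
```

Raising an error on an unmapped field is a deliberate choice in this sketch: data that has no common definition should be flagged, not silently carried along under its local name.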
Let me know what you think and thank you David for a truly excellent question.