Blogging occasionally offers up the opportunity to open a good debate. So here goes! Over the last several years I have observed data models in many different BI systems across different vertical industries where so-called ‘generic’ fact tables have been designed with only one ‘generic’ measure. The objective of this design approach is that the measure column in the fact table is supposed to hold ANY metric. Often this ‘generic’ measure column is then accompanied by some kind of type field to indicate what the measure actually is (what it means) and some other attribute(s) to indicate the level(s) in the various dimension hierarchies that the stored measure is associated with. This helps indicate the additive nature of the metric. Also, if it is a monetary measure it may have a currency field, and if it is a unit measure it may have a field to explain the kind of units used, e.g. centimetres, litres, cubic metres etc. The stated advantage of these kinds of approaches is flexibility: adding new measures becomes easy to accommodate as no change to the design is necessary. It is a perfectly good argument and certainly appears widely practised by designers.
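To make the idea concrete, here is a minimal sketch of what such a ‘generic’ single-measure fact table forces on the query writer. The table, column and measure-type names are all hypothetical, and I am using plain Python rows rather than SQL purely for illustration:

```python
# A hypothetical "generic" fact table: one measure column for everything.
# Each row carries extra metadata (measure_type, additivity flag, currency
# or unit) just to say what the number actually means.
generic_fact = [
    {"date": "2022-09-01", "product": "P1", "measure_type": "sales_amount",
     "measure": 120.0, "additive": True,  "currency": "EUR"},
    {"date": "2022-09-01", "product": "P1", "measure_type": "stock_level",
     "measure": 35.0,  "additive": False, "unit": "units"},
    {"date": "2022-09-02", "product": "P1", "measure_type": "sales_amount",
     "measure": 80.0,  "additive": True,  "currency": "EUR"},
]

def total(rows, measure_type):
    """Even a simple total forces the query writer to know the right
    measure_type value and to check additivity before summing."""
    selected = [r for r in rows if r["measure_type"] == measure_type]
    if any(not r["additive"] for r in selected):
        raise ValueError(f"{measure_type} is not additive; cannot sum it")
    return sum(r["measure"] for r in selected)

print(total(generic_fact, "sales_amount"))  # 200.0
# total(generic_fact, "stock_level") would raise: a period-end stock
# level is a snapshot and must not be summed across dates.
```

Every query against this design has to repeat that filter-and-check dance, which is exactly the knowledge a casual business user does not have.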
When it comes to navigating such designs to develop queries (or even generate them), it is often the case that IT professionals developing reports for the business can figure out how to retrieve the information required (although even IT developers can struggle). However, when it comes to business users developing their own ad hoc queries and reports, I frequently see these users really struggling to navigate the ‘flexible’ design, first trying to figure out what the measures mean, whether they are additive, and so on. More often than not this results in real frustration among business users, who end up getting aggregations in reports wrong and then start to lose faith in their new BI system. Of course IT steps in to rescue the situation by building more snapshot tables, more materialised views etc., burying the generic ‘complexity’ to make the job easier for the user. These users also often resort to switching back to Excel, holding data outside any data mart so that they can look at it in a form they understand.
Have you seen this in your organisation? If so, I want your feedback. Is it the case that so-called ‘flexible’ design techniques are rope for end users to hang themselves with? My question is this: what is the best way you have seen to design fact tables so that business users become productive and can easily understand how to get at the data when building their own reports? I am not so sure that being so generic is of business value. Sure, it is flexible. But is it usable? What use is a flexible design if a business user cannot understand it and make use of all that valuable data? Is it not better to have multiple metric attributes in a fact table (if multiple metrics are needed), with each attribute name saying what the measure actually is? Let’s have your input!
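For contrast, here is the same kind of data sketched with explicitly named measure columns, which is the alternative I am asking about. Again, the column names are hypothetical and this is illustrative Python rather than SQL:

```python
# A hypothetical fact table with self-describing measure columns: the
# column name says what the number is (and in what currency), and
# additivity rules can be documented once per column, not per row.
explicit_fact = [
    {"date": "2022-09-01", "product": "P1",
     "sales_amount_eur": 120.0, "stock_level": 35},
    {"date": "2022-09-02", "product": "P1",
     "sales_amount_eur": 80.0,  "stock_level": 30},
]

# A business user's query now reads the way they think about the data:
# no type codes to look up and no additivity flags to interpret.
total_sales = sum(r["sales_amount_eur"] for r in explicit_fact)
print(total_sales)  # 200.0
```

The trade-off is the one stated above: adding a new metric means adding a column, so the schema changes more often, but each column is immediately understandable.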