Over the last year or so I have noticed a real surge in companies using or evaluating products to rapidly develop dashboards. In my consulting activities in this area, I have been amazed at how heavily users rely on one particular primary source of data in dashboard development. This is, of course, Excel data. While there is nothing unusual about Excel data, what I find concerning is that users almost ‘prefer’ to access Excel data because they are familiar with Excel. Many users seem either to already have these spreadsheets or to be downloading data into Excel from a range of data sources including operational systems, flat files (perhaps supplied by some other department or system), data marts and data warehouses. Once data is ‘in the wild’ like this, it takes on a life of its own, with people manipulating it and sending it to others as email attachments. It’s as if data management just got left behind.
While Excel can never be ignored in any organisation, the increasing demand to analyse Excel data raises the question of whether that data can be trusted, especially if it arrived as an email attachment. It brings back the issues that have plagued many companies for years when it comes to Excel. Do you know where the data in the spreadsheet came from? How do you know you have the right version of the spreadsheet? Are spreadsheets managed? Are there other server-side data sources that can be accessed from the dashboard tool that would give you more confidence in the data?
With Office Excel 2007 increasing the maximum number of rows in a worksheet from 65,536 to just over 1 million (1,048,576), my concern is that the increasing demand for dashboards will raise the likelihood of million-row “spreadmarts” being created all over the organisation by business users, rather than dashboards being pointed at server-side data in a BI system. Only time will tell. However, policy is clearly needed around spreadsheet management and dashboard development if we are to remain in control of the data and have confidence in it. I would be interested in hearing from those of you who are encountering this problem and what you are doing to manage it.