Over the last year or so I have noticed a real surge in companies using or evaluating products to rapidly develop dashboards. In my consulting work in this area, I have been amazed at how heavily users rely on one particular source of data for dashboard development: Excel. There is nothing unusual about Excel data in itself; what concerns me is that users almost ‘prefer’ to access Excel data because they are familiar with Excel. Many users either already have these spreadsheets or are downloading data into Excel from a range of sources, including operational systems, flat files (perhaps supplied by some other department or system), data marts and data warehouses. Once data is ‘in the wild’ like this, it takes on a life of its own, with people manipulating it and sending it to others as email attachments. It’s as if data management just got left behind.
While Excel can never be ignored in any organisation, the increasing demand to analyse Excel data raises the question of whether that data can be trusted, especially if it was sent to you by email. It brings back an issue that has plagued many companies for years when it comes to Excel. Do you know where the data in the spreadsheet came from? How do you know you have the right version of the spreadsheet? Are spreadsheets managed? Are there other server-side data sources that can be accessed from the dashboard tool that would give you more confidence in the data?
With Office Excel 2007 increasing the maximum number of rows in a worksheet from 65,536 to 1,048,576 (just over 1 million), my concern is that the increasing demand for dashboards will raise the likelihood of million-row “spreadmarts” being created all over the organisation by business users, rather than dashboards being pointed at server-side data in a BI system. Only time will tell. However, policy is clearly needed around spreadsheet management and dashboard development if we are to remain in control of the data and have confidence in it. I would be interested in hearing from those of you who are encountering this problem and what you are doing to manage it.