As you probably know, Informatica announced Informatica 9 yesterday in a blaze of publicity with, I am led to believe, over 10,000 people registered to view the announcement. So I thought I would make a few comments on what was announced.
The three main strands of the announcement were:
- Relevant data through business-IT collaboration
- Trustworthy data through pervasive data quality
- Timely data through open SOA-based services
Relevant data through business-IT collaboration includes new browser-based tools that let analysts specify their business requirements directly, automatic generation of implementation details from those business specifications, and a common metadata repository that lets business analysts and IT developers collaborate and share specification and implementation artefacts.
Pervasive data quality allows data quality rules to be specified once and reused repeatedly, ensuring consistency across applications. In addition, role-based tools are offered to allow stakeholders to take ownership of their own data quality requirements. Data quality scorecards, simple analyst tools and productive developer tools are also available to empower business users, business analysts, data stewards and IT developers to be directly involved in measuring and improving data quality.
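To make the "specify once, reuse repeatedly" idea concrete, here is a minimal Python sketch. It is not Informatica's API or how Informatica 9 implements it; the rule names, record fields and the scorecard function are all hypothetical and only illustrate the principle of one shared rule set scoring data from different applications.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A named data quality rule (hypothetical, for illustration only)."""
    name: str
    check: Callable[[Record], bool]


def _is_two_letter_code(value: object) -> bool:
    return isinstance(value, str) and len(value) == 2


# The rules are defined once, in one place...
RULES: List[Rule] = [
    Rule("email_present", lambda r: bool(r.get("email"))),
    Rule("country_is_two_letter_code", lambda r: _is_two_letter_code(r.get("country"))),
]


def scorecard(records: List[Record], rules: List[Rule] = RULES) -> Dict[str, float]:
    """Return the fraction of records that pass each rule."""
    if not records:
        return {rule.name: 0.0 for rule in rules}
    return {
        rule.name: sum(1 for r in records if rule.check(r)) / len(records)
        for rule in rules
    }


# ...and reused against data from two different (made-up) applications.
crm = [
    {"email": "a@example.com", "country": "SE"},
    {"email": "", "country": "Sweden"},
]
billing = [
    {"email": "b@example.com", "country": "GB"},
]

crm_scores = scorecard(crm)
billing_scores = scorecard(billing)
print(crm_scores)
print(billing_scores)
```

Because both applications are scored against the same rule definitions, their scorecards are directly comparable, which is the consistency benefit described above.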
SOA data services includes support for:
- Information catalog services to enable users to discover relevant data be it on-premise or in the internet cloud
- Logical data objects
- Multi-modal data provisioning services to deliver data in multiple formats using various protocols such as web services and SQL
- Policy-based data services governance
In my opinion, the differentiators include policy-based data services governance (which is unique) and the Business Analyst tools and collaboration support. The web-based Business Analyst tools make a very compelling story, although I would have liked to see more integration with Microsoft and IBM Lotus collaborative tools and workspaces.
Data federation and consolidation on the same platform, off the same metadata and with auto-generation, is a very strong capability. IBM has the same function, but auto-generation in their case comes out of two separate tools: InfoSphere Data Architect generates data federation logical objects and mappings, while InfoSphere FastTrack generates ETL jobs for DataStage. Both IBM tools do use common metadata, however. I would have liked to see Informatica go the extra mile and auto-generate XSLTs for XML message translation by ESB and message broker products. I don’t see this support, but equally I don’t see it anywhere else as yet. In addition, I would have liked to see MapReduce functionality in the announcement to handle Big Data integration. No doubt this is coming.
With respect to data services, I don’t see the ability to publish data services to an Enterprise Service Repository (ESR) so that they can be managed centrally, in a common place, alongside all other types of service, although UDDI support was announced. Some competitors can publish services to ESRs, e.g. IBM with InfoSphere Information Services Director. Informatica’s approach to Cloud data integration also appears seamless, but more information is needed. I understand a new announcement is coming soon, although they have already announced support for running PowerCenter on Amazon’s EC2 cloud. In terms of competition, Microsoft can already run SSIS on the SQL Azure cloud to integrate cloud data. In addition, IBM has multi-modal support on InfoSphere Information Server that goes beyond SQL and web services: they support Java RMI and REST as well as SOAP, SQL and XQuery.
I would also have liked to see Informatica stick their neck out and acquire a data modelling tool rather than just integrate with everyone else’s products. However, overall, this is a strong announcement with another Cloud announcement to come. There is no doubt that integrated data management platforms are here now, with Informatica and IBM leading the way with Eclipse-based tool suites. SAP BusinessObjects and SAS DataFlux are clearly not far behind. Expect more from Oracle and Microsoft in 2010.
Looking at the trend here, it is clear that companies need to look seriously at moving from separate data management tools from many different suppliers, each with its own metadata, to single platforms with integrated shared metadata.