Building an Enterprise Data Lake

Posted by Amanda Dascalakis on October 14, 2016

2–3 May 2017 – Rome

31 May – 1 June 2017 – The Netherlands

Most organisations today are dealing with multiple silos of information. These include cloud-based and on-premises transaction processing systems, multiple data warehouses, data marts, reference data management (RDM) systems, master data management (MDM) systems, enterprise content management (ECM) systems and, more recently, Big Data platforms such as Hadoop and NoSQL databases. In addition, the number of data sources is increasing dramatically, especially from outside the enterprise. Given this situation, it is not surprising that many companies have ended up managing information in silos, with different tools being used to prepare and manage data across these systems with varying degrees of governance. Furthermore, it is no longer only IT that is integrating data: business users are also getting involved with new self-service data wrangling tools. The question is, is this the only way to manage data? Is there another level we can reach that allows us to more easily manage and govern data across an increasingly complex data landscape?

This 2-day seminar looks at the business problems caused by poorly managed information and at the requirements for defining, governing, managing and sharing trusted, high-quality information in a hybrid computing environment. It also explores a new approach in which IT data architects, business users and IT developers collaborate to build and manage an enterprise data lake and so get control of their data. This includes introducing a data refinery and an information catalog to produce and publish enterprise data services for consumption across the company, as well as introducing distributed execution and governance across multiple data stores. It emphasises the need for a common collaborative process and common data services to govern and manage data.
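To make the information catalog idea more concrete, here is a minimal sketch of how refined datasets might be registered and then discovered as data services via shared business glossary terms. It is purely illustrative and not taken from the seminar materials: the `InformationCatalog` and `DatasetEntry` names, fields and the storage path are all hypothetical assumptions, not any specific product's API.

```python
# A minimal, hypothetical sketch of an information catalog: datasets
# produced by a data refinery are registered with terms from a shared
# business glossary, then looked up by consumers as data services.
# All class and field names here are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    name: str               # business-friendly dataset name
    location: str           # where the refined data lives (cloud or on-premises)
    owner: str              # accountable data owner, for governance
    glossary_terms: list = field(default_factory=list)  # shared vocabulary


class InformationCatalog:
    """Registry that lets producers publish and consumers find trusted data."""

    def __init__(self):
        self._entries = {}

    def publish(self, entry: DatasetEntry):
        self._entries[entry.name] = entry

    def find_by_term(self, term: str):
        return [e for e in self._entries.values() if term in e.glossary_terms]


catalog = InformationCatalog()
catalog.publish(DatasetEntry(
    name="customer_master",
    location="s3://lake/refined/customer_master",  # illustrative path
    owner="MDM team",
    glossary_terms=["customer", "master data"],
))

# A business analyst discovers trusted data via the shared glossary.
for entry in catalog.find_by_term("customer"):
    print(entry.name, "->", entry.location)
```

The point of the sketch is the separation of concerns the seminar describes: producers publish refined, governed datasets once, and consumers anywhere in the company discover them through common business terms rather than by hunting through individual silos.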

AUDIENCE

This seminar is intended for business data analysts doing self-service data integration, data architects, chief data officers, master data management professionals, content management professionals, database administrators, big data professionals, data integration developers, and compliance managers who are responsible for data management. This includes metadata management, data integration, data quality, master data management and enterprise content management. The seminar is not only for ‘Fortune 500 scale’ companies but for any organisation that has to deal with Big Data, multiple data stores and multiple data sources. It assumes a basic understanding of data management principles as well as a high-level understanding of concepts such as data migration, data replication, metadata, data warehousing, data modelling and data cleansing.

LEARNING OBJECTIVES

Attendees will learn:

  • How to define a strategy for producing trusted data services in a distributed environment of multiple data stores and data sources
  • How to organise data in a distributed data environment to overcome complexity and chaos
  • How to design, build, manage and operate a distributed (or centralised) data lake within their organisation
  • The importance of an information catalog for delivering data-as-a-service
  • How data standardisation and business glossaries can help define data so that it is consistently understood
  • An operating model for effective distributed information governance
  • Which technologies and implementation methodologies they need to get their data under control
  • How to apply methodologies to bring master and reference data, big data, data warehouse data and unstructured data under control, whether on-premises or in the cloud

Request information on running this seminar as an onsite course

