In this session, we’ll share our transformation journey from a traditional, centralised data warehouse to a modern data lakehouse architecture, powered by data mesh principles. We’ll explore the challenges we faced with legacy systems, the strategic decisions that led us to adopt a lakehouse model, and how data mesh enabled us to decentralise ownership, improve scalability, and enhance data governance.
talk-data.com
Topic: Data Lakehouse (8 tagged)
Large Language Models (LLMs) are transformative, but static knowledge and hallucinations limit their direct enterprise use. Retrieval-Augmented Generation (RAG) is the standard solution, yet moving from prototype to production is fraught with challenges in data quality, scalability, and evaluation.
This talk argues the future of intelligent retrieval lies not in better models, but in a unified, data-first platform. We'll demonstrate how the Databricks Data Intelligence Platform, built on a Lakehouse architecture with integrated tools like Mosaic AI Vector Search, provides the foundation for production-grade RAG.
Looking ahead, we'll explore the evolution beyond standard RAG to advanced architectures like GraphRAG, which enable deeper reasoning within Compound AI Systems. Finally, we'll show how the end-to-end Mosaic AI Agent Framework provides the tools to build, govern, and evaluate the intelligent agents of the future, capable of reasoning across the entire enterprise.
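The retrieve-then-generate pattern the talk builds on can be sketched in a few lines. This is a toy illustration only: the word-overlap scorer and the prompt-building "generation" step stand in for the embeddings, vector index (such as a managed vector search service), and LLM a production pipeline would use.

```python
# Minimal sketch of the RAG pattern: retrieve relevant context, then ground
# the generation step in it. Scoring and generation here are stand-ins.

def score(query: str, doc: str) -> int:
    # Toy relevance: count overlapping words (real systems use embeddings).
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank the corpus by relevance and keep the top-k passages.
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def answer(query: str, corpus: list[str]) -> str:
    # A real pipeline would send this grounded prompt to an LLM;
    # we simply return the prompt to show the structure.
    context = retrieve(query, corpus)
    return "Context:\n" + "\n".join(context) + f"\nQuestion: {query}"

corpus = [
    "The lakehouse stores governed tables in open formats.",
    "Vector search retrieves documents by semantic similarity.",
    "Quarterly revenue grew in the EMEA region.",
]
print(answer("How does vector search retrieve documents?", corpus))
```

The production challenges the abstract names (data quality, scalability, evaluation) live precisely in the parts this sketch fakes: the index, the chunking, and measuring whether the retrieved context actually supports the answer.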
Discover how to build a powerful AI Lakehouse and unified data fabric natively on Google Cloud. Leverage BigQuery's serverless scale and robust analytics capabilities as the core, seamlessly integrating open data formats with Apache Iceberg and efficient processing using managed Spark environments like Dataproc. Explore the essential components of this modern data environment, including data architecture best practices, robust integration strategies, high data quality assurance, and efficient metadata management with Google Cloud Data Catalog. Learn how Google Cloud's comprehensive ecosystem accelerates advanced analytics, preparing your data for sophisticated machine learning initiatives and enabling direct connection to services like Vertex AI.
Join Amperity’s Principal Product Architect to learn more about innovations in the Lakehouse space and how companies are building efficient and durable architectures. This session will include a deep dive into building a composable data ecosystem centered around a lakehouse, followed by a review of real-world applications of these concepts through Amperity Bridge, an exciting new technology for plugging software solutions into a lakehouse.
The modern enterprise is increasingly defined by the need for open, governed, and intelligent data access. This session explores how Apache Iceberg, Dremio, and the Model Context Protocol (MCP) come together to enable the Agentic Lakehouse: a data platform that is interoperable, high-performing, and AI-ready.
We’ll begin with Apache Iceberg, which provides the foundation for data interoperability across teams and organisations, ensuring shared datasets can be reliably accessed and evolved. From there, we’ll highlight how Dremio extends Iceberg with turnkey governance, management, and performance acceleration, unifying your lakehouse with databases and warehouses under one platform. Finally, we’ll introduce MCP and showcase how innovations like the Dremio MCP server enable natural-language analytics on your data.
With the power of Dremio’s built-in semantic layer, AI agents and humans alike can ask complex business questions in plain language and receive accurate, governed answers.
Join us to learn how to unlock the next generation of data interaction with the Agentic Lakehouse.
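The core idea behind a semantic layer, as described above, can be sketched simply: business terms map to governed metric definitions, so an agent (for example, one connected through an MCP server) translates a plain-language question into a vetted query instead of guessing at raw schemas. Everything below is hypothetical and illustrative, not Dremio's actual implementation.

```python
# Toy semantic layer: governed business terms map to vetted SQL expressions.
# All metric names, expressions, and the table name are hypothetical.

SEMANTIC_LAYER = {
    "revenue": "SUM(order_total)",
    "active customers": "COUNT(DISTINCT customer_id)",
}

def to_sql(question: str, table: str = "sales.orders") -> str:
    # Match the question against governed metrics; refuse anything ungoverned.
    for term, expr in SEMANTIC_LAYER.items():
        if term in question.lower():
            return f"SELECT {expr} FROM {table}"
    raise ValueError("No governed metric matches this question")

print(to_sql("What was our revenue last quarter?"))
```

In a real platform the hard parts are what this sketch omits: time filters, joins across the lakehouse, and access policies applied before any answer is returned.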
Unlock the true potential of your data with the Qlik Open Lakehouse, a revolutionary approach to Iceberg integration designed for the enterprise. Many organizations face the pain of managing multiple, costly data platforms and struggle to achieve low-latency ingestion. While Apache Iceberg offers robust features like ACID transactions and schema evolution, achieving optimal performance isn't automatic; it requires sophisticated maintenance. Introducing the Qlik Open Lakehouse, a fully managed and optimized solution built on Apache Iceberg, powered by Qlik's Adaptive Iceberg Optimizer. Discover how you can do data differently and achieve 10x faster queries, a 33-42% reduction in file API overhead, and ultimately, a 50% reduction in costs through streamlined operations and compute savings.
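The maintenance problem the abstract alludes to is easy to illustrate: many small data files inflate per-file API calls and query-planning cost, and compaction rewrites them into fewer, larger files. The greedy bin-packing and the 128 MB target below are illustrative only, not how Qlik's Adaptive Iceberg Optimizer actually works.

```python
# Toy small-file compaction: greedily pack small files into ~target-size
# output files, showing why compaction shrinks per-file API overhead.

TARGET_FILE_BYTES = 128 * 1024 * 1024  # illustrative target file size

def compact(file_sizes: list[int]) -> list[int]:
    """Return the sizes of the output files after greedy bin-packing."""
    out, current = [], 0
    for size in sorted(file_sizes):
        if current and current + size > TARGET_FILE_BYTES:
            out.append(current)   # close the current output file
            current = 0
        current += size
    if current:
        out.append(current)
    return out

small_files = [4 * 1024 * 1024] * 100   # 100 files of 4 MB each
compacted = compact(small_files)
print(f"{len(small_files)} files -> {len(compacted)} files")  # 100 -> 4
```

Fewer files means fewer object-store requests and fewer manifest entries to scan at planning time, which is where claims like reduced file API overhead come from.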
In this session, Paul Wilkinson, Principal Solutions Architect at Redpanda, will demonstrate Redpanda's native Iceberg capability: a game-changing addition that bridges the gap between real-time streaming and analytical workloads, eliminating the complexity of traditional data lake architectures while maintaining the performance and simplicity that Redpanda is known for.
In a follow-along demo, Paul will explore how this new capability enables organizations to seamlessly move streaming data into analytical formats without complex ETL pipelines or additional infrastructure overhead, so you can build your own streaming lakehouse and show it to your team!
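The streaming-to-table idea described above can be sketched as a sink that buffers incoming events and periodically commits them as files to a table, with the broker rather than an external ETL job doing the flushing. This toy version batches in memory; the class, names, and batch size are illustrative, not Redpanda's implementation.

```python
# Toy streaming-to-table sink: buffer events, flush them in batches that
# stand in for columnar data files committed to an Iceberg table.

BATCH_SIZE = 3  # illustrative; real systems flush by size and/or time

class StreamingTableSink:
    def __init__(self) -> None:
        self.buffer: list[dict] = []
        self.committed_files: list[list[dict]] = []  # stand-in for commits

    def append(self, event: dict) -> None:
        self.buffer.append(event)
        if len(self.buffer) >= BATCH_SIZE:
            self.flush()

    def flush(self) -> None:
        # Commit the buffered events as one "data file" and reset the buffer.
        if self.buffer:
            self.committed_files.append(list(self.buffer))
            self.buffer.clear()

sink = StreamingTableSink()
for i in range(7):
    sink.append({"event_id": i})
sink.flush()                      # commit the final partial batch
print(len(sink.committed_files))  # 3 files holding 3, 3, and 1 events
```

The interesting engineering is in what this omits: exactly-once commits, schema mapping from stream records to table columns, and doing all of it inside the broker.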