talk-data.com

Topic: Data Lake

Tags: big_data, data_storage, analytics (4 tagged)

Activity Trend: peak of 28 activities/quarter, 2020-Q1 to 2026-Q1

Activities (filtered by: Big Data LDN 2025)

As organizations increasingly adopt data lake architectures, analytics databases face significant integration challenges beyond simple data ingestion. This talk explores the complex technical hurdles encountered when building robust connections between analytics engines and modern data lake formats.

We'll examine critical implementation challenges, including the absence of native library support for formats like Delta Lake, which necessitates expansion into new programming languages such as Rust to achieve optimal performance. The session explores the complexities of managing stateful systems, addressing caching inconsistencies, and reconciling state across distributed environments.

A key focus will be on integrating with external catalogs while maintaining data consistency and performance - a challenge that requires careful architectural decisions around metadata management and query optimization. We'll explore how these technical constraints impact system design and the trade-offs involved in different implementation approaches.

Attendees will gain a practical understanding of the engineering complexity behind seamless data lake integration and actionable approaches to common implementation obstacles.
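One of the stateful-system problems the abstract above names is cache inconsistency: a reader must never be served metadata from a superseded table snapshot. A common remedy is to key the cache by snapshot version, so a newer version is always a cache miss. The sketch below is illustrative only, assuming a hypothetical `SnapshotCache` with version-keyed entries; it is not code from the talk.

```python
# Minimal sketch: version-keyed metadata cache. All names are
# illustrative assumptions, not from the talk.

class SnapshotCache:
    """Caches per-table metadata keyed by (table, snapshot_version).

    A reader that learns of a newer snapshot version simply misses
    the cache, so stale entries are never served for that version.
    """

    def __init__(self):
        self._entries = {}  # (table, version) -> metadata

    def get(self, table, version, load):
        key = (table, version)
        if key not in self._entries:
            # Cache miss: load metadata for this exact snapshot.
            self._entries[key] = load(table, version)
        return self._entries[key]

    def evict_older_than(self, table, version):
        # Reconciliation: drop entries for superseded snapshots.
        stale = [k for k in self._entries
                 if k[0] == table and k[1] < version]
        for k in stale:
            del self._entries[k]

# Two readers at different snapshot versions never share an entry.
cache = SnapshotCache()
meta_v1 = cache.get("sales", 1,
                    lambda t, v: {"files": [f"{t}-part-{v}.parquet"]})
meta_v2 = cache.get("sales", 2,
                    lambda t, v: {"files": [f"{t}-part-{v}.parquet"]})
cache.evict_older_than("sales", 2)
```

The design choice here, invalidation by key rather than by timeout, is what makes reconciliation across distributed readers tractable: correctness depends only on agreeing on the current snapshot version.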

How do you move data from thousands of SQL databases to a data lake with no impact on OLTP? We'll explore the challenges we faced while migrating legacy batch data flows to an event-based architecture. A key challenge for our data engineers was the multi-tenant architecture of our backend, meaning we had to handle the same SQL schema across over 15k databases. We'll present the journey, employing Debezium, Azure Event Hubs, Delta Live Tables, and the extra tooling we had to put in place.
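With the same SQL schema repeated across 15k tenant databases, a typical step in a pipeline like the one described above is collapsing per-database change events into one stream tagged with a tenant id. The sketch below is a hedged illustration, not the speakers' code: it assumes the standard Debezium envelope (`before`/`after`/`source`/`op`) and a hypothetical `tenant_<id>` database-naming convention.

```python
# Illustrative sketch: normalizing Debezium change events from many
# single-tenant databases sharing one schema into a single routable
# stream. The "tenant_<id>" naming convention is an assumption.

def normalize_event(event: dict) -> dict:
    """Flatten a Debezium envelope into one record routable by tenant."""
    payload = event["payload"]
    source = payload["source"]
    # Assumption: tenant databases are named "tenant_<id>".
    tenant_id = source["db"].removeprefix("tenant_")
    return {
        "tenant_id": tenant_id,
        "table": source["table"],
        "op": payload["op"],  # c=create, u=update, d=delete
        # Deletes carry only "before"; inserts/updates carry "after".
        "row": payload["after"] or payload["before"],
    }

# One change event as emitted for database "tenant_00042":
event = {
    "payload": {
        "before": None,
        "after": {"id": 7, "status": "shipped"},
        "source": {"db": "tenant_00042", "table": "orders"},
        "op": "c",
    }
}
record = normalize_event(event)
```

Tagging each record with its tenant at ingestion time lets all 15k sources share one downstream table per logical entity instead of 15k near-identical ones.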

In this session, Paul Wilkinson, Principal Solutions Architect at Redpanda, will demonstrate Redpanda's native Iceberg capability: a game-changing addition that bridges the gap between real-time streaming and analytical workloads, eliminating the complexity of traditional data lake architectures while maintaining the performance and simplicity that Redpanda is known for.

Paul will explore how this new capability enables organizations to seamlessly transition streaming data into analytical formats without complex ETL pipelines or additional infrastructure overhead. In a follow-along demo, you'll build your own streaming lakehouse and show it to your team!

So you’ve heard of Databricks, but you're still not sure what the fuss is all about. Yes, you’ve heard it’s Spark, but then there’s this Delta thing that’s both a data lake and a data warehouse (isn’t that what Iceberg is?). And then there's Unity Catalog, which isn't just a catalog: it also handles access management, and even does surprising things like optimising your data and offering programmatic access to lineage and billing. But then serverless came out, and now you don’t even have to learn Spark? And of course there’s a bunch of AI stuff to use or create yourself. So why not spend 30 minutes learning what Databricks actually does, and how it can turn you into a rockstar Data Engineer?