This session looks at the ever-increasing demand for data and AI, the current challenges slowing development, and how companies can overcome these challenges and shorten time to value using generative AI and open table formats like Apache Iceberg. It also looks at how this approach makes it possible to transition away from siloed analytical systems to a modern data architecture in which multiple teams can create reusable data products across multiple clouds and on-premises environments using generative AI in Data Fabric, and share that data across multiple analytical workloads.
Join us for an in-depth exploration of Apache Iceberg and Apache Polaris (incubating), where we delve into the past, present, and future of these transformative technologies. This session will provide a comprehensive overview of Iceberg's journey, its current role within the data ecosystem, and the promising future it holds with the integration of Polaris (incubating). We will discuss how these technologies redefine table formats and catalog management, empowering organisations to efficiently manage and analyse large-scale data. Attendees will gain valuable insights into the evolving landscape, ensuring they remain at the forefront of innovation and continue to shape thought leadership in the data ecosystem.
Open table formats such as Apache Iceberg, Delta Lake, and Apache Hudi have dramatically transformed the data management landscape by enabling high-speed operations on massive datasets stored in object stores while maintaining ACID guarantees.
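For concreteness, here is a minimal sketch (not taken from the session) of the table-level versioning that Iceberg provides, using Spark SQL from Python. It assumes a Spark session already configured with an Iceberg catalog named `demo`; the table name and snapshot id are illustrative only.

```python
# Minimal sketch: table-level versioning with Apache Iceberg via Spark SQL.
# Assumes a Spark session configured with an Iceberg catalog named "demo".
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Every commit to an Iceberg table produces an immutable snapshot.
spark.sql("CREATE TABLE IF NOT EXISTS demo.db.events (id BIGINT, payload STRING) USING iceberg")
spark.sql("INSERT INTO demo.db.events VALUES (1, 'a'), (2, 'b')")

# Inspect the table's snapshot history via the built-in metadata table.
spark.sql("SELECT snapshot_id, committed_at, operation FROM demo.db.events.snapshots").show()

# Time-travel query against an earlier snapshot (snapshot id is illustrative).
spark.sql("SELECT * FROM demo.db.events VERSION AS OF 8781234567890123456").show()

# Roll a single table back to a previous snapshot with a stored procedure.
spark.sql("CALL demo.system.rollback_to_snapshot('db.events', 8781234567890123456)")
```

Note that each of these operations is scoped to one table at a time, which is exactly the gap the rest of this abstract explores.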
In this talk, we will explore the evolution and future of dataset versioning in the context of open table formats. Open table formats introduced the concept of table-level versioning and have become widely adopted standards. Data versioning systems that have emerged more recently, bringing best practices from software engineering into the data ecosystem, enable the management of multiple datasets within a large-scale data repository using Git-like semantics. Data versioning systems operate at the file level and are compatible with any open table format. On top of this, new catalogs that support these table formats and add a layer of access control are becoming the standard way to manage tabular datasets.
Despite these advancements, there remains a significant gap between current data versioning practices and the requirements for effective tabular dataset versioning.
The session will introduce the concept of a versioned catalog as a solution, demonstrating how it provides comprehensive data and metadata versioning for tables.
We’ll cover key requirements of tabular dataset management, including (see the sketch after this list):
- Capturing multi-table changes as single logical operations
- Enabling seamless rollbacks without identifying each affected table
- Implementing table format-aware versioning operations such as diff and merge
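To make these requirements concrete, the following is a purely hypothetical sketch: the `versioned_catalog` package and its API below do not refer to any existing library, and every name is invented here for illustration under the assumption of a Git-like, format-aware catalog.

```python
# Hypothetical sketch only: "versioned_catalog" is an invented package used to
# illustrate the requirements above, not an existing library or API.
from versioned_catalog import Catalog  # hypothetical import

catalog = Catalog("https://catalog.example.com", repo="analytics")

# Capture changes to several tables as a single logical operation: write on a
# branch, then commit everything atomically.
with catalog.branch("fix-late-events") as branch:
    branch.table("sales.orders").upsert({"order_id": 42, "status": "shipped"})
    branch.table("sales.order_items").upsert({"order_id": 42, "sku": "A-1", "qty": 3})
    branch.commit(message="Reprocess late-arriving events across two tables")

# Table-format-aware diff: compare data and metadata between two refs.
for change in catalog.diff("main", "fix-late-events"):
    print(change.table, change.kind)  # e.g. schema change vs. data change

# Merging and rolling back operate on the whole commit, so a rollback reverts
# every affected table without the caller having to enumerate them.
catalog.merge("fix-late-events", into="main")
catalog.rollback("main", to_commit="c0ffee")
```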
Join us to explore the future of dataset versioning in the era of open table formats and evolving data management practices!
In the next five years, we are poised to witness a significant transformation towards modern data lake architecture across industries. This shift is driven by an urgent need for a unified, flexible, and scalable data management solution. Such a solution must address the challenges of siloed data environments and the increasing complexity of data sources while balancing the benefits of data mesh principles with centralized governance and semantic consistency.
In this talk, we will cover the latest trends and benefits in this field, as well as the use of open formats like Iceberg, lower data movement costs, and multiple engines supporting different workloads, all of which ultimately help establish a single source of truth.