talk-data.com

Topic: Apache Iceberg

Activities: 1 tagged

Activity Trend: peak of 1 per quarter, 2020-Q1 to 2026-Q2

Top Events

Open Source Data Deep Dive - Santa Clara, CA - 9/18/24 2
IN PERSON: Apache Kafka® x Apache Iceberg™ Meetup 2
ClickHouse Meetup Zurich 1
Come in, it's cold outside 1
REPLAY - Construire la plateforme DATA idéale - meetup OVHcloud du 3 Avril 1
Apache Iceberg Paris Community Meetup #2 1
Autumn Leaves, Data Stays 1
Git for Data: How Table Formats Unify Software and Data Development 1
Apache Iceberg x Apache Kafka x Grafana 1
IN-PERSON! Apache Kafka® Meetup Septembre 1
IN PERSON: Tooling for running Apache Kafka in Production 1
Git for Data: How Table Formats Unify Software and Data Development 1

Top Speakers

JB Onofré (Dremio) 2
Victor Coustenoble (Starburst) 2
Alex Merced (Dremio) 1
Will Martin (Dremio) 1
Michal Gancarski (GROPYUS) 1
Olena Kutsenko (Confluent) 1
Yingjun Wu (RisingWave Labs) 1
Josh Lee (Altinity) 1
Brad Miro (Google) 1
Maciej Bak (Altinity) 1
Weimo Liu (PuppyGraph) 1
Viktor Gamov (Confluent) 1

Activities


Git for Data

2025-10-15 · Git for Data: How Table Formats Unify Software and Data Development

talk

Git

Distributed version control systems such as Git unlock software development in multi-player mode: developers can safely work on the same code base, with standard (albeit perhaps not user-friendly!) abstractions for snapshotting, time-travel, and branching. Data folks have rarely been so lucky, as their projects crucially depend on data, whose life-cycle management is often cumbersome and custom. In this talk, we present open table formats such as Apache Iceberg to practitioners with limited exposure to modern cloud infrastructure. In particular, we show how moving from datasets to tables unlocks a similar multi-player mode when building data pipelines, with equivalent abstractions for snapshotting, time-travel, and branching, plus a unified backbone for pipelines, data science, and AI use cases.
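
To make the Git analogy in the abstract concrete, here is a minimal in-memory sketch of a table with snapshots, time-travel, and branches. It is a toy model under stated assumptions, not the Apache Iceberg API and not the speakers' implementation: `ToyTable`, `Snapshot`, and every method name are hypothetical, and each snapshot stores a full copy of the rows, whereas a real table format tracks metadata and data files instead.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

Row = Tuple[str, int]


@dataclass(frozen=True)
class Snapshot:
    """Immutable table state; parent_id links snapshots like Git commits."""
    snapshot_id: int
    parent_id: Optional[int]
    rows: Tuple[Row, ...]


class ToyTable:
    """Hypothetical stand-in for a versioned table (NOT the Iceberg API)."""

    def __init__(self) -> None:
        self.snapshots: Dict[int, Snapshot] = {}
        # Branch name -> head snapshot id (None means the branch is empty).
        self.refs: Dict[str, Optional[int]] = {"main": None}
        self._next_id = 1

    def append(self, rows: Iterable[Row], branch: str = "main") -> int:
        """Commit new rows on a branch by writing a new snapshot."""
        parent_id = self.refs[branch]
        existing = self.snapshots[parent_id].rows if parent_id is not None else ()
        snap = Snapshot(self._next_id, parent_id, existing + tuple(rows))
        self.snapshots[snap.snapshot_id] = snap
        self.refs[branch] = snap.snapshot_id
        self._next_id += 1
        return snap.snapshot_id

    def create_branch(self, name: str, from_branch: str = "main") -> None:
        """A branch is just a new named pointer to an existing snapshot."""
        self.refs[name] = self.refs[from_branch]

    def scan(self, branch: str = "main", snapshot_id: Optional[int] = None) -> List[Row]:
        """Read a branch head, or time-travel to an explicit snapshot id."""
        target = snapshot_id if snapshot_id is not None else self.refs[branch]
        return list(self.snapshots[target].rows) if target is not None else []

    def fast_forward(self, target: str, source: str) -> None:
        """Move target to source's head, only if target is an ancestor of it."""
        cursor = self.refs[source]
        while cursor is not None and cursor != self.refs[target]:
            cursor = self.snapshots[cursor].parent_id
        if cursor != self.refs[target]:
            raise ValueError("branches have diverged; cannot fast-forward")
        self.refs[target] = self.refs[source]


table = ToyTable()
first = table.append([("alice", 1)])          # snapshot on main
table.create_branch("dev")                     # isolate work in progress
table.append([("bob", 2)], branch="dev")       # main is unaffected
main_before = table.scan()
dev_rows = table.scan(branch="dev")
table.fast_forward("main", "dev")              # publish the dev changes
main_after = table.scan()
as_of_first = table.scan(snapshot_id=first)    # time-travel read
```

In this sketch, writers on `dev` never disturb readers on `main`, and any past state stays readable by snapshot id, which is the "multi-player mode" the abstract describes for data pipelines.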