talk-data.com

Topic

Flink

Apache Flink

stream_processing batch_processing big_data

Activity Trend: peak of 7 activities per quarter, 2020-Q1 to 2026-Q2

Top Events

Data Engineering Podcast: 21
O'Reilly Data Engineering Books: 15
Databricks DATA + AI Summit 2023: 8
DATA MINER Big Data Europe Conference 2020: 6
Data + AI Summit 2025: 5
AWS re:Invent 2024: 4
PyData Amsterdam 2025: 2
Data Council 2023: 2
Airflow Summit 2022: 2
Airflow Summit 2025: 2
PyData London 2025: 1
Microsoft Ignite 2023: 1

Top Speakers

Tobias Macey: 21
Olena Kutsenko (Confluent): 3
Gunnar Morling (Decodable): 2
Julien Le Dem (Astronomer): 2
Fabian Hueske (Data Artisans): 2
Tathagata Das (Databricks): 2
Denny Lee (Databricks): 2
Ellen Friedman: 2
Prakash Nandha Mukunthan: 1
Eric Sammer (Decodable): 1
Francois Garillot: 1
Willy Lulciuc (WeWork): 1

Activities

Filtering by: Data + AI Summit 2025

Sponsored by: Confluent | Turn SAP Data into AI-Powered Insights with Databricks

Extending the Lakehouse: Power Interoperable Compute With Unity Catalog Open APIs

2025-06-11 · Data + AI Summit 2025

talk

by Tathagata Das (Databricks), Michelle Leon (Databricks)

API, Data Lakehouse, DuckDB, Iceberg, Cyber Security, Spark, Trino

The lakehouse is built for storage flexibility, but what about compute? In this session, we'll explore how Unity Catalog (UC) enables you to connect and govern multiple compute engines across your data ecosystem. With open APIs and support for the Iceberg REST Catalog, UC lets you extend access to engines like Trino, DuckDB, and Flink while maintaining centralized security, lineage, and interoperability. Through demos, we will show how you can get started today with engines like Apache Spark and Starburst to read from and write to UC-managed tables. Learn how to bring flexibility to your compute layer without compromising control.

Scaling Identity Graph Ingestion to 1M Events/Sec with Spark Streaming & Delta Lake

2025-06-11 · Data + AI Summit 2025

talk

by Akanksha Nagpal (Adobe), Jianmei Ye (Adobe, Inc.)

AWS, Azure, CDP, Databricks, Delta, Spark, Data Streaming

Adobe's Real-Time Customer Data Platform relies on the identity graph to connect over 70 billion identities and deliver personalized experiences. This session will showcase how the platform uses Databricks, Spark Streaming, and Delta Lake, along with 25+ Databricks deployments across multiple regions and clouds (Azure and AWS), to process terabytes of data daily and handle over a million records per second. The talk will highlight the platform's ability to scale, demonstrating a 10x increase in ingestion pipeline capacity to accommodate peak traffic during events like the Super Bowl. Attendees will learn about the technical strategies employed, including migrating from Flink to Spark Streaming, optimizing data deduplication, and implementing robust monitoring and anomaly detection. Discover how these optimizations enable Adobe to deliver real-time identity resolution at scale while ensuring compliance and privacy.

Apache Iceberg with Unity Catalog at HelloFresh

2025-06-10 · Data + AI Summit 2025

talk

by Max Schultze (HelloFresh), Adam Komisarek (HelloFresh)

Data Lakehouse, Databricks, Delta, Iceberg, Snowflake, Spark

Table formats like Delta Lake and Iceberg have been game changers for pushing lakehouse architecture into modern enterprises. The acquisition of Tabular added Iceberg to the Databricks ecosystem, an open format that was already well supported by processing engines across the industry. At HelloFresh, we are building a lakehouse architecture that integrates many touchpoints and technologies across the organization. As such, we chose Iceberg as the table format to bridge the gaps in our decentrally managed tech landscape. We use Unity Catalog as our Iceberg REST catalog of choice for storing metadata and managing tables. In this talk, we will outline our architectural setup between Databricks, Spark, Flink, and Snowflake, and explain the native Unity Iceberg REST catalog as well as catalog federation towards connected engines. We will highlight the impact on our business and discuss the advantages and lessons learned from our early-adopter experience.

How an Open, Scalable and Secure Data Platform is Powering Quick Commerce Swiggy's AI

2025-06-10 · Data + AI Summit 2025

talk

by Vasan Vembu Srini (Databricks), Akash Agarwal (Swiggy)

AI/ML, Analytics, Data Lakehouse, Databricks, Delta, GenAI, Kafka, Spark, Data Streaming

Swiggy, India's leading quick commerce platform, serves ~13 million users across 653 cities, with 196,000 restaurant partners and 17,000 SKUs. To handle this scale, Swiggy developed a secure, scalable AI platform processing millions of predictions per second. The tech stack includes Apache Kafka for real-time streaming, Apache Spark on Databricks for analytics and ML, and Apache Flink for stream processing. The Lakehouse architecture on Delta ensures data reliability, while Unity Catalog enables centralized access control and auditing. These technologies power critical AI applications like demand forecasting, route optimization, personalized recommendations, predictive delivery SLAs, and generative AI use cases.

Key Takeaway: This session explores building a data platform at scale, focusing on cost efficiency, simplicity, and speed, empowering Swiggy to seamlessly support millions of users and AI use cases.