In this 2-hour hands-on workshop, you'll build an end-to-end streaming analytics pipeline that captures live cryptocurrency prices, processes them in real time, and uses AI to forecast the future. You'll ingest live crypto data into Apache Kafka using Kafka Connect; tame that chaos with Apache Flink's stream processing; freeze streams into queryable Apache Iceberg tables using Tableflow; and forecast price trends with Flink AI.
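For a taste of the kind of pipeline the workshop builds, here is a minimal sketch of the Flink processing step, written as Flink SQL run from Java; the topic name, schema, and connector options are assumptions, and the Tableflow and Flink AI stages are not shown:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CryptoPipelineSketch {
    public static void main(String[] args) {
        TableEnvironment env = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Hypothetical source: a Kafka topic fed by a Kafka Connect crypto connector.
        env.executeSql(
            "CREATE TABLE crypto_prices (" +
            "  symbol STRING, price DOUBLE, ts TIMESTAMP(3)," +
            "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'crypto-prices'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'latest-offset')");

        // A one-minute average price per symbol -- the kind of aggregate that
        // could later land in a queryable Iceberg table.
        env.executeSql(
            "SELECT symbol, AVG(price) AS avg_price," +
            "       TUMBLE_END(ts, INTERVAL '1' MINUTE) AS window_end " +
            "FROM crypto_prices " +
            "GROUP BY symbol, TUMBLE(ts, INTERVAL '1' MINUTE)").print();
    }
}
```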
Remember when debugging streaming data pipelines felt like playing detective at a crime scene where the evidence kept shifting? Well, grab your magnifying glass, because we're about to turn you into the Sherlock Holmes of the streaming world. We'll simulate a disruptive change in an order-processing pipeline that captures database changes with Debezium, processes them through Apache Flink, and tracks lineage metadata with OpenLineage and Marquez.
While upgrading Flink to its latest versions to enable more AI-related capabilities, one can easily run into tricky savepoint incompatibilities that render existing state snapshots unusable for recovery. This is especially problematic in the case of pipelines with large state. In such cases, doing a backfill can take too long and using the State Processor API leads to downtime or breaking the exactly-once delivery guarantee.
In this talk, I'll share a state migration pattern that I applied to one of our Flink jobs using regular streaming mode. It involves creating a new stateful operator that conforms to the new requirements, allowing a compatible savepoint to be created. Leveraging side outputs and custom key traversal, the existing state is forwarded to the new operator while regular processing continues uninterrupted.
We’ll explore the core problem and understand the pitfalls and trade-offs of existing solutions such as the State Processor API. Then, a deep-dive into the migration pattern will follow: ensuring correct state handoff between operator versions, setting up triggers to migrate all keys and other technicalities. Lastly, a few words about cleaning up seamlessly. With this session I will add a nice pattern to your toolbox that you can easily apply next time you run into state migration challenges.
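To make the pattern concrete, here is a rough sketch (an editor's illustration, not the speaker's code) of how a legacy operator might hand each key's state to the new operator through a side output; the `Event` and `KeyedState` types and the migration trigger are hypothetical, using Flink 1.x-style APIs:

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

// Legacy operator: when a per-key migration trigger arrives (injected by a
// custom key traversal upstream), it emits that key's state to a side output,
// which is routed into the new, savepoint-compatible operator.
public class LegacyOperator extends KeyedProcessFunction<String, Event, Event> {
    // Side output carrying migrated per-key state to the new operator.
    public static final OutputTag<KeyedState> MIGRATION =
        new OutputTag<KeyedState>("state-migration") {};

    private transient ValueState<KeyedState> state;

    @Override
    public void open(Configuration parameters) {
        state = getRuntimeContext().getState(
            new ValueStateDescriptor<>("legacy-state", KeyedState.class));
    }

    @Override
    public void processElement(Event event, Context ctx, Collector<Event> out) throws Exception {
        if (event.isMigrationTrigger()) {
            KeyedState current = state.value();
            if (current != null) {
                ctx.output(MIGRATION, current); // hand this key's state over
                state.clear();                  // this operator no longer owns it
            }
            return;
        }
        // ... regular processing continues uninterrupted ...
        out.collect(event);
    }
}

// Hypothetical placeholder types.
class Event {
    boolean migrationTrigger;
    boolean isMigrationTrigger() { return migrationTrigger; }
}
class KeyedState { long value; }
```

Downstream, `.getSideOutput(LegacyOperator.MIGRATION)`, keyed the same way, would feed the new operator, so its state is populated before the legacy operator is cleaned up.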
At Fresha, we pioneered putting StarRocks to the test in production for real-time analytical workloads. One of the first challenges we faced was getting all the data there reliably and efficiently. We had to think about historical data and real-time data, and orchestrate all of it so we could move fast without breaking too many things. Our tools of choice: Airflow, StarRocks Pipes, and Apache Flink. In this talk, I'll share how we built our data pipelines using Apache Flink and Airflow, and what worked and what didn't for us. Along the way, we'll explore how Flink helps ensure data consistency, handles failures gracefully, and keeps our real-time workloads running strong.
Kafka and Flink tend to get lumped in as "data services", in the sense that they process data, but in comparison to traditional databases they differ quite dramatically in functionality and utility. In this talk, we'll run through the lifetime of a write in Postgres to establish a baseline, understanding all the different services that data hits on its way down to the disk. Then we'll walk through writing data to a Kafka topic, and what 'writing' (or really, streaming) data to a Flink workflow looks like from a similar systems perspective. Along the way, we'll understand the key differences between the services and why some are more suited to long-term data storage than others.
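To ground the Kafka half of that comparison, a minimal producer write might look like the sketch below; the topic, key, and config values are made up for illustration:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaWriteSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // acks=all: the write is acknowledged only once replicated to the
        // in-sync replicas -- a durability decision to contrast with
        // Postgres fsyncing its WAL.
        props.put("acks", "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // An append to the partition's log -- not an update-in-place
            // as in a B-tree page.
            producer.send(new ProducerRecord<>("orders", "order-42", "{\"amount\": 9.99}"));
            producer.flush();
        }
    }
}
```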
The next generation of streaming isn't about faster pipelines, but about smarter connections. DeltaJoin, a new operator in Apache Flink, reimagines stream joins by moving from brute-force state to change-driven computation. Paired with Fluss, Flink's purpose-built storage layer, it enables systems that are real-time, scalable, and cost-efficient. Anton will show how DeltaJoin and Fluss shift streaming architecture from ephemeral flows to durable, queryable state that bridges real-time processing with lakehouse patterns. Drawing on production experience, he'll demonstrate how these innovations reduce join costs, simplify architectures, and unlock new possibilities for real-time analytics. Attendees will leave with a vision of Flink 2.x as the backbone for event-driven systems and modern data platforms.
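DeltaJoin's own interface isn't sketched here; for contrast, below is the kind of regular stream-stream join (hypothetical tables on the `datagen` connector) whose unbounded state growth is exactly what DeltaJoin and Fluss are designed to avoid:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class RegularJoinBaseline {
    public static void main(String[] args) {
        TableEnvironment env = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        env.executeSql("CREATE TABLE orders (order_id STRING, user_id STRING) " +
            "WITH ('connector' = 'datagen')");
        env.executeSql("CREATE TABLE payments (order_id STRING, amount DOUBLE) " +
            "WITH ('connector' = 'datagen')");

        // A regular stream-stream join: both sides are retained in Flink
        // state indefinitely so late matches can still be found -- the
        // "brute-force state" the talk describes. DeltaJoin, backed by
        // Fluss's queryable storage, instead looks matches up at the source.
        env.executeSql(
            "SELECT o.order_id, o.user_id, p.amount " +
            "FROM orders o JOIN payments p ON o.order_id = p.order_id").print();
    }
}
```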
By leveraging tools like Jaeger and New Relic, we will uncover how to gain a full view of your microservices, even in the face of Apache Kafka's asynchronous nature. Join us for a live demo with a simple Java Spring-Boot app, where we will walk through both automatic and manual instrumentation to capture rich telemetry. We will also touch on infrastructure-level observability, pulling metrics and traces from Apache Kafka brokers and Apache Flink.
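As a flavor of the manual instrumentation mentioned above, a hand-rolled span with the OpenTelemetry Java API might look like this; the service, span, and attribute names are placeholders, not taken from the demo:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class OrderService {
    // Tracer from the globally registered OpenTelemetry SDK, which could be
    // configured to export to Jaeger or New Relic.
    private final Tracer tracer = GlobalOpenTelemetry.getTracer("order-service");

    public void handleOrder(String orderId) {
        // A manual span around business logic; the Kafka client's automatic
        // instrumentation can link it to downstream consumer spans via
        // context propagation in record headers.
        Span span = tracer.spanBuilder("handle-order").startSpan();
        try (Scope scope = span.makeCurrent()) {
            span.setAttribute("order.id", orderId);
            // ... produce to Kafka, call other services ...
        } finally {
            span.end();
        }
    }
}
```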
Data streaming is a really difficult problem. Despite 10+ years of attempts to simplify it, teams building real-time data pipelines can spend up to 80% of their time optimizing those pipelines or fixing downstream output by handling bad data at the lake. All we want is a service that will be reliable, handle all kinds of data, connect with all kinds of systems, be easy to manage, and scale up and down as our systems change. Oh, it should also have super low latency and result in good data. Is it too much to ask?
In this presentation, you’ll learn the basics of data streaming and architecture patterns such as DLQ, used to tackle these challenges. We will then explore how to implement these patterns using Apache Flink and discuss the challenges that real-time AI applications bring to our infra. Difficult problems are difficult, and we offer no silver bullets. Still, we will share pragmatic solutions that have helped many organizations build fast, scalable, and manageable data streaming pipelines.
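One of those patterns, the DLQ, can be sketched in Flink with a side output; the parsing logic and tag name below are illustrative, not from the talk:

```java
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

// Route records that fail parsing to a dead-letter side output instead of
// failing the whole job; the DLQ stream can be sinked to its own Kafka topic.
public class ParseWithDlq extends ProcessFunction<String, Long> {
    public static final OutputTag<String> DLQ = new OutputTag<String>("dlq") {};

    @Override
    public void processElement(String raw, Context ctx, Collector<Long> out) {
        try {
            out.collect(Long.parseLong(raw.trim()));
        } catch (NumberFormatException e) {
            ctx.output(DLQ, raw); // keep the bad record for inspection/replay
        }
    }
}
```

In use, `stream.process(new ParseWithDlq())` yields the clean stream, and `.getSideOutput(ParseWithDlq.DLQ)` yields the dead letters, each routed to its own sink.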
In this session, we'll walk through how Apache Flink was used to enable near real-time operational insights from manufacturing IIoT datasets. The goal: deliver actionable KPIs to production teams with sub-30-second latency, using streaming data pipelines built with Kafka, Flink, and Grafana. We'll cover the key architectural patterns that made this possible, including handling structured data joins, managing out-of-order events, and integrating with downstream systems like PostgreSQL and Grafana. We'll also share real-world performance benchmarks, lessons learned from scaling tests, and practical considerations for deploying Flink in a production-grade, low-latency analytics pipeline. The session will also include a live demo.
If you're building Flink-based solutions for time-sensitive operations—whether in manufacturing, IoT, or other domains—this talk will provide proven insights from the field.
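As a rough illustration of the out-of-order handling mentioned above (types, sizes, and thresholds are hypothetical, not from the talk), the sketch below assigns bounded-out-of-orderness watermarks and computes a windowed per-machine KPI:

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class IiotKpiSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<SensorReading> readings = env
            .fromElements(new SensorReading("m1", 42.0, 1_700_000_000_000L))
            // Tolerate sensors reporting up to 10s late -- out-of-order
            // handling bought at the price of a little extra latency.
            .assignTimestampsAndWatermarks(
                WatermarkStrategy.<SensorReading>forBoundedOutOfOrderness(Duration.ofSeconds(10))
                    .withTimestampAssigner((r, ts) -> r.timestamp));

        // A per-machine KPI every 20s of event time, comfortably inside
        // a sub-30-second latency budget.
        readings.keyBy(r -> r.machineId)
            .window(TumblingEventTimeWindows.of(Time.seconds(20)))
            .maxBy("value")
            .print();

        env.execute("iiot-kpi-sketch");
    }
}

// Hypothetical reading type.
class SensorReading {
    public String machineId;
    public double value;
    public long timestamp;
    public SensorReading() {}
    public SensorReading(String m, double v, long t) { machineId = m; value = v; timestamp = t; }
}
```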
According to Wikipedia, Infrastructure as Code is the process of managing and provisioning computer data center resources through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. This also applies to resources and reference data, connector plugins, connector configurations, and stream processes to clean up the data.
In this talk, we are going to discuss the use cases based on the Network Rail Data Feeds, the scripts used to spin up the environment and cluster in Confluent Cloud, as well as the different components required for the ingress and processing of the data.
This particular environment is used as a teaching tool for Event Stream Processing for Kafka Streams, ksqlDB, and Flink. Some examples of further processing and visualisation will also be provided.
Streaming data with Apache Kafka® has become the backbone of modern-day applications. While streams are ideal for continuous data flow, they lack built-in querying capability. Unlike databases with indexed lookups, Kafka's append-only logs are designed for high-throughput processing, not for on-demand querying. This necessitates teams to build additional infrastructure to enable query capabilities for streaming data. Traditional methods replicate this data into external stores such as relational databases like PostgreSQL for operational workloads and object storage like S3 with Flink, Spark, or Trino for analytical use cases. While sometimes useful, these methods deepen the divide between operational and analytical estates, creating silos, complex ETL pipelines, and issues with schema mismatches, freshness, and failures.

In this session, we’ll explore and see live demos of some solutions to unify the operational and analytical estates, eliminating data silos. We’ll start with stream processing using Kafka Streams, Apache Flink®, and SQL implementations, then cover integration of relational databases with real-time analytics databases such as Apache Pinot® and ClickHouse. Finally, we’ll dive into modern approaches like Apache Iceberg® with Tableflow, which simplifies data preparation by seamlessly representing Kafka topics and associated schemas as Iceberg or Delta tables in a few clicks. While there's no single right answer to this problem, as responsible system builders, we must understand our options and trade-offs to build robust architectures.
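As a small taste of the first approach, a Kafka Streams topology that maintains a continuously updated count per key might look like this sketch; the topic names and serdes are assumptions:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class PageViewCounts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "page-view-counts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // A continuously updated count per page -- one way to serve
        // operational lookups straight from the stream, with no separate
        // database to keep in sync.
        KTable<String, Long> counts = builder
            .stream("page-views")   // hypothetical topic keyed by page id
            .groupByKey()
            .count();
        counts.toStream().to("page-view-counts",
            Produced.with(Serdes.String(), Serdes.Long()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```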
Dive into the world of real-time data streaming with this introduction to Apache Kafka. This talk is tailored for developers, data engineers, and IT professionals who want to gain a foundational understanding of Kafka, a powerful open-source platform used for building scalable, event-driven applications. You will learn about:
Kafka fundamentals: the core concepts of Kafka, including topics, partitions, producers, and consumers (a minimal consumer sketch follows this list)
The Kafka ecosystem: brokers, clients, Schema Registry, and Kafka Connect
Stream processing: Kafka Streams and Apache Flink
Use cases: discover how data streaming with Kafka has transformed various industries
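The consumer sketch below ties the fundamentals together; the topic, group id, and broker address are placeholders:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class HelloConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "hello-group"); // consumers in a group share partitions
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("events")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                        r.partition(), r.offset(), r.key(), r.value());
                }
            }
        }
    }
}
```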
This talk walks through the process of creating real-time data pipelines using Flink. It introduces how to connect Flink with various data sources (like Kafka or relational databases), focusing on transforming and enriching data streams. This talk is useful for understanding how Flink integrates with other components in a typical data processing pipeline.
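A bare-bones version of such a pipeline, using the Flink Kafka connector's `KafkaSource` (topic, group id, and addresses hypothetical):

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaToFlinkPipeline {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Connect Flink to a Kafka topic.
        KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("localhost:9092")
            .setTopics("clicks")
            .setGroupId("flink-pipeline")
            .setStartingOffsets(OffsetsInitializer.earliest())
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-clicks")
            // A trivial transformation step; enrichment would typically join
            // against reference data here (e.g., via lookup or broadcast state).
            .map(String::toUpperCase)
            .print();

        env.execute("kafka-to-flink");
    }
}
```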
Stream Processing has evolved quickly in a short time: only a few years ago, stream processing was mostly simple real-time aggregations with limited throughput and consistency. Today, many stream processing applications have sophisticated business logic, strict correctness guarantees, high performance, low latency, and maintain terabytes of state without databases. Stream processing frameworks also abstract a lot of the low-level details away, such as routing the data streams, taking care of concurrent executions, and handling various failure scenarios while ensuring correctness.
This talk will give an introduction to Apache Flink, one of the most advanced open source stream processors, which powers applications at Netflix, Uber, and Alibaba, among others. In particular, we will go through the use cases that Flink was designed for, explain concepts like stateful and event-time stream processing, and discuss Flink's APIs and ecosystem.
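As a taste of stateful stream processing, a fault-tolerant per-key counter might look like this sketch; the input type and names are illustrative:

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Stateful processing in a nutshell: a running count per key, held in Flink's
// managed state and checkpointed for fault tolerance -- no external database.
public class RunningCount extends KeyedProcessFunction<String, String, Tuple2<String, Long>> {
    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
            new ValueStateDescriptor<>("count", Types.LONG));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<Tuple2<String, Long>> out) throws Exception {
        long next = (count.value() == null ? 0L : count.value()) + 1;
        count.update(next);
        out.collect(Tuple2.of(ctx.getCurrentKey(), next));
    }
}
```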