talk-data.com

Topic: debezium (5 tagged)

Activity Trend: peak of 2 per quarter, 2020-Q1 to 2026-Q1

Activities: 5, newest first

Remember when debugging streaming data pipelines felt like playing detective at a crime scene where the evidence kept shifting? Well, grab your magnifying glass, because we're about to turn you into the Sherlock Holmes of the streaming world. We'll simulate a disruptive change in an order processing pipeline that captures database changes with Debezium, processes them through Apache Flink, and tracks lineage metadata with OpenLineage and Marquez.
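For context, lineage tools like Marquez consume OpenLineage run events describing which datasets a job read and wrote. A minimal event might look roughly like the sketch below; the namespaces, job names, and dataset names are illustrative placeholders, not taken from the talk:

```json
{
  "eventType": "COMPLETE",
  "eventTime": "2024-01-01T00:00:00Z",
  "producer": "https://example.com/flink-job",
  "run": { "runId": "d290f1ee-6c54-4b01-90e6-d701748f0851" },
  "job": { "namespace": "orders-pipeline", "name": "enrich-orders" },
  "inputs": [ { "namespace": "kafka", "name": "shop.public.orders" } ],
  "outputs": [ { "namespace": "kafka", "name": "orders.enriched" } ]
}
```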

In this talk we will look into the details of how Kleinanzeigen, a leader in the classifieds business in Germany, built a data migration system using Apache Kafka and Debezium that migrated millions of users' data from a legacy platform to a new one and allowed bi-directional data sync between them in real time. We will also discover how the system allowed users' data to be updated on both platforms (partially, under certain conditions) while keeping the entire system in sync. Finally, we will learn how the system leveraged a logical clock to implement a custom synchronization algorithm that helped avoid infinite update loops between the platforms.
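The abstract does not spell out the synchronization algorithm, but the loop-avoidance idea it hints at can be sketched with a per-record logical clock: each side only applies a remote change that is strictly newer than its local copy, so an echoed update is dropped rather than re-emitted. The `Replica` class and all names below are hypothetical, not Kleinanzeigen's actual implementation:

```python
# Hypothetical sketch: break infinite update loops in bi-directional sync
# by tagging every record with a logical clock (a monotonic counter) and
# applying only remote changes that are strictly newer than the local copy.

class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (value, logical clock)

    def local_update(self, key, value):
        # A local write bumps the clock and emits a change event.
        _, clock = self.data.get(key, (None, 0))
        self.data[key] = (value, clock + 1)
        return key, value, clock + 1

    def apply_remote(self, key, value, clock):
        # Apply only strictly newer changes; re-emit them to the other side.
        _, local_clock = self.data.get(key, (None, 0))
        if clock > local_clock:
            self.data[key] = (value, clock)
            return key, value, clock
        return None  # stale change or our own echo: drop it, loop ends

legacy, new = Replica("legacy"), Replica("new")
event = legacy.local_update("user:1", "alice@example.com")
echo = new.apply_remote(*event)            # applied on the new platform
assert legacy.apply_remote(*echo) is None  # echo is not newer, so dropped
```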

Abstract: You've been tasked with implementing a data streaming pipeline for propagating data changes from your operational Postgres database to a search index in OpenSearch. Data views in OpenSearch should be denormalized for fast querying, and of course there should be no noticeable impact on the production database. In this session we'll discuss how to build this data pipeline using two popular open-source projects: Debezium for log-based change data capture (CDC) and Apache Flink for stream processing. Join us for this talk and learn about:

* Setting up change data streams with Debezium
* Efficiently building nested data structures from 1:n joins
* Deployment options: Kafka Connect vs. Flink CDC
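In the Kafka Connect deployment option, a Debezium Postgres change stream is typically set up by registering a connector configuration like the sketch below; the connector class and property keys are Debezium's, while the hostname, credentials, and table names are placeholders for this example:

```json
{
  "name": "orders-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "secret",
    "database.dbname": "shop",
    "topic.prefix": "shop",
    "table.include.list": "public.orders,public.order_lines",
    "plugin.name": "pgoutput"
  }
}
```

With this configuration, row-level changes to the listed tables are read from Postgres's write-ahead log (via the `pgoutput` logical decoding plugin) and published to Kafka topics prefixed with `shop`, leaving the production database itself untouched apart from the replication slot.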