talk-data.com
Activities & events
IN PERSON: Apache Kafka® x Apache Iceberg x Apache Flink®
2025-05-07 · 18:00
***IMPORTANT: If you RSVP here, you don't need to also RSVP with the London Kafka Group.***

Date and time: 🗓️ Wednesday 7th May, ⏰ 18:00 - 21:00 🕘

Venue: Snowflake, One Crown Place, London EC2A 4EF, U.K., 5th & 6th floors

Schedule:
🎙️ Talk 1: Mastering real-time anomaly detection, Olena Kutsenko, Staff Developer Advocate, Confluent

Abstract: Detecting problems as they happen is essential in today's fast-moving world. This talk shows how to build a simple, powerful system for real-time anomaly detection in live data. We'll use Apache Kafka for streaming data, Apache Flink for processing it in real time, and various models to detect unusual patterns. Whether it's monitoring systems or tracking IoT devices, this solution is flexible and reliable. We'll start by exploring how Kafka helps collect and manage fast-moving data streams. Then, we'll demonstrate how Flink processes this data in real time and integrates anomaly detection models to uncover events as they occur. We'll dive into the details of how ARIMA and LSTM work, so even if you're not into mathematics, you can still understand what happens behind the scenes! This talk is ideal for anyone looking to monitor anomalies in real-time data streams.

🗣️ Speaker 1: Olena is a Staff Developer Advocate at Confluent and a recognized expert in data streaming and analytics. With two decades of experience in software engineering, she has built mission-critical applications, led high-performing teams, and driven large-scale technology adoption at industry leaders like Nokia, HERE Technologies, AWS, and Aiven.

🎙️ Talk 2: Iced Kaf-fee: Chilling Kafka Data into Iceberg Tables, Danica Fine, Lead Developer Advocate, Open Source at Snowflake

Abstract: Have piping-hot, real-time data in Apache Kafka® but want to chill it down into Apache Iceberg™ tables? Let's see how we can craft the perfect cup of "Iced Kaf-fee" for you and your needs! We'll start by grinding through the motivation for moving data from Kafka topics into Iceberg tables, exploring the benefits that doing so has to offer your analytics workflows. From there, we'll open up the menu of options available to cool down your streams, including Apache Flink®, Apache Spark™, and Kafka Connect.
Each brewing method has its own recipe, so we'll compare their pros and cons, walk through use cases for each, and highlight when you might prefer a strong Spark roast over a smooth Flink blend, or maybe a Connect cold brew. Plus, we'll share a sneak peek at future innovations that are percolating in the community to make sinking your Kafka data into Iceberg even easier. By the end of the session, you'll have everything you need to whip up the perfect pipeline and serve up your "Iced Kaf-fee" with confidence.

🗣️ Speaker 2: Danica began her career as a software engineer in financial services and pivoted to developer relations, where she focused primarily on open source technologies under the Apache Software Foundation umbrella, such as Apache Kafka and Apache Flink. She now leads the open source advocacy efforts at Snowflake, supporting Apache Iceberg and Apache Polaris (incubating).

🎙️ Talk 3: Observing all the things: Apache Kafka® and Apache Flink® with OpenTelemetry, Mehreen Tahir, Software Engineer, New Relic

🗣️ Speaker 3: Mehreen specializes in machine learning, data science, and artificial intelligence. Mehreen is passionate about observability and the use of telemetry data to improve application performance. She actively contributes to developer communities and has a keen interest in edge analytics and serverless architecture.

*** DISCLAIMER NOTE: We are unable to cater for any attendees under the age of 18. If you would like to speak at or host our next event, please let us know! [email protected]
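As a rough sketch of the kind of pipeline Talk 1 describes, here is a rolling z-score detector used as a deliberately simple stand-in for the ARIMA/LSTM models mentioned in the abstract. In a real deployment this scoring logic would run inside a Flink job consuming a Kafka topic; the class name, window size, and threshold below are all illustrative, not taken from the talk:

```python
from collections import deque
import math

class RollingAnomalyDetector:
    """Flags values far from a rolling mean, measured in standard deviations.

    A toy stand-in for the ARIMA/LSTM models discussed in the talk; in a
    real pipeline this logic would live in a Flink operator fed by Kafka.
    """

    def __init__(self, window: int = 20, threshold: float = 4.0):
        self.values = deque(maxlen=window)  # sliding window of recent values
        self.threshold = threshold          # z-score cutoff for "anomalous"

    def observe(self, value: float) -> bool:
        """Score one incoming value against the current window, then add it."""
        anomalous = False
        if len(self.values) >= 2:
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = math.sqrt(var)
            if std > 0 and abs(value - mean) / std > self.threshold:
                anomalous = True
        self.values.append(value)
        return anomalous

detector = RollingAnomalyDetector(window=20, threshold=4.0)
# A steady repeating signal with one obvious spike injected near the end.
stream = [10.0, 10.2, 9.8, 10.1, 9.9] * 6 + [50.0, 10.0]
flags = [detector.observe(v) for v in stream]
# Only the 50.0 spike (index 30) is flagged.
```

The same structure generalizes: swap the z-score check for a model's prediction error and the list for a Kafka consumer loop, and the surrounding control flow stays the same.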
IN PERSON: Apache Kafka® x Apache Iceberg™ Meetup
2025-04-14 · 16:00
Join us for an Apache Kafka® Meetup on Monday, April 14th from 6:00pm, hosted by Elaia!

Elaia is a full-stack tech and deep-tech investor. We partner with ambitious entrepreneurs from inception to leadership, helping them navigate the future and the unknown. For over twenty years, we have combined deep scientific and technological expertise with decades of operational experience to back those building tomorrow. From our offices in Paris, Barcelona and Tel Aviv, we have been active partners with over 100 startups, including Criteo, Mirakl, Shift Technology, Aqemia and Alice & Bob.

📍 Venue: Elaia, 21 Rue d'Uzès, 75002 Paris, France

IF YOU RSVP HERE, YOU DO NOT NEED TO RSVP @ Paris Apache Kafka® Meetup group.

🗓 Agenda:
💡 Speaker One: Roman Kolesnev, Principal Software Engineer, Streambased

Talk: Melting Icebergs: Enabling Analytical Access to Kafka Data through Iceberg Projections

Abstract: An organisation's data has traditionally been split between the operational estate, for daily business operations, and the analytical estate, for after-the-fact analysis and reporting. The journey from one side to the other is today a long and tortuous one. But does it have to be? In the modern data stack, Apache Kafka is your de facto standard operational platform and Apache Iceberg has emerged as the champion of table formats to power analytical applications. Can we leverage the best of Iceberg and Kafka to create a powerful solution greater than the sum of its parts? Yes, you can, and we did!

This isn't a typical story of connectors, ELT, and separate data stores. We've developed an advanced projection of Kafka data in an Iceberg-compatible format, allowing direct access from warehouses and analytical tools. In this talk, we'll cover:

* How we presented Kafka data to Iceberg processors without moving or transforming data upfront: no hidden ETL!
* Integrating Kafka's ecosystem into Iceberg, leveraging Schema Registry, consumer groups, and more.
* Meeting Iceberg's performance and cost-reduction expectations while sourcing data directly from Kafka.

Expect a technical deep dive into the protocols, formats, and services we used, all while staying true to our core principles:

* Kafka as the single source of truth: no separate stores.
* Analytical processors shouldn't need Kafka-specific adjustments.
* Operational performance must remain uncompromised.
* Kafka's mature ecosystem features, like ACLs and quotas, should be reused, not reinvented.

Join us for a thrilling account of the highs and lows of merging two data giants, and stay tuned for the surprise twist at the end!

Bio: Roman is a Principal Software Engineer at Streambased.
His experience includes building business-critical event streaming applications and distributed systems in the financial and technology sectors.

💡 Speaker Two: Viktor Gamov, Principal Developer Advocate, Confluent

Talk: One Does Not Simply Query a Stream

Abstract: Streaming data with Apache Kafka® has become the backbone of modern-day applications. While streams are ideal for continuous data flow, they lack built-in querying capability. Unlike databases with indexed lookups, Kafka's append-only logs are designed for high-throughput processing, not for on-demand querying. This forces teams to build additional infrastructure to enable query capabilities for streaming data. Traditional methods replicate this data into external stores, such as relational databases like PostgreSQL for operational workloads and object storage like S3 with Flink, Spark, or Trino for analytical use cases. While sometimes useful, these methods deepen the divide between the operational and analytical estates, creating silos, complex ETL pipelines, and issues with schema mismatches, freshness, and failures.

In this session, we'll explore and see live demos of some solutions to unify the operational and analytical estates, eliminating data silos. We'll start with stream processing using Kafka Streams, Apache Flink®, and SQL implementations, then cover integration of relational databases with real-time analytics databases such as Apache Pinot™ and ClickHouse. Finally, we'll dive into modern approaches like Apache Iceberg™ with Tableflow, which simplifies data preparation by seamlessly representing Kafka topics and associated schemas as Iceberg or Delta tables in a few clicks. While there's no single right answer to this problem, as responsible system builders, we must understand our options and trade-offs to build robust architectures.

Bio: Viktor Gamov is a Principal Developer Advocate at Confluent, founded by the original creators of Apache Kafka®.
With a rich background in implementing and advocating for distributed systems and cloud-native architectures, Viktor excels in open-source technologies. He is passionate about assisting architects, developers, and operators in crafting systems that are not only low in latency and scalable but also highly available. As a Java Champion and an esteemed speaker, Viktor is known for his insightful presentations at top industry events like JavaOne, Devoxx, Kafka Summit, and QCon. His expertise spans distributed systems, real-time data streaming, the JVM, and DevOps. Viktor has co-authored "Enterprise Web Development" from O'Reilly and "Apache Kafka® in Action" from Manning. Follow Viktor on X (@gamussa) to stay updated with his latest thoughts on technology, his gym and food adventures, and insights into open source and developer advocacy.

*** DISCLAIMER: We cannot cater to those under the age of 18. If you would like to speak at or host a future meetup, please reach out to [email protected]
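A minimal sketch of why "one does not simply query a stream": an append-only changelog (like a Kafka topic) supports sequential scans, not point lookups, so it has to be folded into a table view before it can be queried. This is a toy version of the stream/table duality that Kafka Streams' KTable and Flink's dynamic tables build on; the data and key scheme below are invented for illustration:

```python
# (key, value) events in arrival order; a None value is a deletion (tombstone),
# mirroring the convention used by compacted Kafka topics.
log = [
    ("user:1", {"plan": "free"}),
    ("user:2", {"plan": "pro"}),
    ("user:1", {"plan": "pro"}),   # later event supersedes the earlier one
    ("user:2", None),              # tombstone removes the key entirely
]

def materialize(events):
    """Fold an append-only changelog into its latest-value table view."""
    table = {}
    for key, value in events:
        if value is None:
            table.pop(key, None)   # apply the deletion
        else:
            table[key] = value     # last write wins
    return table

view = materialize(log)
# view == {"user:1": {"plan": "pro"}} -- now point lookups are possible
```

Everything the talk surveys, from Kafka Streams state stores to Iceberg tables via Tableflow, is a more robust, distributed version of this fold: the hard parts are keeping the view fresh, consistent, and queryable at scale.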