Search – talk-data.com

Title & Speakers	Event
The High Performance Data and AI Debate 2025-09-24 · 18:00 Chris Tabb – CCO @ LEIT DATA Join us for an unmissable evening of insight, discussion, and lively debate at The High Performance Data and AI Debate, hosted by Chris Tabb — a unique Big Data London special running from 6:00–8:00 PM. This fast-paced, interactive event brings together some of the brightest minds in data and AI to tackle the most pressing questions shaping the future of teams, architecture, and products in an AI-first world. The evening kicks off at 6:00 PM with a welcome and free drinks. Then, across three rapid-fire 20-minute debates, our expert panels will explore: • AI & Data – Teams (Chair: Eevamaija Virtanen) Mehdi Ouazza, Paul Rankin, Jesse Anderson, Hugo Lu • AI & Data – Architecture (Chair: Adi Polak) Chris Freestone, David Richardson, Nick White, Karl Ivo Sokolov • AI & Data – Products (Chair: Jai Parmar) Kelsey Hammock, Jean-Georges (jgp) Perrin, Taylor McGrath, Jon Cooke Refuel with free pizza at 6:50 PM, then stay for the Town Hall Debate, where all speakers return to the stage for an open-floor Q&A — your chance to challenge their ideas, share perspectives, and shape the conversation. Expect fresh perspectives, healthy disagreement, and practical takeaways you can bring back to your organisation. Whether you’re leading a data team, designing cutting-edge architectures, or building AI-powered products, this is your space to engage with the people shaping what’s next. AI/ML Big Data	Big Data LDN 2025
Why Kafka + Iceberg Will Define the Next Decade of Data Infrastructure 2025-09-23 · 19:45 Tom Scott – Founder & CEO @ Streambased Data leaders today face a familiar challenge: complex pipelines, duplicated systems, and spiraling infrastructure costs. Standardizing around Kafka for real-time and Iceberg for large-scale analytics has gone some way towards addressing this but still requires separate stacks, leaving teams to stitch them together at high expense and risk. This talk will explore how Kafka and Iceberg together form a new foundation for data infrastructure, one that unifies streaming and analytics into a single, cost-efficient layer. By standardizing on these open technologies, organizations can reduce data duplication, simplify governance, and unlock both instant insights and long-term value from the same platform. You will come away with a clear understanding of why this convergence is reshaping the industry, how it lowers operational risk, and the advantages it offers for building durable, future-proof data capabilities. Kafka Iceberg	IN PERSON: Apache Kafka x Apache Flink
Why Kafka + Iceberg Will Define the Next Decade of Data Infrastructure 2025-09-23 · 19:45 Tom Scott – Founder & CEO @ Streambased Data leaders today face a familiar challenge: complex pipelines, duplicated systems, and spiraling infrastructure costs. Standardizing around Kafka for real-time and Iceberg for large-scale analytics has gone some way towards addressing this but still requires separate stacks, leaving teams to stitch them together at high expense and risk. This talk will explore how Kafka and Iceberg together form a new foundation for data infrastructure, one that unifies streaming and analytics into a single, cost-efficient layer. By standardizing on these open technologies, organizations can reduce data duplication, simplify governance, and unlock both instant insights and long-term value from the same platform. You will come away with a clear understanding of why this convergence is reshaping the industry, how it lowers operational risk, and the advantages it offers for building durable, future-proof data capabilities. Kafka Iceberg	IN-PERSON: Apache Kafka® x Apache Flink® Meetup
Monitoring Kafka-Based Applications Using Distributed Tracing with OpenTelemetry 2025-09-23 · 19:15 Mehreen Tahir – Software Engineer @ New Relic By leveraging tools like Jaeger and New Relic, we will uncover how to gain a full view of your microservices, even in the face of Apache Kafka's asynchronous nature. Join us for a live demo with a simple Java Spring-Boot app, where we will walk through both automatic and manual instrumentation to capture rich telemetry. We will also touch on infrastructure-level observability, pulling metrics and traces from Apache Kafka brokers and Apache Flink. opentelemetry jaeger New Relic Kafka flink java spring-boot	IN PERSON: Apache Kafka x Apache Flink
Event IN-PERSON: Apache Kafka® x Apache Flink® Meetup 2025-09-23
Monitoring Kafka-Based Applications Using Distributed Tracing with OpenTelemetry 2025-09-23 · 19:15 Mehreen Tahir – Software Engineer @ New Relic By leveraging tools like Jaeger and New Relic, we’ll uncover how to gain a full view of your microservices, even in the face of Apache Kafka’s asynchronous nature. Join us for a live demo with a simple Java Spring-Boot app, where we’ll walk through both automatic and manual instrumentation to capture rich telemetry. We’ll also touch on infrastructure-level observability, pulling metrics and traces from Apache Kafka brokers and Apache Flink. opentelemetry jaeger New Relic Kafka spring boot
Stream All the Things - Patterns of Effective Data Stream Processing 2025-09-23 · 18:45 Adi Polak – Director of Advocacy and Developer Experience Engineering @ Confluent Data streaming is a really difficult problem. Despite 10+ years of attempting to simplify it, teams building real-time data pipelines can spend up to 80% of their time optimizing it or fixing downstream output by handling bad data at the lake. All we want is a service that will be reliable, handle all kinds of data, connect with all kinds of systems, be easy to manage, and scale up and down as our systems change. Oh, it should also have super low latency and result in good data. Is it too much to ask? In this presentation, you’ll learn the basics of data streaming and architecture patterns such as DLQ, used to tackle these challenges. We will then explore how to implement these patterns using Apache Flink and discuss the challenges that real-time AI applications bring to our infra. Difficult problems are difficult, and we offer no silver bullets. Still, we will share pragmatic solutions that have helped many organizations build fast, scalable, and manageable data streaming pipelines. flink data streaming

TBA 2025-09-23 · 18:45 Adi Polak – Director of Advocacy and Developer Experience Engineering @ Confluent Abstract: TBA	IN PERSON: Apache Kafka x Apache Flink
IN PERSON: Apache Kafka meets Apache Flink 2025-01-23 · 18:00 Join us for our very first meetup of 2025! You'll learn all about how to use Apache Kafka beyond the consumer protocol and get an introduction to Apache Flink. Date and Time: 🗓️ Thursday 23rd January, ⏰ 18:00 - 20:30 🕘 Venue: Confluent Europe Ltd, 262 High Holborn, London WC1V 7EE, United Kingdom Attending Brands: OSO, Streambased, Gravitee & Confluent Schedule: 18:00: Doors Open 18:00 - 18:30: Food, drinks, networking 18:30 - 19:00: "Accessing Kafka: beyond the consumer protocol" - Tom Scott (CEO Streambased) & Linus Hakansson (CPO Gravitee) 19:00 - 19:30: "Flink - demystifying data streaming" - Adi Polak (Director Developer Experience Engineering and Advocacy, Confluent) 19:30 - 20:30: Additional Q&A, Networking 🎙️ ~Talk 1~ Talk Title: Accessing Kafka: beyond the consumer protocol Summary: The role of Kafka is expanding and with it the use cases it addresses. This brings many cool new features but also highlights some drawbacks. The standard producer/consumer pattern that has served us so well for so many years is no longer a good fit for all the things that Kafka data is used for and it's time to look beyond. Join Linus (Gravitee) and Tom (Streambased) for an in-depth look at how you can interact with your Kafka clusters via REST, GraphQL, WebSockets, JDBC/ODBC and even as a simple filesystem. We'll outline the reasoning behind these new access patterns, the features that differentiate them (and the features that unite them) and show some live demos of the opportunities they create.
🗣️ Speaker 1: Tom Scott (CEO Streambased) is the founder of Streambased. Tom is building multi-tenant, on-prem and cloud Kafka services to attack common Kafka pain points and break down barriers to starting your data journey. 🗣️ Speaker 2: Linus Hakansson is the Chief Product Officer at Gravitee, building a next-generation management platform helping organizations secure, control and govern their Kafka and APIs. 🎙️ ~Talk 2~ Talk Title: Flink - demystifying data streaming Summary: In an era where data velocity and volume continue to grow, the ability to process and analyze data streams in real-time is pivotal for businesses aiming to optimize operations, enhance decision-making, and maintain competitive advantages. Apache Flink stands out as a comprehensive, open-source stream processing framework designed to meet these challenges head-on. In this session you will learn about data streaming through the lens of Apache Flink, offering insights into its architecture, capabilities, and how it seamlessly facilitates real-time data processing. Objectives: 1. Introduce Stream Processing: Provide a foundational understanding of stream processing - its importance, use cases, and when to use it in comparison to batch processing. 2. Explore Apache Flink: Deep dive into Apache Flink's architecture, key features, and its unique approach to handling stateful computations, event time processing, and ensuring fault tolerance at scale. 🗣️ Speaker 1: Adi Polak, Director Developer Experience Engineering and Advocacy, Confluent. Adi is an experienced software engineer and people manager. For most of her professional life, she dealt with data and machine learning for transactional and analytics workloads by building large-scale systems.
As a data practitioner, she developed software to solve real-world problems with Apache Spark, Kafka, HDFS, K8s, AWS, and Azure in high-throughput, high-scale production environments for companies like Akamai and Microsoft. Adi has taught Spark to thousands of students throughout the years and is the author of the successful book Scaling Machine Learning with Spark. When not thinking up new architecture, teaching new tech or pondering on a distributed systems challenge, you can find her at the local cultural scene.	IN PERSON: Apache Kafka meets Apache Flink
Kafka Streams in Action, Second Edition 2024-05-24 Bill Bejeck – author Everything you need to implement stream processing on Apache Kafka® using Kafka Streams and the ksqlDB event streaming database. Kafka Streams in Action, Second Edition guides you through setting up and maintaining your streaming processing with Kafka. Inside, you’ll find comprehensive coverage of not only Kafka Streams, but the entire toolbox you’ll need for effective streaming, from the components of the Kafka ecosystem, to Producer and Consumer clients, Connect, and Schema Registry. In Kafka Streams in Action, Second Edition you’ll learn how to: Design streaming applications in Kafka Streams with the KStream and the Processor API Integrate external systems with Kafka Connect Enforce data compatibility with Schema Registry Build applications that respond immediately to events in either Kafka Streams or ksqlDB Craft materialized views over streams with ksqlDB This totally revised new edition of Kafka Streams in Action has been expanded to cover more of the Kafka platform used for building event-based applications. You’ll also find full coverage of ksqlDB, an event streaming database that makes it a snap to create applications that respond immediately to events, such as real-time push and pull updates. About the Technology Enterprise applications need to handle thousands, even millions, of data events every day. With an intuitive API and flawless reliability, the lightweight Kafka Streams library has earned a spot at the center of these systems. Kafka Streams provides exactly the power and simplicity you need to manage real-time event processing or microservices messaging. About the Book Kafka Streams in Action, Second Edition teaches you how to create event streaming applications on the amazing Apache Kafka platform. This thoroughly revised new edition now covers a wider range of streaming architectures and includes data integration with Kafka Connect.
As you go, you’ll explore real-world examples that introduce components and brokers, schema management, and the other essentials. Along the way, you’ll pick up practical techniques for blending Kafka with Spring, low-level control of processors and state stores, storing event data with ksqlDB, and testing streaming applications. What's Inside Design efficient streaming applications Integrate external systems with Kafka Connect Enforce data compatibility with Schema Registry About the Reader For Java developers. No knowledge of Kafka or streaming applications required. About the Author Bill Bejeck is a Confluent engineer and a Kafka Streams contributor with over 15 years of software development experience. Bill is also a committer on the Apache Kafka® project. Quotes Comprehensive streaming data applications are only a few years away from becoming the reality, and this book is the guide the industry has been waiting for to move beyond the hype. - Adi Polak, Director, Developer Experience Engineering, Confluent Covers all the key aspects of building applications with Kafka Streams. Whether you are getting started with stream processing or have already built Kafka Streams applications, it is an essential resource. - Mickael Maison, Principal Software Engineer, Red Hat Serves as both a learning and a resource guide, offering a perfect blend of ‘how-to’ and ‘why-to.’ Even if you have been using Kafka Streams for many years, I highly recommend this book. - Neil Buesing, CTO & Co-founder, Kinetic Edge data data-engineering streaming-messaging Kafka API Java Data Streaming	O'Reilly Data Engineering Books
Event Databricks DATA + AI Summit 2023 2023-07-26
Data Biases and Generative AI: A Practitioner Discussion 2023-07-26 · 21:04 Gavita Regunath, Layla Yang, Christina Taylor, Adi Polak – VP of Developer Experience @ Treeverse Join this panel discussion that unpacks the technical challenges surrounding biases found in data, and poses potential solutions and strategies for the future including Generative AI. This session is a showcase highlighting diverse perspectives in the data and AI industry. Talk by: Adi Polak, Gavita Regunath, Christina Taylor, and Layla Yang Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc AI/ML Databricks GenAI	YouTube
Live from the Lakehouse: Ethics in AI with Adi Polak & gaining from open source with Vini Jaiswal 2023-07-14 · 23:38 Vini Jaiswal – Principal Developer Advocate @ ByteDance, Adi Polak – VP of Developer Experience @ Treeverse Hear from two guests. First, Adi Polak (VP of Developer Experience, Treeverse, and author of #1 new release - Scaling ML with Spark) on how AI helps us be more productive. Second guest, Vini Jaiswal (Principal Developer Advocate, ByteDance) on gaining with the open source community, overcoming scalability challenges, and taking innovation to the next stage. Hosted by Pearl Ubaru (Sr Technical Marketing Engineer, Databricks) AI/ML Data Lakehouse Databricks Marketing Spark	YouTube

Scaling Machine Learning with Spark 2023-03-09 Adi Polak – author Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals--allowing data and ML practitioners to collaborate and understand each other better. Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology. You will: Explore machine learning, including distributed computing concepts and terminology Manage the ML lifecycle with MLflow Ingest data and perform basic preprocessing with Spark Explore feature engineering, and use Spark to extract features Train a model with MLlib and build a pipeline to reproduce it Build a data system to combine the power of Spark with deep learning Get a step-by-step example of working with distributed TensorFlow Use PyTorch to scale machine learning and its internal architecture data data-engineering apache-spark AI/ML PyTorch Spark TensorFlow	O'Reilly Data Engineering Books
