talk-data.com

People (257 results)

See all 257 →

Scott Vetter

author

33 activities

Scott Taylor

Data Whisperer · MetaMeta Consulting

13 activities

Michael Scott

author

13 activities

Showing 16 results

Activities & events

Title & Speakers	Event
Why Kafka + Iceberg Will Define the Next Decade of Data Infrastructure 2025-09-23 · 19:45 Tom Scott – Founder & CEO @ Streambased Data leaders today face a familiar challenge: complex pipelines, duplicated systems, and spiraling infrastructure costs. Standardizing around Kafka for real-time and Iceberg for large-scale analytics has gone some way towards addressing this but still requires separate stacks, leaving teams to stitch them together at high expense and risk. This talk will explore how Kafka and Iceberg together form a new foundation for data infrastructure. One that unifies streaming and analytics into a single, cost-efficient layer. By standardizing on these open technologies, organizations can reduce data duplication, simplify governance, and unlock both instant insights and long-term value from the same platform. You will come away with a clear understanding of why this convergence is reshaping the industry, how it lowers operational risk, and advantages it offers for building durable, future-proof data capabilities. Kafka Iceberg	IN PERSON: Apache Kafka x Apache Flink
Why Kafka + Iceberg Will Define the Next Decade of Data Infrastructure 2025-09-23 · 19:45 Tom Scott – Founder & CEO @ Streambased Data leaders today face a familiar challenge: complex pipelines, duplicated systems, and spiraling infrastructure costs. Standardizing around Kafka for real-time and Iceberg for large-scale analytics has gone some way towards addressing this but still requires separate stacks, leaving teams to stitch them together at high expense and risk. This talk will explore how Kafka and Iceberg together form a new foundation for data infrastructure. One that unifies streaming and analytics into a single, cost-efficient layer. By standardizing on these open technologies, organizations can reduce data duplication, simplify governance, and unlock both instant insights and long-term value from the same platform. You will come away with a clear understanding of why this convergence is reshaping the industry, how it lowers operational risk, and advantages it offers for building durable, future-proof data capabilities. Kafka Iceberg	IN-PERSON: Apache Kafka® x Apache Flink® Meetup
Monitoring Kafka-Based Applications Using Distributed Tracing with OpenTelemetry 2025-09-23 · 19:15 Mehreen Tahir – Software Engineer @ New Relic By leveraging tools like Jaeger and New Relic, we will uncover how to gain a full view of your microservices, even in the face of Apache Kafka's asynchronous nature. Join us for a live demo with a simple Java Spring-Boot app, where we will walk through both automatic and manual instrumentation to capture rich telemetry. We will also touch on infrastructure-level observability, pulling metrics and traces from Apache Kafka brokers and Apache Flink. opentelemetry jaeger New Relic Kafka flink java spring-boot	IN PERSON: Apache Kafka x Apache Flink
Event IN-PERSON: Apache Kafka® x Apache Flink® Meetup 2025-09-23
Monitoring Kafka-Based Applications Using Distributed Tracing with OpenTelemetry 2025-09-23 · 19:15 Mehreen Tahir – Software Engineer @ New Relic By leveraging tools like Jaeger and New Relic, we’ll uncover how to gain a full view of your microservices, even in the face of Apache Kafka’s asynchronous nature. Join us for a live demo with a simple Java Spring-Boot app, where we’ll walk through both automatic and manual instrumentation to capture rich telemetry. We’ll also touch on infrastructure-level observability, pulling metrics and traces from Apache Kafka brokers and Apache Flink. opentelemetry jaeger New Relic Kafka spring boot
Stream All the Things - Patterns of Effective Data Stream Processing 2025-09-23 · 18:45 Adi Polak – Director of Advocacy and Developer Experience Engineering @ Confluent Data streaming is a really difficult problem. Despite 10+ years of attempting to simplify it, teams building real-time data pipelines can spend up to 80% of their time optimizing it or fixing downstream output by handling bad data at the lake. All we want is a service that will be reliable, handle all kinds of data, connect with all kinds of systems, be easy to manage, and scale up and down as our systems change. Oh, it should also have super low latency and result in good data. Is it too much to ask? In this presentation, you’ll learn the basics of data streaming and architecture patterns such as DLQ, used to tackle these challenges. We will then explore how to implement these patterns using Apache Flink and discuss the challenges that real-time AI applications bring to our infra. Difficult problems are difficult, and we offer no silver bullets. Still, we will share pragmatic solutions that have helped many organizations build fast, scalable, and manageable data streaming pipelines. flink data streaming

TBA 2025-09-23 · 18:45 Adi Polak – Director of Advocacy and Developer Experience Engineering @ Confluent Abstract: TBA	IN PERSON: Apache Kafka x Apache Flink
The Future of Data Mesh 2025-02-26 · 12:00 Tom DeWolf – Data mesh platform and platform engineering expert; innovation lead @ ACA Group , Michael Toland – Product Management Coach and Consultant @ Pathfinder Product S1 Ep#34: The Future of Data Mesh The Data Product Management In Action podcast, brought to you by executive producer Scott Hirleman, is a platform for data product management practitioners to share insights and experiences. In Season 01, Episode 34, Join host Michael Toland as he welcomes Tom DeWolf, a data mesh expert with a PhD in distributed systems and years of experience in software engineering. Tom shares insights from his four-year journey in data mesh, emphasizing the need for self-service in data products, the benefits of an evolutionary architecture, and the challenges of governance in multi-organization environments. He also discusses key lessons from past failures, highlighting the critical role of user engagement in building successful data ecosystems. Don't miss this deep dive into the future of data management! About our Host Michael Toland: Michael is a Product Management Coach and Consultant with Pathfinder Product, a Test Double Operation. He has worked in product officially since 2016, where he worked at Verizon on large scale system modernizations and migration initiatives for reference data and decision platforms. Outside of his professional career, Michael serves as the Treasurer for the New Leaders Council, mentors fellows with Venture for America, sings in the Columbus Symphony, writing satire posts for his blog Dignified Product or Test Double, depending on the topic, and is excited to be chatting with folks on Data Product Management. Connect with Michael on LinkedIn. About our guest Tom DeWolf: Tom is an experienced hands-on architect and serves as the innovation lead, spearheading new innovative initiatives for ACA Group. His expertise lies in data mesh platform and platform engineering, leveraging his background in software engineering and experience in designing various architectures, including software, microservices, data platforms, evolutionary architectures, among others. Tom is the founder and host of Data Mesh Belgium meetup and the new Data Mesh Live conference, and active Data Mesh community thought leader. Connect with Tom on LinkedIn. All views and opinions expressed are those of the individuals and do not necessarily reflect their employers or anyone else. Join the conversation on LinkedIn. Apply to be a guest or nominate someone that you know. Do you love what you're listening to? Please rate and review the podcast, and share it with fellow practitioners you know. Your support helps us reach more listeners and continue providing valuable insights! Data Management	Data Product Management in Action: The Practitioner's Podcast Listen
IN PERSON: Apache Kafka meets Apache Flink 2025-01-23 · 18:00 Join us for our very first meetup of 2025! You'll learn all about how to use Apache Kafka beyond the consumer protocol and get an introduction to Apache Flink. Date and Time: 🗓️ Thursday 23rd January, ⏰ 18:00 - 20:30 PM 🕘 Venue: Confluent Europe Ltd, 262 High Holborn, London WC1V 7EE, United Kingdom Attending Brands: OSO, Streambased, Gravitee & Confluent Schedule: 18:00: Doors Open 18:00 - 18:30: Food, drinks, networking 18:30 - 19:00: "Accessing Kafka: beyond the consumer protocol" - Tom Scott (CEO Streambased) & Linus Hakansson (CPO Gravitee) 19:00 - 19:30: “Flink - Adi Polak (Director Developer Experience Engineering and Advocacy, Confluent) 19:30- 20:30pm: Additional Q&A, Networking 🎙️ \~Talk 1\~ Talk Title: Accessing Kafka: beyond the consumer protocol Summary: The role of Kafka is expanding and with it the use cases it addresses. This brings many cool new features but also highlights some drawbacks. The standard producer/consumer pattern that has served us so well for so many years is no longer a good fit for all the things that Kafka data is used for and it's time to look beyond.Join Linus (Gravitee) and Tom (Streambased) for an in depth look at how you can interact with you Kafka clusters via REST, GraphQL, WebSockets, JDBC/ODBC and even as a simple filesystem.We'll outline the reasoning behind these new access patterns, the features that differentiate them (and the features that unite them) and show some live demos of the opportunities they create. 🗣️ Speaker 1: Tom Scott (CEO Streambased) is the founder of Streambased, Tom is building multi tenant, on-prem and cloud Kafka services to attack common Kafka pain points and break down barriers to starting your data journey.Linus Hakansson Linus Hakansson is the Chief Product Officer at Gravitee, building a next-generation management platform helping organizations secure, control and govern their Kafka and APIs 🗣️ Speaker 2: Linus Hakansson is the Chief Product Officer at Gravitee, building a next-generation management platform helping organizations secure, control and govern their Kafka and APIs 🎙️ \~Talk 2\~ Talk Title: Flink - demystifying data streaming Summary: In an era where data velocity and volume continue to grow, the ability to process and analyze data streams in real-time is pivotal for businesses aiming to optimize operations, enhance decision-making, and maintain competitive advantages. Apache Flink stands out as a comprehensive, open-source stream processing framework designed to meet these challenges head-on. In this session you will learn about data streaming through the lens of Apache Flink, offering insights into its architecture, capabilities, and how it seamlessly facilitates real-time data processing.Objectives: 1. Introduce Stream Processing: Provide a foundational understanding of stream processing - its importance\, use cases\, and when to use in comparison to batch processing. 2. Explore Apache Flink: Deep dive into Apache Flink's architecture\, key features\, and its unique approach to handling stateful computations\, event time processing\, and ensuring fault tolerance at scale. 🗣️ Speaker 1: Adi Polak, Director Developer Experience Engineering and Advocacy, Confluent. Adi is an experienced software engineer and people manager. For most of her professional life, she dealt with data and machine learning for transactional and analytics workloads by building large-scale systems. As a data practitioner, she developed software to solve real-world problems with Apache Spark, Kafka, HDFS, K8s, AWS, and Azure in high-throughput, high-scale production environments for companies like Akamai and Microsoft.Adi has taught Spark to thousands of students throughout the years and is the author of the successful book — Scaling Machine Learning with Spark. When not thinking up new architecture, teaching new tech or pondering on a distributed systems challenge, you can find her at the local cultural scene.	IN PERSON: Apache Kafka meets Apache Flink
AI and Deep Learning for Enterprise #18 2024-08-13 · 17:30 Join us at the Daemon Clubhouse on August 13th for an evening of talks, food, and conversation with ML and AI industry pros. Please note you will be unable to enter the venue before 6.30pm. RSVPs will close 24 hours before the event, you may be unable to register after this time but you can still watch online. If you can't join us in person you can watch remotely via our YouTube channel. Agenda 06:30pm - Doors open, food and drink served 07:00pm - Welcome 07:05pm - A short talk from our hosts Daemon 07:10pm - James Jackson, Associate Director at S-RM, "Using LLMs to make complex assessments without knowledge loss" 07:50pm - Break 08:00 - Tom Scott, Founder and CEO at Streambased, "Enhancing AI with real time data: unlocking contextual insights" 08:40 - Pearl Prakash, Comcast, "Building trip itinerary planner with RAG" 09:00pm - Wrap up, drinks at The Bear Our hosts require that we provide a list of all attendees, please ensure that you register with a name that matches your government issued ID or bank card: if you do not we cannot guarantee you entry to the building. Please RSVP for the event well in advance if you plan to attend in person and unRSVP if you can no longer attend as limited spaces are available.	AI and Deep Learning for Enterprise #18
Apache Kafka meets Apache Druid 2024-07-23 · 17:00 Join us for this special summer event with the opportunity to learn about how Kafka fits into the analytics space. Please note there are only 70 spaces and entry will be strictly limited to first registered. Date and Time: 🗓️ Tuesday 23rd July, ⏰ 18:00 - 20:00 🕘 Venue: Confluent Europe Ltd, 262 High Holborn Floor 1 London WC1V 7EJ · London Attending Brands: Imply, Streambased, Confluent & OSO Networking: Pizza + Beer at the end Schedule: 18:00: Doors Open, Food, drinks, networking 18:30 - 18:35: Welcome and introductions 18:35 - 19:00: “Kafka - No Rocks, Please”, Roman Kolesnev, Confluent 19:05 - 19:35pm: “Real-time plane-spotting”, Peter Marshall and Hugh Evans, Imply 19:35 - 20:00:“Indexing Kafka - Why point look ups matter and how we achieve them”, Tom Scott, Streambased 🎙️ \~Talk 1\~ Talk Title: Kafka - No Rocks, Please! Summary: Using Kafka Streams With Alternative State StoresA popular data transformation library in the Apache Kafka® world - Kafka Streams - is bundled with a high-performance key/value store - RocksDB - with many advantages regarding near real-time, high throughput, and low latency processing. But what if RocksDB key/value store semantics are just not enough? What if the use case requires features of other storage engines - document indexing, graph processing, or simply secondary indexes? Can we employ other embeddable storage engines or extend RocksDB-based state stores?Luckily, Kafka Streams supports custom state store implementations through a pluggable interface. Even better - we can delegate durability, data sharding, and failure recovery to Kafka Streams runtime. Of course, there are some associated restrictions as well. That, and more, is what we will explore in this session. 🗣️ Speaker: Roman Kolesnev is a Staff Customer Innovation Engineer at Confluent in the Customer Solutions & Innovation Division Labs team. His experience includes building business critical event streaming applications and distributed systems in the financial and technology sectors. 🎙️ \~Talk 2\~ Talk Title: Real-time plane-spotting Summary: Hidden from our eyes, aircraft in our skies are constantly transmitting data. Join us as we use some simple tech and the power of open source to fly through this data set. In this talk, see a Raspberry Pi, Apache Kafka, Apache Druid, and Grafana coming together for real-time data production, transport, OLAP, and interactive visualisation. 🗣️ Speaker 1: Peter Marshall is an award-winning speaker who leads developer relations at Imply, having worked with adopters and users of Apache Druid for five years. Peter has worked in enterprise architecture in both public and private sector for over 20 years. He has a BA in Theology and Computer Studies from the University of Birmingham. 🗣️ Speaker 2: Hugh Evans is a Developer Advocate at Imply, helping members of the Apache Druid community and bringing them together. 🎙️ \~Talk 3\~ Talk Title: Indexing Kafka - Why point look ups matter and how we achieve them Summary: Have you ever wanted to know the count of messages on a topic? How about the count of messages with a particular property? Ever spent long hours searching for that poison pill message that killed your applications? All of these cases require looking at historical data in Kafka, something it is traditionally terrible at. Join us in this talk as we explore historical Kafka data interactively and discuss the technologies that can make this possible. 🗣️ Speaker: Tom Scott is the founder of Streambased, Tom is building multi tenant, on-prem and cloud Kafka services to attack common Kafka pain points and break down barriers to starting your data journey.	Apache Kafka meets Apache Druid
Event AI and Deep Learning for Enterprise #14 2024-03-13
Real-world AI: a survey of practitioners 2024-03-13 · 18:30 michael natusch – ex Chief Science Officer @ Prudential Real-world AI: a survey of practitioners. AI/ML
Building models on real-time data with Jupyter and Streambased 2024-03-13 · 18:30 Tom Scott – Founder & CEO @ Streambased Discussion on building models on real-time data using Jupyter notebooks and Streambased. jupyter streambased
Introduction to Geospatial Information Systems 2024-03-13 · 18:30 ramani lachyan – Data Scientist @ Datasparq Overview of geospatial information systems concepts and applications.

IN-PERSON: Kafka Enterprise Strategies: Best Practices 2023-06-26 · 16:45 🔥 Greetings to all Kafka enthusiasts! Following the tremendous achievement of the previous Kafka meet-up in April, we are delighted to declare the forthcoming edition. OSO are teaming up with Sainsbury's, special sponsor for this event, to deliver an exciting program of talks that you won't want to miss! Since this event will take place in a Sainsbury's office, for security purposes, you must register in your ID name order to gain entry. Register from here (scroll down for registry in the event page). When: ⏰ Monday 26th June 2023 5.45 - 8:30 pm Where:📍33 Holborn, London EC4A 1AA Nearest Tube Stations: 🚇 Central Line: St. Pauls & Chancery Lane, Elizabeth Line: Farringdon 🎙️ \~Talk 1\~ Talk Title: Kafka enterprise strategy to evolve with demand Summary: In this talk we are going to navigate through the transformation journey of the Supply Chain BU adopting Kafka with microservices in an event-driven architecture to deliver better value to customers and have a look at the current presence of Kafka in Sainsbury’s. We will go over insights of early days, lessons learnt and successful governance patterns which made the onboarding smoother as time went. We will also review the different tooling involved in making this journey successful not only for business but for developers, along with leveraging Kafka community driven projects such as Kafka Connect. We will conclude on where the Kafka landscape and architecture is headed. Key talk takeaways: Plan for growth with a strong governance Give the tools to developers to be self-sufficient to reduce frictions Leverage existing solutions Breaking down silos to improve business adoption to deliver value faster with consistency Embrace innovative solutions and make the leap towards the future 🗣️ Speaker: John Mille (GitHub @johnpreston) – Principal Cloud Engineer at Sainsbury’s Coming from a Net/Sys admin background, John has been working with AWS (Amazon Web Services) and distributed systems since before he graduated. Introduced to Kafka on the job, he builds Open-Source tooling to help with Kafka Governance automation and enjoys occasional blogging, to share his experiences and help others. 🎙️ \~Talk 2\~ Talk Title: Enforcing Kafka Best Practices Summary: Kafka's flexibility and configurability have contributed to its widespread adoption, but they also pose challenges. The ability to configure applications client-side can lead to inefficiencies and performance issues within the Kafka cluster. Conduktor has encountered numerous instances of such issues reported by their customers, including improperly configured producers, consumer groups with excessive members, and mixed data and schema in topics unintentionally. These issues often go unnoticed by the client application but significantly impact the performance of the Kafka cluster. In this session, join us on a detective hunt as we explore the tools and metrics used to identify misconfigurations in clients. We will discuss strategies for addressing these issues once discovered and provide insights on preventing future occurrences. 🗣️ Speaker: Tom Scott: Principal Engineer, Conduktor Long time enthusiast of Kafka and all things data integration, Tom has more than 10yrs experience (5yrs+ Kafka) in innovative and efficient ways to store, query and move data. Currently working at Conduktor, Tom is building multi-tenant, on-prem and cloud Kafka services to attack common Kafka pain points and break down barriers to starting your data journey. 🗓️ Schedule: The full schedule for the evening is as follows: 17:45 - Doors open** (please arrive on time) 18:00 - Kick-Off 18:10 - Kafka enterprise strategy to evolve with demand + Q&A 18:50 - Break 19:00 - Enforcing Kafka Best Practices + Q&A 19:40 - Food, drinks and networking Catering will be provided to all attendees following the talks. Please inform us of any accessibility or dietary needs in advance. If you have any enquiries contact [email protected] Looking forward to seeing you there! *DISCLAIMER** BY ATTENDING THIS EVENT IN PERSON, you acknowledge that risk includes possible exposure to and illness from infectious diseases including COVID-19, and accept responsibility for this, if it occurs.	IN-PERSON: Kafka Enterprise Strategies: Best Practices
#040: Google BigQuery with Michael Healy 2016-07-05 · 04:30 Michael Healy – data scientist @ Search Discovery , Michael Helbling – host , Tim Wilson – host @ Analytics Power Hour - Columbus (OH In this episode, we dive deep on a 1988 classic: Tom Hanks, under the direction of Penny Marshall, was a 12-year-old in a 30-year-old's body... Actually, that's a different "Big" from what we actually cover in this episode. In this instant classic, the star is BigQuery, the director is Google, and Michael Healy, a data scientist from Search Discovery, delivers an Oscar-worthy performance as Zoltar. In under 48 minutes, Michael (Helbling) and Tim drastically increased their understanding of what Google BigQuery is and where it fits in the analytics landscape. If you'd like to do the same, give it a listen! Technologies, books, and sites referenced in this episode were many, including: Google BigQuery and the BigQuery API Libraries, Google Cloud Services, Google Dremel, Apache Drill, Amazon Redshift (AWS), Rambo III (another 1988 movie!), Hadoop, Cloudera, the Observepoint Tag Debugger, Our Mathematical Universe by Max Tegmark, A Brief History of Time by Stephen Hawking, and a video of math savant Scott Flansburg. Analytics API AWS BigQuery Cloud Computing GCP Hadoop Redshift	The Analytics Power Hour Listen
Oracle Database 12c New Features 2013-12-13 Robert Freeman – author Maximize the New and Improved Features of Oracle Database 12 c Written by Master Principle Database Expert, Oracle, and Oracle ACE Robert G. Freeman, this Oracle Press guide describes the myriad new and enhanced capabilities available in the latest Oracle Database release. Inside, you’ll find everything you need to know to get up and running quickly on Oracle Database 12 c. Supported by running commentary from world-renowned Oracle expert Tom Kyte, and with additional contributions by Oracle experts Eric Yen and Scott Black, Oracle Database 12c New Features offers detailed coverage of: Installing Oracle Database 12 c Architectural changes, such as Oracle Multitenant The most current information on upgrading and migrating to Oracle Database 12 c The pre-upgrade information tool and parallel processing for database upgrades Oracle Real Application Clusters new features, such as Oracle Flex Cluster, Oracle Flex Automatic Storage Management, and Oracle Automatic Storage Management Cluster File System Oracle RMAN enhancements, including cross-platform backup and recovery Oracle Data Guard improvements, such as Fast Sync, and Oracle Active Data Guard new features, such as Far Sync SQL, PL/SQL, DML, and DDL new features Improvements to partitioning manageability, performance, and availability Advanced business intelligence and data warehousing capabilities Security enhancements, including privileges analysis, data redaction, and new administrative-level privileges Manageability, performance, and optimization improvements data data-engineering oracle-database-solutions BI DWH Oracle Cyber Security SQL	O'Reilly Data Engineering Books

Showing 16 results