talk-data.com



Activities & events


In the world of AI voice agents, especially in sensitive contexts like healthcare, audio clarity is everything. Background noise—a barking dog, a TV, street sounds—degrades transcription accuracy, leading to slower, clunkier, and less reliable AI responses. But how do you solve this in real-time without breaking the bank?

This talk chronicles our journey at a health-tech startup to ship background noise filtration at scale. We'll start with the core principles of noise reduction and our initial experiments with open-source models, then dive deep into the engineering architecture required to scale a compute-hungry ML service using Python and Kubernetes. You'll learn about the practical, operational considerations of deploying third-party models and, most importantly, how to measure their true impact on the product.
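
As a rough illustration of the kind of pipeline the talk describes (the speakers do not name their model or stack here), below is a minimal Python sketch that denoises audio in fixed-size chunks before handing them to a transcription service. The open-source noisereduce package is used as a stand-in for whatever model the team actually shipped; the file name, chunk size, and mono input are assumptions.

```python
# Hedged sketch, not the speakers' actual pipeline: denoise audio windows
# before sending them to speech-to-text. Assumes mono float32 audio.
import soundfile as sf
import noisereduce as nr

CHUNK_SECONDS = 1.0  # hypothetical streaming window


def denoise_stream(path: str):
    audio, sr = sf.read(path, dtype="float32")
    chunk = int(sr * CHUNK_SECONDS)
    for start in range(0, len(audio), chunk):
        window = audio[start : start + chunk]
        if len(window) < sr // 10:
            continue  # skip a tiny trailing window
        # Spectral-gating noise reduction per window; a production system
        # would keep context across windows and run this off the hot path.
        yield nr.reduce_noise(y=window, sr=sr)


if __name__ == "__main__":
    for clean_chunk in denoise_stream("sample_call.wav"):  # hypothetical file
        print(clean_chunk.shape)  # would be forwarded to the transcriber
```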

AI/ML Kubernetes Python
PyData Seattle 2025

A hands-on introduction to serving machine learning models in production - Alexey Grigorev

This is the third workshop in our ML series on model deployment. Building on the FastAPI service created in Part 1, we'll show how to deploy that service using Kubernetes, the industry standard for managing containerized applications in production. Led by Alexey Grigorev, this workshop focuses on infrastructure, orchestration, and scaling.

What You'll Learn

  • How to containerize a model and preprocessing step as microservices (see the sketch after this list)
  • How to use Docker Compose to test your setup locally
  • How to deploy your services to Kubernetes
  • How to connect everything together into a working ML system
  • The basics of using EKS (Elastic Kubernetes Service) for managed deployments
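
As a rough sketch of the first item above, here is a minimal FastAPI prediction service of the kind built in Part 1 of the series, which could then be containerized and placed behind a Kubernetes Service. The model file, feature names, and port are hypothetical, not the workshop's exact code.

```python
# Minimal sketch, assuming a scikit-learn style (DictVectorizer, model) pair
# pickled as "model.bin" -- names and fields are illustrative only.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

with open("model.bin", "rb") as f_in:
    # In the workshop the preprocessing step can live in its own microservice;
    # here it is bundled with the model for brevity.
    dv, model = pickle.load(f_in)


class Customer(BaseModel):
    # Hypothetical feature names for illustration.
    tenure: int
    monthlycharges: float
    contract: str


@app.post("/predict")
def predict(customer: Customer):
    X = dv.transform([customer.dict()])
    prob = float(model.predict_proba(X)[0, 1])  # assumes a binary classifier
    return {"probability": prob, "decision": prob >= 0.5}

# Run locally with: uvicorn predict:app --host 0.0.0.0 --port 9696
# (module name and port are assumptions)
```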

It will be a live demo with practical tips and a chance to ask your questions. This workshop gives you a real feel for how ML models are deployed in real-world environments.

Thinking About ML Zoomcamp?

This workshop reflects the updated content of Module 5 in the ML Zoomcamp, giving you a taste of modern ML deployment practices you'll explore in the course.

ML Zoomcamp is our free 4-month course that takes you from beginner to advanced ML engineer. It covers the fundamentals of ML, from regression and classification to deployment and deep learning. The new cohort of the ML Zoomcamp starts on September 15, 2025. You can join it by registering here.

About the Speaker

Alexey Grigorev is the Founder of DataTalks.Club and creator of the Zoomcamp series. Alexey is a seasoned software and ML engineer with over 10 years in engineering and 6+ years in machine learning. He has deployed large-scale ML systems at companies like OLX Group and Simplaex, authored several technical books including Machine Learning Bookcamp, and is a Kaggle Master with a 1st place finish in the NIPS'17 Criteo Challenge.

Join our slack: https://datatalks.club/slack.html

Deploying ML Models with Kubernetes
Brijesh Tripathi – CEO @ Flex AI, Tobias Macey – host

Summary

In this crossover episode of the AI Engineering Podcast, host Tobias Macey interviews Brijesh Tripathi, CEO of Flex AI, about revolutionizing AI engineering by removing DevOps burdens through "workload as a service". Brijesh shares his expertise from leading AI/HPC architecture at Intel and deploying supercomputers like Aurora, highlighting how access friction and idle infrastructure slow progress. Join them as they discuss Flex AI's innovative approach to simplifying heterogeneous compute, standardizing on consistent Kubernetes layers, and abstracting inference across various accelerators, allowing teams to iterate faster without wrestling with drivers, libraries, or cloud-by-cloud differences. Brijesh also shares insights into Flex AI's strategies for lifting utilization, protecting real-time workloads, and spanning the full lifecycle from fine-tuning to autoscaled inference, all while keeping complexity at bay.

Pre-amble

I hope you enjoy this crossover episode of the AI Engineering Podcast, another show that I run to act as your guide to the fast-moving world of building scalable and maintainable AI systems. As generative AI models have grown more powerful and are being applied to a broader range of use cases, the lines between data and AI engineering are becoming increasingly blurry. The responsibilities of data teams are being extended into the realm of context engineering, as well as designing and supporting new infrastructure elements that serve the needs of agentic applications. This episode is an example of the types of work that are not easily categorized into one or the other camp.

Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management.
  • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
  • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
  • Your host is Tobias Macey and today I'm interviewing Brijesh Tripathi about FlexAI, a platform offering a service-oriented abstraction for AI workloads.

Interview

  • Introduction
  • How did you get involved in machine learning?
  • Can you describe what FlexAI is and the story behind it?
  • What are some examples of the ways that infrastructure challenges contribute to friction in developing and operating AI applications?
  • How do those challenges contribute to issues when scaling new applications/businesses that are founded on AI?
  • There are numerous managed services and deployable operational elements for operationalizing AI systems. What are some of the main pitfalls that teams need to be aware of when determining how much of that infrastructure to own themselves?
  • Orchestration is a key element of managing the data and model lifecycles of these applications. How does your approach of "workload as a service" help to mitigate some of the complexities in the overall maintenance of that workload?
  • Can you describe the design and architecture of the FlexAI platform?
  • How has the implementation evolved from when you first started working on it?
  • For someone who is going to build on top of FlexAI, what are the primary interfaces and concepts that they need to be aware of?
  • Can you describe the workflow of going from problem to deployment for an AI workload using FlexAI?
  • One of the perennial challenges of making a well-integrated platform is that there are inevitably pre-existing workloads that don't map cleanly onto the assumptions of the vendor. What are the affordances and escape hatches that you have built in to allow partial/incremental adoption of your service?
  • What are the elements of AI workloads and applications that you are explicitly not trying to solve for?
  • What are the most interesting, innovative, or unexpected ways that you have seen FlexAI used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on FlexAI?
  • When is FlexAI the wrong choice?
  • What do you have planned for the future of FlexAI?

Contact Info

  • LinkedIn

Parting Question

  • From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links

  • Flex AI
  • Aurora Super Computer
  • CoreWeave
  • Kubernetes
  • CUDA
  • ROCm
  • Tensor Processing Unit (TPU)
  • PyTorch
  • Triton
  • Trainium
  • ASIC == Application Specific Integrated Circuit
  • SOC == System On a Chip
  • Loveable
  • FlexAI Blueprints
  • Tenstorrent

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

AI/ML Aurora Cloud Computing Data Engineering Datafold DevOps ETL/ELT GenAI Kubernetes Prefect Data Streaming
Data Engineering Podcast

A low-cost, auto-scaling approach to serving ML models - Alexey Grigorev

This is the second workshop in our ML series on model deployment.

In this hands-on session, you'll learn how to deploy machine learning models using serverless infrastructure, specifically AWS Lambda. Serverless deployment is a great alternative to managing your own infrastructure, ideal for lightweight models or infrequent inference.

This session walks you through packaging your model, creating a Lambda function, and exposing it via API Gateway.

What You'll Learn

  • What serverless means and when to use it
  • How AWS Lambda works for ML model serving (see the sketch after this list)
  • How to prepare your model for deployment with TensorFlow Lite
  • How to package your code and dependencies in a Docker image
  • How to expose your Lambda function using API Gateway
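
For orientation, here is a minimal sketch of the Lambda-side handler such a deployment might use. The model file, input format, and API Gateway event shape are assumptions for illustration, not the workshop's exact code.

```python
# Hedged sketch: an AWS Lambda handler that runs a TensorFlow Lite model.
# Assumes an API Gateway proxy event whose JSON body carries a pre-resized
# image as a nested list of floats ("pixels"); model file name is hypothetical.
import json

import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="model.tflite")  # loaded once per container
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]


def lambda_handler(event, context):
    body = json.loads(event["body"])
    X = np.array(body["pixels"], dtype="float32")[None, ...]  # add batch dim
    interpreter.set_tensor(input_index, X)
    interpreter.invoke()
    preds = interpreter.get_tensor(output_index)[0].tolist()
    return {"statusCode": 200, "body": json.dumps({"predictions": preds})}
```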

It will be a live demo with practical tips and a chance to ask your questions. This workshop gives you a real feel for how ML models are deployed in real-world environments.

Thinking About ML Zoomcamp?

This workshop reflects topics from Module 9 in ML Zoomcamp, our free 4-month course that takes you from beginner to advanced ML engineer. It covers the fundamentals of ML, from regression and classification to deployment and deep learning.

The new cohort of the ML Zoomcamp starts on September 15, 2025. You can join it by registering here.

About the Speaker

Alexey Grigorev is the Founder of DataTalks.Club and creator of the Zoomcamp series.

Alexey is a seasoned software and ML engineer with over 10 years in engineering and 6+ years in machine learning. He has deployed large-scale ML systems at companies like OLX Group and Simplaex, authored several technical books including Machine Learning Bookcamp, and is a Kaggle Master with a 1st place finish in the NIPS'17 Criteo Challenge.

Join our slack: https://datatalks.club/slack.html

Deploying ML Models with AWS Lambda (Serverless)

Mark your calendars for PulumiUP 2024! Attend to learn about the latest trends, best practices, and lessons learned in Platform Engineering / DevOps, cloud ecosystems, AI/ML, and cloud culture from industry leaders and community members.

This event will be streamed live. You will be able to interact with the speakers and the Pulumi team and ask questions. You will also have the opportunity to win some exclusive PulumiUP prizes.

The event recording will be immediately available on-demand to accommodate all time zones.

Event Highlights:

  • Keynotes and Product Demos: Gain insights from Pulumi Co-Founders during the opening keynote and discover the new products
  • Top Tracks: Learn trends, best practices, and lessons learned in Platform Engineering / DevOps, cloud and Infrastructure as Code (IaC), and AI/ML
  • Panel Discussions: Engage with industry leaders who will discuss "Secrets and Policies - Automating Cybersecurity," "Infrastructure as Code - Can We Do Better?" and "AI for Cloud Development."

Themes:

  • Platform Engineering & DevOps: Dive into essential topics such as streamlined CI/CD practices, deployment with GitOps, automating infrastructure operations, optimizing self-service workflows, IDPs on Kubernetes, security and governance with Policies-as-code.
  • Cloud and IaC: Explore the world of cloud computing, including infrastructure as code, scalable architectures, cloud service integration, and technologies like AWS, Azure, Google Cloud, Kubernetes, Docker, and serverless solutions (see the sketch after this list).
  • AI / ML: Dive into AI and ML best practices, tools, and techniques for implementing algorithms, deploying models, and optimizing workflows.
  • Cloud Culture: Focus on the human side of the tech industry, exploring how team building, developer experience (DevEx), and collaboration can enhance productivity, innovation, and personal and business success.
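
To make the Cloud and IaC theme concrete, here is a minimal Pulumi program in Python. It is an illustrative sketch using an assumed AWS S3 example, not an official PulumiUP demo.

```python
# Hedged sketch of infrastructure as code with Pulumi's Python SDK:
# declare an S3 bucket and export its name. Resource name and tags are
# placeholders chosen for illustration.
import pulumi
from pulumi_aws import s3

# Pulumi tracks this resource in state and creates/updates it on `pulumi up`.
bucket = s3.Bucket(
    "pulumiup-demo-bucket",
    tags={"event": "PulumiUP 2024"},
)

pulumi.export("bucket_name", bucket.id)
```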

What is PulumiUP? PulumiUP is a one-day virtual event bringing together engineers and developers of all expertise levels. Expect new product releases, special guest speakers, tech talks, hands-on workshops, and more!

Register now! See you there!

PulumiUP Conference 2024 | Platform Engineering, Cloud and IaC, and AI


Global Azure 2024 is here! Communities around the world are organizing localized hybrid events and live streams for everyone to join and learn about Azure from best-in-class community leaders.

Agenda for London

10:15 AM – Azure AI and Terraform - Exploring our Options by Jake Walsh & Nicholas Chang A look into the options for using Terraform on Azure to work with Azure AI services and models, covering an overview of Azure Terraform, deployment options, and the types of Azure AI resources we can manage.

10:45 AM – Unlocking the Potential of Azure Blob Storage: A Guide to the lesser-known Features by John Kilmister In this talk, we will delve into the world of Azure Blob Storage and discover some of its lesser-known features and capabilities. We will explore the ways in which these features can enhance your use of Azure storage solutions. Whether you are a seasoned Azure user or new to the platform, this talk will provide valuable insights and actionable strategies for optimizing your use of Azure Blob Storage.
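
As a hedged illustration of the kind of lesser-known capabilities such a talk might touch on (the abstract does not list them), here is a short azure-storage-blob sketch that changes a blob's access tier and takes a point-in-time snapshot. The connection string, container, and blob names are placeholders.

```python
# Illustrative sketch only -- not the talk's demo. Assumes a storage account
# connection string and an existing blob.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client(container="reports", blob="2024/april.csv")

# Demote infrequently read data to the Cool access tier to cut storage costs.
blob.set_standard_blob_tier("Cool")

# Take a point-in-time snapshot of the blob before it gets overwritten.
snapshot = blob.create_snapshot()
print(snapshot["snapshot"])  # snapshot timestamp returned by the service
```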

11:30 AM – Azure Verified Modules by Marcel Lupo In this session we take a closer look at (AVM) Azure Verified Modules. With AVM the aim is to provide a trusted platform for organisations to find and deploy solutions that can be easily integrate with their Azure-based services and infrastructure. They can span a wide range of solutions, using popular infrastructure as code (IaC) tools such as Bicep and Terraform.

12:00 PM – Unlocking the Power of Azure OpenAI: A Comprehensive Exploration and Demonstration by Alpa Buddhabhatti During this concise yet comprehensive session, we'll immerse ourselves in the realm of Azure OpenAI. We will delve into the fundamental principles, capabilities, and practical use cases of this cutting-edge AI technology. Whether you're new to Azure OpenAI or seeking a quick refresher, this session will serve as your gateway to unlock its full potential. The demonstration will focus on the following technologies: 1. Azure OpenAI, 2. Azure Data Factory, 3. Microsoft Fabric, 4. Azure SQL / Lakehouse, 5. Visual Studio Code. You'll gain valuable insights to effectively grasp and employ Azure OpenAI, paving the way for further exploration and integration into your projects.
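
For context, a minimal sketch of calling an Azure OpenAI chat deployment from Python is shown below; the endpoint, key, API version, and deployment name are placeholders, and this is an illustration rather than the session's demo.

```python
# Hedged sketch: chat completion against an Azure OpenAI deployment using the
# openai SDK's AzureOpenAI client. Environment variable names, the API
# version, and the deployment name are assumptions.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # the *deployment* name created in the Azure portal
    messages=[
        {"role": "system", "content": "You are a helpful data assistant."},
        {"role": "user", "content": "Summarise yesterday's pipeline failures."},
    ],
)
print(response.choices[0].message.content)
```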

12:30 PM – Microsoft Azure Service Bus (MaaS) Integration Simplified by Dias Manjaly The purpose of this foundation session is to demonstrate the usage of Azure Service Bus (MaaS). The session walks through the story of a bad customer experience and how it was resolved using Azure Service Bus, with a short demo based on that story. It also covers notable features of Azure Service Bus, high-level solution design diagrams, how messages are sent and received using Azure Service Bus and an Azure Function, how to use Azure Service Bus Explorer, Service Bus quotas, Dynamics 365 CRM integration with Azure Service Bus, and a comparison with other Azure messaging services. No prior knowledge is required to attend this foundation session. Attendees will come away with a clear understanding of Azure Service Bus, a useful skill given how common the use case is.
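
As a hedged sketch of the send/receive flow described above (not the session's demo code), here is a minimal azure-servicebus example; the queue name and connection string are placeholders, and the consumer is shown as a plain receiver loop where the session uses an Azure Function trigger.

```python
# Illustrative sketch of queue-based messaging with the azure-servicebus SDK.
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONN_STR = "<service-bus-connection-string>"  # placeholder
QUEUE = "orders"  # placeholder queue name

# Producer: enqueue a message instead of calling the downstream system directly.
with ServiceBusClient.from_connection_string(CONN_STR) as client:
    with client.get_queue_sender(queue_name=QUEUE) as sender:
        sender.send_messages(ServiceBusMessage('{"order_id": 42}'))

# Consumer: in the talk this role is played by an Azure Function; a plain
# receiver loop is shown here for illustration.
with ServiceBusClient.from_connection_string(CONN_STR) as client:
    with client.get_queue_receiver(queue_name=QUEUE, max_wait_time=5) as receiver:
        for msg in receiver:
            print(str(msg))
            receiver.complete_message(msg)  # remove the message from the queue
```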

1:15 PM – Code, Serverless Containers, and Cloud: Azure Container Apps for Developers by Jonah Andersson Do you want to build and run serverless containers on a cloud platform such as Microsoft Azure? In this session, you will learn the key technical concepts behind building and developing Azure containers. The session explores serverless containers with Azure Container Apps (ACA) and its most useful features for cloud development of microservices. Even if you are new to containers, you will learn how to get started developing and building your code with ACA and how to leverage the power of Azure with it.

1:45 PM – .NET on tiny IoT Meadow Boards by Clifford Agius .NET has been around for 20 years and was once the preserve of Windows; in recent years it has moved to mobile using Mono and Xamarin, but still on big, powerful systems and processors. Thanks to the work of WildernessLabs, however, there is now the Meadow F7 board, a small form factor IoT board based around the Adafruit Feather, which means you can write your .NET C#/F# code and truly run it anywhere. The idea of this talk is to show that your existing .NET skills can be used on IoT platforms without the scary Arduino flavour of C; you really do already have the skills to write code that will run anywhere. I am just a .NET dev like you; I don't work for WildernessLabs and this is not a sales talk about the Meadow system - I backed the Kickstarter and I simply enjoy playing with IoT. I want to show you that it's not scary and that you too have the skills to dive in, get that LED blinking, and after that automate your home. We'll discuss: the process of setting up the Meadow board and getting that first Hello World blinky light going; a brief explanation of the board and tooling; and a demo of a more complex system where the battery-powered board measures sensor values and reports them to an Azure Function for processing.

2:15 PM – "Would YOU Survive the Titanic?", with ML and .NET by Simon Painter Have you ever wondered whether YOU would survive the sinking of RMS Titanic or not? In this talk, we'll be using Visual Studio, C# and ML.NET to find out. Machine Learning is a hot topic these days. Businesses around the world are keen to utilise it, but doesn't that mean learning Python? In fact - no! These days it's not only possible to do ML in Visual Studio, it's actually easy to produce high quality results with very little effort. Be warned though, there are icebergs ahead...

3:00 PM – Umbraco Cloud's Journey with Kubernetes, Terraform & Azure DevOps by Dan Lister Dive into the inner workings of Umbraco Cloud's platform transformation in this talk. See our hands-on approach to deploying and managing cloud infrastructure using Terraform. See how we use rapid release cycles with Azure DevOps, minimising deployment times while maintaining quality and security. Explore our journey leveraging Azure Kubernetes Service for global scalability and high availability. Witness the integration of Azure API Management in streamlining interactions between our services, enforcing policies, and ensuring a secure, unified API gateway. Gain firsthand insights into how Umbraco Cloud's Platform Team innovatively wields these tools, offering practical knowledge for your own cloud environments.

3:30 PM – Azure Storage Has Never Been So Cool! by Anthony Mashford This session is about Azure NetApp Files, a service that delivers high-performance, low-latency file storage on demand. We will cover the features of the service, such as snapshots, backups, replication, and the newly added cool-access tiering feature, a great option to help drive down costs for Azure file storage whilst maintaining the strong functional capabilities of Azure NetApp Files. The session will also include a live demonstration of the service, its features, and Infrastructure-as-Code deployments.

Global Azure Days London 2024
Simba Khadder – guest @ StreamSQL, Tobias Macey – host

Summary

Machine learning is a process driven by iteration and experimentation which requires fast and easy access to relevant features of the data being processed. In order to reduce friction in the process of developing and delivering models there has been a recent trend toward building a dedicated feature store. In this episode Simba Khadder discusses his work at StreamSQL building a feature store to make creation, discovery, and monitoring of features fast and easy to manage. He describes the architecture of the system, the benefits of streaming data for machine learning, and how a feature store provides a useful interface between data engineers and machine learning engineers to reduce communication overhead.
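
To make the feature-store idea concrete, here is a toy Python sketch of a single feature definition reused for both stream ingestion and low-latency online serving. It illustrates the concept only and is not StreamSQL's actual API; all names are invented.

```python
# Toy illustration of the feature-store concept discussed in the episode.
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class Feature:
    name: str
    version: int
    transform: Callable[[dict], float]  # raw event -> feature value


@dataclass
class FeatureStore:
    features: Dict[str, Feature] = field(default_factory=dict)
    online: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def register(self, feature: Feature) -> None:
        self.features[feature.name] = feature

    def ingest(self, entity_id: str, event: dict) -> None:
        # In a real system this runs on a stream; here it just updates an
        # in-memory online store (and could also append to a training log).
        for f in self.features.values():
            self.online[(f.name, entity_id)] = f.transform(event)

    def serve(self, feature_name: str, entity_id: str) -> float:
        # Low-latency lookup used at inference time.
        return self.online[(feature_name, entity_id)]


store = FeatureStore()
store.register(Feature("purchase_amount_usd", 1, lambda e: float(e["amount"])))
store.ingest("user_42", {"amount": 19.99})
print(store.serve("purchase_amount_usd", "user_42"))
```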

Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management.
  • What are the pieces of advice that you wish you had received early in your career of data engineering? If you hand a book to a new data engineer, what wisdom would you add to it? I’m working with O’Reilly on a project to collect the 97 things that every data engineer should know, and I need your help. Go to dataengineeringpodcast.com/97things to add your voice and share your hard-earned expertise.
  • When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host is Tobias Macey and today I’m interviewing Simba Khadder about his views on the importance of ML feature stores, and his experience implementing one at StreamSQL.

Interview

  • Introduction
  • How did you get involved in the areas of machine learning and data management?
  • What is StreamSQL and what motivated you to start the business?
  • Can you describe what a machine learning feature is?
  • What is the difference between generating features for training a model and generating features for serving?
  • How is feature management typically handled today?
  • What is a feature store and how is it different from the status quo?
  • What is the overall lifecycle of identifying useful features, defining and generating them, using them for training, and then serving them in production?
  • How does the usage of a feature store impact the workflow of ML engineers/data scientists and data engineers?
  • What are the general requirements of a feature store?
  • What additional capabilities or tangential services are necessary for providing a pleasant UX for a feature store?
  • How is discovery and documentation of features handled?
  • What is the current landscape of feature stores and how does StreamSQL compare?
  • How is the StreamSQL feature store implemented?
  • How is the supporting infrastructure architected and how has it evolved since you first began working on it?
  • Why is streaming data such a focal point of feature stores?
  • How do you generate features for training?
  • How do you approach monitoring of features and what does remediation look like for a feature that is no longer valid?
  • How do you handle versioning and deploying features?
  • What’s the process for integrating data sources into StreamSQL for processing into features?
  • How are the features materialized?
  • What are the most challenging or complex aspects of working on or with a feature store?
  • When is StreamSQL the wrong choice for a feature store?
  • What are the most interesting, challenging, or unexpected lessons that you have learned in the process of building StreamSQL?
  • What do you have planned for the future of the product?

AI/ML Data Engineering Data Management Kubernetes Data Streaming
Data Engineering Podcast