MLOps

Data Engineering for Multimodal AI

2026-05-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Vasundra Srinivasan

AI/ML Cloud Computing Data Engineering Data Governance ETL/ELT Cyber Security data data-engineering

A shift is underway in how organizations approach data infrastructure for AI-driven transformation. As multimodal AI systems and applications become increasingly sophisticated and data hungry, data systems must evolve to meet these complex demands. Data Engineering for Multimodal AI is one of the first practical guides for data engineers, machine learning engineers, and MLOps specialists looking to rapidly master the skills needed to build robust, scalable data infrastructures for multimodal AI systems and applications. You'll follow the entire lifecycle of AI-driven data engineering, from conceptualizing data architectures to implementing data pipelines optimized for multimodal learning in both cloud native and on-premises environments. And each chapter includes step-by-step guides and best practices for implementing key concepts. Design and implement cloud native data architectures optimized for multimodal AI workloads Build efficient and scalable ETL processes for preparing diverse AI training data Implement real-time data processing pipelines for multimodal AI inference Develop and manage feature stores that support multiple data modalities Apply data governance and security practices specific to multimodal AI projects Optimize data storage and retrieval for various types of multimodal ML models Integrate data versioning and lineage tracking in multimodal AI workflows Implement data-quality frameworks to ensure reliable outcomes across data types Design data pipelines that support responsible AI practices in a multimodal context

Learning AutoML

2026-04-25 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Kerem Tomak

AI/ML Airflow Azure Amazon SageMaker ai-ml automl data hyperparameter-tuning machine-learning machine-learning-methods

Learning AutoML is your practical guide to applying automated machine learning in real-world environments. Whether you're a data scientist, ML engineer, or AI researcher, this book helps you move beyond experimentation to build and deploy high-performing models with less manual tuning and more automation. Using AutoGluon as a primary toolkit, you'll learn how to build, evaluate, and deploy AutoML models that reduce complexity and accelerate innovation. Author Kerem Tomak shares insights on how to integrate models into end-to-end deployment workflows using popular tools like Kubeflow, MLflow, and Airflow, while exploring cross-platform approaches with Vertex AI, SageMaker Autopilot, Azure AutoML, Auto-sklearn, and H2O.ai. Real-world case studies highlight applications across finance, healthcare, and retail, while chapters on ethics, governance, and agentic AI help future-proof your knowledge. Build AutoML pipelines for tabular, text, image, and time series data Deploy models with fast, scalable workflows using MLOps best practices Compare and navigate today's leading AutoML platforms Interpret model results and make informed decisions with explainability tools Explore how AutoML leads into next-gen agentic AI systems

Generative AI on Kubernetes

2026-03-25 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Roland Huß , Daniele Zonca

AI/ML Cloud Computing GenAI Kubernetes LLM Cyber Security ai-ml artificial-intelligence-ai data generative-ai

Generative AI is revolutionizing industries, and Kubernetes has fast become the backbone for deploying and managing these resource-intensive workloads. This book serves as a practical, hands-on guide for MLOps engineers, software developers, Kubernetes administrators, and AI professionals ready to unlock AI innovation with the power of cloud native infrastructure. Authors Roland Huß and Daniele Zonca provide a clear road map for training, fine-tuning, deploying, and scaling GenAI models on Kubernetes, addressing challenges like resource optimization, automation, and security along the way. With actionable insights with real-world examples, readers will learn to tackle the opportunities and complexities of managing GenAI applications in production environments. Whether you're experimenting with large-scale language models or facing the nuances of AI deployment at scale, you'll uncover expertise you need to operationalize this exciting technology effectively. Learn to run GenAI models on Kubernetes for efficient scalability Get techniques to train and fine-tune LLMs within Kubernetes environments See how to deploy production-ready AI systems with automation and resource optimization Discover how to monitor and scale GenAI applications to handle real-world demand Uncover the best tools to operationalize your GenAI workloads Learn how to run agent-based and AI-driven applications

The AI Optimization Playbook

2025-11-28 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Dr. Chun Schiros , Dr. Usha Jagannathan , Rajdeep Arora , Supreet Kaur (DataBuzz)

AI/ML GenAI ai-governance ai-ml artificial-intelligence-ai data

Deliver measurable business value by applying strategic, technical, and ethical frameworks to AI initiatives at scale Free with your book: DRM-free PDF version + access to Packt's next-gen Reader Key Features Build AI strategies that align with business goals and maximize ROI Implement enterprise-ready frameworks for MLOps, LLMOps, and Responsible AI Learn from real-world case studies spanning industries and AI maturity levels Book Description AI is only as valuable as the business outcomes it enables, and this hands-on guide shows you how to make that happen. Whether you’re a technology leader launching your first AI use case or scaling production systems, you need a clear path from innovation to impact. That means aligning your AI initiatives with enterprise strategy, operational readiness, and responsible practices, and The AI Optimization Playbook gives you the clarity, structure, and insight you need to succeed. Through actionable guidance and real-world examples, you’ll learn how to build high-impact AI strategies, evaluate projects based on ROI, secure executive sponsorship, and transition prototypes into production-grade systems. You’ll also explore MLOps and LLMOps practices that ensure scalability, reliability, and governance across the AI lifecycle. But deployment is just the beginning. This book goes further to address the crucial need for Responsible AI through frameworks, compliance strategies, and transparency techniques. Written by AI experts and industry leaders, this playbook combines technical fluency with strategic perspective to bridge the business–technology divide so you can confidently lead AI transformation across the enterprise. Email sign-up and proof of purchase required What you will learn Design business-aligned AI strategies Select and prioritize AI projects with the highest potential ROI Develop reliable prototypes and scale them using MLOps pipelines Integrate explainability, fairness, and compliance into AI systems Apply LLMOps practices to deploy and maintain generative AI models Build AI agents that support autonomous decision-making at scale Navigate evolving AI regulations with actionable compliance frameworks Build a future-ready, ethically grounded AI organization Who this book is for This book is for AI/ML leaders and business leaders, CTOs, CIOs, CDAOs, and CAIOs, responsible for driving innovation, operational efficiency, and risk mitigation through artificial intelligence. You should have familiarity with enterprise technology and the fundamentals of AI solution development.

Functional Reproducibility

2025-11-19 · PyData Berlin 2025 November Meetup

talk

by Robin Gower

Have you ever written the perfect data analysis? Has it still ran unchanged 6 months later? Can your colleagues run it without you? Just because your analysis is executable, it doesn’t mean the results are reproducible. Data ages. Libraries change. Machines differ. Servers go down. Bits rot. Entropy is inescapable. We can learn how to engineer reproducibility by drawing on techniques from functional programming and the MLOps movement.

Operationalizing Responsible AI and Data Science in Healthcare with Nasibeh Zanirani Farahani

2025-11-12 · Women in AI and Data Science Conference 2025 Watch

video

by Nasibeh Zanirani Farahani

AI/ML Data Quality Data Science

As healthcare organizations accelerate their adoption of AI and data-driven systems, the challenge lies not only in innovation but in responsibly scaling these technologies within clinical and operational workflows. This session examines the technical and governance frameworks required to translate AI research into reliable and compliant real-world applications. We will explore best practices in model lifecycle management, data quality assurance, bias detection, regulatory alignment, and human-in-the-loop validation, grounded in lessons from implementing AI solutions across complex healthcare environments. Emphasizing cross-functional collaboration among clinicians, data scientists, and business leaders, the session highlights how to balance technical rigor with clinical relevance and ethical accountability. Attendees will gain actionable insights into building trustworthy AI pipelines, integrating MLOps principles in regulated settings, and delivering measurable improvements in patient care, efficiency, and organizational learning.

Building Machine Learning Systems with a Feature Store

2025-11-07 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Jim Dowling

AI/ML Data Modelling LLM RAG ai-ml data machine-learning

Get up to speed on a new unified approach to building machine learning (ML) systems with a feature store. Using this practical book, data scientists and ML engineers will learn in detail how to develop and operate batch, real-time, and agentic ML systems. Author Jim Dowling introduces fundamental principles and practices for developing, testing, and operating ML and AI systems at scale. You'll see how any AI system can be decomposed into independent feature, training, and inference pipelines connected by a shared data layer. Through example ML systems, you'll tackle the hardest part of ML systems--the data, learning how to transform data into features and embeddings, and how to design a data model for AI. Develop batch ML systems at any scale Develop real-time ML systems by shifting left or shifting right feature computation Develop agentic ML systems that use LLMs, tools, and retrieval-augmented generation Understand and apply MLOps principles when developing and operating ML systems

#328 The Challenges of Enterprise Agentic AI with Manasi Vartak, Chief AI Architect at Cloudera

2025-10-27 · DataFramed Listen

podcast_episode

by Manasi Vartak (Cloudera) , Richie (DataCamp)

AI/ML Cloud Computing Computer Science Data Governance GenAI Microsoft RAG Cyber Security

The promise of AI in enterprise settings is enormous, but so are the privacy and security challenges. How do you harness AI's capabilities while keeping sensitive data protected within your organization's boundaries? Private AI—using your own models, data, and infrastructure—offers a solution, but implementation isn't straightforward. What governance frameworks need to be in place? How do you evaluate non-deterministic AI systems? When should you build in-house versus leveraging cloud services? As data and software teams evolve in this new landscape, understanding the technical requirements and workflow changes is essential for organizations looking to maintain control over their AI destiny. Manasi Vartak is Chief AI Architect and VP of Product Management (AI Platform) at Cloudera. She is a product and AI leader with more than a decade of experience at the intersection of AI infrastructure, enterprise software, and go-to-market strategy. At Cloudera, she leads product and engineering teams building low-code and high-code generative AI platforms, driving the company’s enterprise AI strategy and enabling trusted AI adoption across global organizations. Before joining Cloudera through its acquisition of Verta, Manasi was the founder and CEO of Verta, where she transformed her MIT research into enterprise-ready ML infrastructure. She scaled the company to multi-million ARR, serving Fortune 500 clients in finance, insurance, and capital markets, and led the launch of enterprise MLOps and GenAI products used in mission-critical workloads. Manasi earned her PhD in Computer Science from MIT, where she pioneered model management systems such as ModelDB — foundational work that influenced the development of tools like MLflow. Earlier in her career, she held research and engineering roles at Twitter, Facebook, Google, and Microsoft. In the episode, Richie and Manasi explore AI's role in financial services, the challenges of AI adoption in enterprises, the importance of data governance, the evolving skills needed for AI development, the future of AI agents, and much more. Links Mentioned in the Show: ClouderaCloudera Evolve ConferenceCloudera Agent StudioConnect with ManasiCourse: Introduction to AI AgentsRelated Episode: RAG 2.0 and The New Era of RAG Agents with Douwe Kiela, CEO at Contextual AI & Adjunct Professor at Stanford UniversityRewatch RADAR AI New to DataCamp? Learn on the go using the DataCamp mobile appEmpower your business with world-class data and AI skills with DataCamp for business

Feature feast: Building a data buffet for models

2025-10-15 · dbt Coalesce 2025 Watch

talk

by Ryan Kaplan (Fifth Third Bank) , Shanna Anderson (Fifth Third Bank)

Cloud Computing dbt

Fifth Third Bank transformed its MLOps using a feature store built with dbt Cloud. The result? Improved model governance, reduced risk and faster product innovation. In this session, learn how they defined features, automated pipelines and delivered real-time and historical feature views.

Best Practices for Scaling End-to-End ML Workloads in Snowflake ML

2025-10-09 · Snowflake World Tour London

session

AI/ML Git Snowflake

Snowflake ML enables efficient development and deployment of advanced models without any data movement. With multi-GPU support, MLOps integration and Git-based workflows, Container Runtime provides a scalable environment for training, and Snowflake ML’s products such as Model Registry and Model Serving make it easy to deploy these models in production. This session explores best practices for scalable ML workflows and the creation of production-ready ML pipelines in Snowflake.

Best Practices for Scaling End-to-End ML Workloads in Snowflake ML

2025-10-01 · Snowflake World Tour Berlin

session

AI/ML Git Snowflake

Snowflake ML enables efficient development and deployment of advanced models without any data movement. With multi-GPU support, MLOps integration and Git-based workflows, Container Runtime provides a scalable environment for training, and Snowflake ML’s products such as Model Registry and Model Serving make it easy to deploy these models in production. This session explores best practices for scalable ML workflows and the creation of production-ready ML pipelines in Snowflake.

HP Z Centre de création IA: une suite d'outils ML Ops

2025-10-01 · Big Data & AI Paris 2025

Face To Face

by Nigel WILSON (HP) , Cécile Tezenas Du Montcel

AI/ML

HP Z Centre de création IA : une suite d'outils ML Ops

•Collaborer et centraliser vos modèles avec HP AI Studio

•Outil d'aide et d'optimisation au développement

Les librairies et micro-services NVIDIA – un ensemble d’outils et de micro-services disponibles sur les stations de travail HP pour faciliter le développement en IA.

Exploitez la puissance du GPU pour les tâches d’apprentissage automatique avec HP Zboost.

Building Resilient (ML) Pipelines for MLOps

2025-10-01 · PyData Paris 2025 Watch

talk

by Lex Avstreikh (Hopsworks)

AI/ML LLM

This talk explores the disconnect between MLOps fundamental principles and their practical application in designing, operating and maintaining machine learning pipelines. We’ll break down these principles, examine their influence on pipeline architecture, and conclude with a straightforward, vendor-agnostic mind-map, offering a roadmap to build resilient MLOps systems for any project or technology stack. Despite the surge in tools and platforms, many teams still struggle with the same underlying issues: brittle data dependencies, poor observability, unclear ownership, and pipelines that silently break once deployed. Architecture alone isn't the answer — systems thinking is.

We'll use concrete examples to walk through common failure modes in ML pipelines, highlight where analogies fall apart, and show how to build systems that tolerate failure, adapt to change, and support iteration without regressions.

Topics covered include: - Common failure modes in ML pipelines - Modular design: feature, training, inference - Built-in observability, versioning, reuse - Orchestration across batch, real-time, LLMs - Platform-agnostic patterns that scale

Key takeaways: - Resilience > diagrams - Separate concerns, embrace change - Metadata is your backbone - Infra should support iteration, not block it

Berlin PyData 2025 Conference Interviews

2025-09-26 · DataTalks.Club Listen

podcast_episode

by Yashasvi Misra (Pure Storage) , Igor Kvachenok (Leuphana University of Lüneburg) , Selim Nowicki (Distill Labs) , Mehdi Ouazza (MotherDuck) , Gülsah Durmaz

AI/ML Analytics Data Engineering Data Science dbt DuckDB Kubernetes LLM Motherduck Python

At PyData Berlin, community members and industry voices highlighted how AI and data tooling are evolving across knowledge graphs, MLOps, small-model fine-tuning, explainability, and developer advocacy.

Igor Kvachenok (Leuphana University / ProKube) combined knowledge graphs with LLMs for structured data extraction in the polymer industry, and noted how MLOps is shifting toward LLM-focused workflows.
Selim Nowicki (Distill Labs) introduced a platform that uses knowledge distillation to fine-tune smaller models efficiently, making model specialization faster and more accessible.
Gülsah Durmaz (Architect & Developer) shared her transition from architecture to coding, creating Python tools for design automation and volunteering with PyData through PyLadies.
Yashasvi Misra (Pure Storage) spoke on explainable AI, stressing accountability and compliance, and shared her perspective as both a data engineer and active Python community organizer.
Mehdi Ouazza (MotherDuck) reflected on developer advocacy through video, workshops, and branding, showing how creative communication boosts adoption of open-source tools like DuckDB.

Igor Kvachenok Master’s student in Data Science at Leuphana University of Lüneburg, writing a thesis on LLM-enhanced data extraction for the polymer industry. Builds RDF knowledge graphs from semi-structured documents and works at ProKube on MLOps platforms powered by Kubeflow and Kubernetes.

Connect: https://www.linkedin.com/in/igor-kvachenok/

Selim Nowicki Founder of Distill Labs, a startup making small-model fine-tuning simple and fast with knowledge distillation. Previously led data teams at Berlin startups like Delivery Hero, Trade Republic, and Tier Mobility. Sees parallels between today’s ML tooling and dbt’s impact on analytics.

Connect: https://www.linkedin.com/in/selim-nowicki/

Gülsah Durmaz Architect turned developer, creating Python-based tools for architectural design automation with Rhino and Grasshopper. Active in PyLadies and a volunteer at PyData Berlin, she values the community for networking and learning, and aims to bring ML into architecture workflows.

Connect: https://www.linkedin.com/in/gulsah-durmaz/

Yashasvi (Yashi) Misra Data Engineer at Pure Storage, community organizer with PyLadies India, PyCon India, and Women Techmakers. Advocates for inclusive spaces in tech and speaks on explainable AI, bridging her day-to-day in data engineering with her passion for ethical ML.

Connect: https://www.linkedin.com/in/misrayashasvi/

Mehdi Ouazza Developer Advocate at MotherDuck, formerly a data engineer, now focused on building community and education around DuckDB. Runs popular YouTube channels ("mehdio DataTV" and "MotherDuck") and delivered a hands-on workshop at PyData Berlin. Blends technical clarity with creative storytelling.

Connect: https://www.linkedin.com/in/mehd-io/

Brian Cox Stream - AI, Data Science & MLOps Theatre

2025-09-25 · Session spares for Mercury scanners

Face To Face

AI/ML Data Science

Automating MLOps Pipelines for Energy Flexibility

2025-09-25 · Big Data LDN 2025

Face To Face

by Adam Sroka (Hypercube) , Steve Sinclair (Flexitricity)

AI/ML Data Modelling Data Science

Energy flexibility is playing an increasingly fundamental role in the UK energy market. With the adoption of renewable energy sources such as EVs, solar panels and domestic and commercial batteries, the number of flexible assets is soaring - making aggregation and flexibility trading infinitely more complex and requiring vast amounts of data modelling and forecasting. To address this challenge, Flexitricity adopted MLOps best practices to tackle this complex real-world challenge and meet the needs of the scaling energy demand in the UK.

The session will cover:

- The complex technical challenge of energy flexibility in 2025.

- The critical requirement to invest in technology and skillsets.

- A real-life view of how machine learning operations (MLOps) scaled Flexitricity’s data science model development.

- How innovations in technology can support and optimise delivering on energy flexibility.

The audience will gain insight into:

- The challenge of building data science models to keep up with scaling demand.

- How MLOps best practices can be adopted to drive efficiency and increase data science experiments to 10000+ per year.

- Lessons learned from adopting MLOps pipelines.

Continuous monitoring of model drift in the financial sector

2025-09-25 · PyData Amsterdam 2025 Watch

talk

by Agustin Iniguez , Denis Gaitan

AI/ML

In today’s financial sector, the continuous accuracy and reliability of machine learning models are crucial for operational efficiency and effective risk management. With the rise of MLOps (Machine Learning Operations), automating monitoring mechanisms has become essential to ensure model performance and compliance with regulations. This presentation introduces a method for continuous monitoring of model drift, highlighting the benefits of automation within the MLOps framework. This topic is particularly interesting because it addresses a common challenge in maintaining model performance over time and demonstrates a practical solution that has been successfully implemented in the bank.

This talk is aimed at data scientists, machine learning engineers, and MLOps practitioners who are interested in automating the monitoring of machine learning models. Attendees will be guided on how to continuous monitor model drift within the MLOps framework. They will understand the benefits of automation in this context, and gain insights into MLOps best practices. A basic understanding of MLOps principles, and statistical techniques for model evaluation will be helpful but not strictly needed.

The presentation will be an informative talk with a focus on the design and implementation. It will include some mathematical concepts but will primarily be demonstrating real-world applications and best practices. At the end we encourage you to actively monitor model drift and automate your monitoring processes to enhance model accuracy, scalability, and compliance in your organizations.

Ops Overload? From MLOps to LLMOps with One Platform

2025-09-25 · Big Data LDN 2025

Face To Face

by Stephanie Anani (Google Cloud)

AI/ML Cloud Computing GCP GenAI LLM

The Generative AI revolution is here, but so is the operational headache. For years, teams have matured their MLOps practices for traditional models, but the rapid adoption of LLMs has introduced a parallel, often chaotic, world of LLMOps. This results in fragmented toolchains, duplicated effort, and a state of "Ops Overload" that slows down innovation.

This session directly confronts this challenge. We will demonstrate how a unified platform like Google Cloud's Vertex AI can tame this complexity by providing a single control plane for the entire AI lifecycle.

Ops Overload? From MLOps to LLMOps with One Platform

2025-09-25 · Big Data LDN 2025

Face To Face

by Google Cloud , Stephanie Anani AI Customer Engineer

HTML

https://www.bigdataldn.com/en-gb/conference/session-details.4500.251877.ops-overload-from-mlops-to-llmops-with-one-platform.html

Keynote Stream - AI, Data Science & MLOps Theatre - Day 2

2025-09-25 · Session spares for Mercury scanners

Face To Face

AI/ML Data Science

talk-data.com

Activity Trend

Top Events

Top Speakers

Data Engineering for Multimodal AI

Learning AutoML

Generative AI on Kubernetes

The AI Optimization Playbook

Functional Reproducibility

Operationalizing Responsible AI and Data Science in Healthcare with Nasibeh Zanirani Farahani

Building Machine Learning Systems with a Feature Store

#328 The Challenges of Enterprise Agentic AI with Manasi Vartak, Chief AI Architect at Cloudera

Feature feast: Building a data buffet for models

Best Practices for Scaling End-to-End ML Workloads in Snowflake ML

Best Practices for Scaling End-to-End ML Workloads in Snowflake ML

HP Z Centre de création IA: une suite d'outils ML Ops

Building Resilient (ML) Pipelines for MLOps

Berlin PyData 2025 Conference Interviews

Brian Cox Stream - AI, Data Science & MLOps Theatre

Automating MLOps Pipelines for Energy Flexibility

Continuous monitoring of model drift in the financial sector

Ops Overload? From MLOps to LLMOps with One Platform

Ops Overload? From MLOps to LLMOps with One Platform

Keynote Stream - AI, Data Science & MLOps Theatre - Day 2