talk-data.com

Topic: MLOps
Tags: machine_learning, devops, ai
16 tagged activities

Activity Trend: 26 peak activities/quarter (2020-Q1 to 2026-Q1)

Activities

16 activities · Newest first

Data Engineering for Multimodal AI

A shift is underway in how organizations approach data infrastructure for AI-driven transformation. As multimodal AI systems and applications become increasingly sophisticated and data-hungry, data systems must evolve to meet these complex demands. Data Engineering for Multimodal AI is one of the first practical guides for data engineers, machine learning engineers, and MLOps specialists looking to rapidly master the skills needed to build robust, scalable data infrastructures for multimodal AI systems and applications. You'll follow the entire lifecycle of AI-driven data engineering, from conceptualizing data architectures to implementing data pipelines optimized for multimodal learning in both cloud native and on-premises environments. Each chapter includes step-by-step guides and best practices for implementing key concepts.

- Design and implement cloud native data architectures optimized for multimodal AI workloads
- Build efficient and scalable ETL processes for preparing diverse AI training data
- Implement real-time data processing pipelines for multimodal AI inference
- Develop and manage feature stores that support multiple data modalities
- Apply data governance and security practices specific to multimodal AI projects
- Optimize data storage and retrieval for various types of multimodal ML models
- Integrate data versioning and lineage tracking in multimodal AI workflows
- Implement data-quality frameworks to ensure reliable outcomes across data types
- Design data pipelines that support responsible AI practices in a multimodal context
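The training-data items above lend themselves to a concrete illustration. Below is a minimal sketch, assuming PyTorch and Pillow, of one recurring multimodal pattern: pairing images with text captions via a manifest file. The file layout and column names are hypothetical, not taken from the book.

```python
# A minimal multimodal dataset sketch: pairs images with captions listed
# in a manifest file. Paths and column names are hypothetical; assumes
# PyTorch, pandas, and Pillow are installed.
import pandas as pd
from PIL import Image
from torch.utils.data import Dataset

class ImageTextDataset(Dataset):
    def __init__(self, manifest_path, transform=None):
        # manifest.csv: one row per sample with image_path and caption columns
        self.records = pd.read_csv(manifest_path)
        self.transform = transform

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        row = self.records.iloc[idx]
        image = Image.open(row["image_path"]).convert("RGB")
        if self.transform:
            image = self.transform(image)
        return image, row["caption"]
```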

Learning AutoML

Learning AutoML is your practical guide to applying automated machine learning in real-world environments. Whether you're a data scientist, ML engineer, or AI researcher, this book helps you move beyond experimentation to build and deploy high-performing models with less manual tuning and more automation. Using AutoGluon as a primary toolkit, you'll learn how to build, evaluate, and deploy AutoML models that reduce complexity and accelerate innovation. Author Kerem Tomak shares insights on how to integrate models into end-to-end deployment workflows using popular tools like Kubeflow, MLflow, and Airflow, while exploring cross-platform approaches with Vertex AI, SageMaker Autopilot, Azure AutoML, Auto-sklearn, and H2O.ai. Real-world case studies highlight applications across finance, healthcare, and retail, while chapters on ethics, governance, and agentic AI help future-proof your knowledge.

- Build AutoML pipelines for tabular, text, image, and time series data
- Deploy models with fast, scalable workflows using MLOps best practices
- Compare and navigate today's leading AutoML platforms
- Interpret model results and make informed decisions with explainability tools
- Explore how AutoML leads into next-gen agentic AI systems
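Because AutoGluon is the book's primary toolkit, a minimal tabular sketch is easy to give; the file names and label column below are hypothetical, not the book's dataset.

```python
# A minimal AutoGluon tabular sketch: fit a predictor under a time budget,
# inspect the model leaderboard, and predict. File names and the label
# column are hypothetical.
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")   # hypothetical training file
test_data = TabularDataset("test.csv")     # hypothetical holdout file

# AutoGluon trains and ensembles many candidate models automatically.
predictor = TabularPredictor(label="target").fit(train_data, time_limit=600)

print(predictor.leaderboard(test_data))    # compare the trained models
predictions = predictor.predict(test_data.drop(columns=["target"]))
```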

Generative AI on Kubernetes

Generative AI is revolutionizing industries, and Kubernetes has fast become the backbone for deploying and managing these resource-intensive workloads. This book serves as a practical, hands-on guide for MLOps engineers, software developers, Kubernetes administrators, and AI professionals ready to unlock AI innovation with the power of cloud native infrastructure. Authors Roland Huß and Daniele Zonca provide a clear road map for training, fine-tuning, deploying, and scaling GenAI models on Kubernetes, addressing challenges like resource optimization, automation, and security along the way. With actionable insights and real-world examples, readers will learn to tackle the opportunities and complexities of managing GenAI applications in production environments. Whether you're experimenting with large-scale language models or facing the nuances of AI deployment at scale, you'll gain the expertise you need to operationalize this exciting technology effectively.

- Learn to run GenAI models on Kubernetes for efficient scalability
- Get techniques to train and fine-tune LLMs within Kubernetes environments
- See how to deploy production-ready AI systems with automation and resource optimization
- Discover how to monitor and scale GenAI applications to handle real-world demand
- Uncover the best tools to operationalize your GenAI workloads
- Learn how to run agent-based and AI-driven applications
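As a generic illustration of GPU scheduling for model serving (not the authors' example), here is a sketch using the official kubernetes Python client; the image, labels, and resource sizes are placeholders.

```python
# A generic sketch: create a GPU-backed Deployment for an LLM server with
# the official kubernetes Python client. Image, labels, and resource sizes
# are placeholders, not values from the book.
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig

container = client.V1Container(
    name="llm-server",
    image="ghcr.io/example/llm-server:latest",  # hypothetical image
    ports=[client.V1ContainerPort(container_port=8080)],
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1", "memory": "16Gi"},  # request a GPU node
    ),
)
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="llm-server"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "llm-server"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "llm-server"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```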

The AI Optimization Playbook

Deliver measurable business value by applying strategic, technical, and ethical frameworks to AI initiatives at scale.

Key Features:
- Build AI strategies that align with business goals and maximize ROI
- Implement enterprise-ready frameworks for MLOps, LLMOps, and Responsible AI
- Learn from real-world case studies spanning industries and AI maturity levels

Book Description: AI is only as valuable as the business outcomes it enables, and this hands-on guide shows you how to make that happen. Whether you're a technology leader launching your first AI use case or scaling production systems, you need a clear path from innovation to impact. That means aligning your AI initiatives with enterprise strategy, operational readiness, and responsible practices, and The AI Optimization Playbook gives you the clarity, structure, and insight you need to succeed. Through actionable guidance and real-world examples, you'll learn how to build high-impact AI strategies, evaluate projects based on ROI, secure executive sponsorship, and transition prototypes into production-grade systems. You'll also explore MLOps and LLMOps practices that ensure scalability, reliability, and governance across the AI lifecycle. But deployment is just the beginning: the book goes further to address the crucial need for Responsible AI through frameworks, compliance strategies, and transparency techniques. Written by AI experts and industry leaders, this playbook combines technical fluency with strategic perspective to bridge the business-technology divide so you can confidently lead AI transformation across the enterprise.

What you will learn:
- Design business-aligned AI strategies
- Select and prioritize AI projects with the highest potential ROI
- Develop reliable prototypes and scale them using MLOps pipelines
- Integrate explainability, fairness, and compliance into AI systems
- Apply LLMOps practices to deploy and maintain generative AI models
- Build AI agents that support autonomous decision-making at scale
- Navigate evolving AI regulations with actionable compliance frameworks
- Build a future-ready, ethically grounded AI organization

Who this book is for: This book is for AI/ML and business leaders (CTOs, CIOs, CDAOs, and CAIOs) responsible for driving innovation, operational efficiency, and risk mitigation through artificial intelligence. You should be familiar with enterprise technology and the fundamentals of AI solution development.

Building Machine Learning Systems with a Feature Store

Get up to speed on a new unified approach to building machine learning (ML) systems with a feature store. Using this practical book, data scientists and ML engineers will learn in detail how to develop and operate batch, real-time, and agentic ML systems. Author Jim Dowling introduces fundamental principles and practices for developing, testing, and operating ML and AI systems at scale. You'll see how any AI system can be decomposed into independent feature, training, and inference pipelines connected by a shared data layer. Through example ML systems, you'll tackle the hardest part of ML systems: the data. You'll learn how to transform data into features and embeddings, and how to design a data model for AI.

- Develop batch ML systems at any scale
- Develop real-time ML systems by shifting feature computation left or right
- Develop agentic ML systems that use LLMs, tools, and retrieval-augmented generation
- Understand and apply MLOps principles when developing and operating ML systems
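To make the three-pipeline decomposition concrete, here is a minimal sketch assuming pandas, NumPy, and scikit-learn, with a local Parquet file standing in for the shared data layer; the columns and model are hypothetical, not the book's implementation.

```python
# A minimal sketch of independent feature, training, and inference pipelines
# connected by a shared data layer (a local Parquet file here). Column names
# and the model choice are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

FEATURE_STORE = "features.parquet"  # stand-in for a real feature store

def feature_pipeline(raw: pd.DataFrame) -> None:
    """Transform raw records into features and publish to the shared layer."""
    features = raw.copy()
    features["amount_log"] = np.log1p(features["amount"])
    features.to_parquet(FEATURE_STORE)

def training_pipeline() -> GradientBoostingClassifier:
    """Read features from the shared layer and train a model."""
    features = pd.read_parquet(FEATURE_STORE)
    model = GradientBoostingClassifier()
    model.fit(features[["amount_log"]], features["label"])
    return model

def inference_pipeline(model, new_raw: pd.DataFrame) -> np.ndarray:
    """Apply the same feature logic to new data, then predict."""
    X = np.log1p(new_raw[["amount"]]).rename(columns={"amount": "amount_log"})
    return model.predict(X)
```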

Machine Learning for Tabular Data

Business runs on tabular data in databases, spreadsheets, and logs. Crunch that data using deep learning, gradient boosting, and other machine learning techniques. Machine Learning for Tabular Data teaches you to train insightful machine learning models on common tabular business data sources such as spreadsheets, databases, and logs. You'll discover how to use XGBoost and LightGBM on tabular data, optimize deep learning libraries like TensorFlow and PyTorch for tabular data, and use cloud tools like Vertex AI to create an automated MLOps pipeline.

Machine Learning for Tabular Data will teach you how to:
- Pick the right machine learning approach for your data
- Apply deep learning to tabular data
- Deploy tabular machine learning locally and in the cloud
- Build pipelines to automatically train and maintain a model

The book covers classic machine learning techniques like gradient boosting as well as more contemporary deep learning approaches. By the time you're finished, you'll be equipped with the skills to apply machine learning to the kinds of data you work with every day.

About the Technology: Machine learning can accelerate everyday business chores like account reconciliation, demand forecasting, and customer service automation, not to mention more exotic challenges like fraud detection, predictive maintenance, and personalized marketing. This book shows you how to unlock the vital information stored in spreadsheets, ledgers, databases, and other tabular data sources using gradient boosting, deep learning, and generative AI.

About the Book: Machine Learning for Tabular Data delivers practical ML techniques to upgrade every stage of the business data analysis pipeline. In it, you'll explore examples like using XGBoost and Keras to predict short-term rental prices, deploying a local ML model with Python and Flask, and streamlining workflows using large language models (LLMs). Along the way, you'll learn to make your models both more powerful and more explainable.

What's Inside:
- Master XGBoost
- Apply deep learning to tabular data
- Deploy models locally and in the cloud
- Build pipelines to train and maintain models

About the Reader: For readers experienced with Python and the basics of machine learning.

About the Authors: Mark Ryan is the AI Lead of the Developer Knowledge Platform at Google. Luca Massaron, a three-time Kaggle Grandmaster, is a Google Developer Expert (GDE) in machine learning and AI and has published 17 other books.
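As a taste of the workflow described above, here is a minimal sketch, assuming XGBoost and scikit-learn, of training a gradient-boosted regressor on a tabular pricing dataset. The CSV file and column names are hypothetical, not the book's rental-price example.

```python
# A minimal XGBoost regression sketch on tabular data. The CSV file and
# columns are hypothetical; assumes numeric feature columns.
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

data = pd.read_csv("rentals.csv")  # hypothetical tabular dataset
X = data.drop(columns=["price"])
y = data["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = xgb.XGBRegressor(n_estimators=500, learning_rate=0.05, max_depth=6)
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```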

Exam Ref DP-100 Designing and Implementing a Data Science Solution on Azure

Prepare for Microsoft Exam DP-100 and demonstrate your real-world knowledge of managing data ingestion and preparation, model training and deployment, and machine learning solution monitoring with Python, Azure Machine Learning, and MLflow. Designed for professionals with data science experience, this Exam Ref focuses on the critical thinking and decision-making acumen needed for success at the Microsoft Certified: Azure Data Scientist Associate level.

Focus on the expertise measured by these objectives:
- Design and prepare a machine learning solution
- Explore data and train models
- Prepare a model for deployment
- Deploy and retrain a model

This Microsoft Exam Ref:
- Organizes its coverage by exam objectives
- Features strategic, what-if scenarios to challenge you
- Assumes you have experience in designing and creating a suitable working environment for data science workloads, training machine learning models, and managing, deploying, and monitoring scalable machine learning solutions

About the Exam: Exam DP-100 focuses on the knowledge needed to design and prepare a machine learning solution, manage an Azure Machine Learning workspace, explore data and train models, create models by using the Azure Machine Learning designer, prepare a model for deployment, manage models in Azure Machine Learning, deploy and retrain a model, and apply machine learning operations (MLOps) practices.

About Microsoft Certification: Passing this exam fulfills your requirements for the Microsoft Certified: Azure Data Scientist Associate credential, demonstrating your expertise in applying data science and machine learning to implement and run machine learning workloads on Azure, including knowledge and experience using Azure Machine Learning and MLflow.
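Since the exam objectives lean on MLflow, here is a minimal experiment-tracking sketch assuming MLflow and scikit-learn; the experiment name, model, and metric are illustrative, not exam content.

```python
# A minimal MLflow tracking sketch: log a parameter, a metric, and a trained
# model for one run. The experiment name and model are illustrative.
import mlflow
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

mlflow.set_experiment("dp100-demo")  # hypothetical experiment name

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

with mlflow.start_run():
    alpha = 0.5
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    mlflow.log_param("alpha", alpha)
    mlflow.log_metric("r2", model.score(X_test, y_test))
    mlflow.sklearn.log_model(model, "model")  # persist the model artifact
```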

LLM Engineer's Handbook

The "LLM Engineer's Handbook" is your comprehensive guide to mastering Large Language Models from concept to deployment. Written by leading experts, it combines theoretical foundations with practical examples to help you build, refine, and deploy LLM-powered solutions that solve real-world problems effectively and efficiently. What this Book will help me do Understand the principles and approaches for training and fine-tuning Large Language Models (LLMs). Apply MLOps practices to design, deploy, and monitor your LLM applications effectively. Implement advanced techniques such as retrieval-augmented generation (RAG) and preference alignment. Optimize inference for high performance, addressing low-latency and high availability for production systems. Develop robust data pipelines and scalable architectures for building modular LLM systems. Author(s) Paul Iusztin and Maxime Labonne are experienced AI professionals specializing in natural language processing and machine learning. With years of industry and academic experience, they are dedicated to making complex AI concepts accessible and actionable. Their collaborative authorship ensures a blend of theoretical rigor and practical insights tailored for modern AI practitioners. Who is it for? This book is tailored for AI engineers, NLP professionals, and LLM practitioners who wish to deepen their understanding of Large Language Models. Ideal readers possess some familiarity with Python, AWS, and general AI concepts. If you aim to apply LLMs to real-world scenarios or enhance your expertise in AI-driven systems, this handbook is designed for you.

Data Engineering for Machine Learning Pipelines: From Python Libraries to ML Pipelines and Cloud Platforms

This book covers modern data engineering functions and important Python libraries to help you develop state-of-the-art ML pipelines and integration code. The book begins by explaining data analytics and transformation, delving into the Pandas library, its capabilities, and its nuances. It then explores emerging libraries such as Polars and cuDF, providing insights into GPU-based computing and cutting-edge data manipulation techniques. The text discusses the importance of data validation in engineering processes, introducing tools such as Great Expectations and Pandera to ensure data quality and reliability. The book then turns to API design and development, with a specific focus on leveraging the power of FastAPI, covering authentication, authorization, and real-world applications so you can construct efficient and secure APIs. Also explored is concurrency in data engineering, examining Dask's capabilities from basic setup to crafting advanced machine learning pipelines. The book includes development and delivery of data engineering pipelines on leading cloud platforms such as AWS, Google Cloud, and Microsoft Azure. The concluding chapters concentrate on real-time and streaming data engineering pipelines, emphasizing Apache Kafka and workflow orchestration; workflow tools such as Airflow and Prefect are introduced to seamlessly manage and automate complex data workflows. What sets this book apart is its blend of theoretical knowledge and practical application, a structured path from basic to advanced concepts, and insights into using state-of-the-art tools. With this book, you gain access to cutting-edge techniques and insights that are reshaping the industry. This book is not just an educational tool; it is a career catalyst and an investment in your future as a data engineering expert, poised to meet the challenges of today's data-driven world.

What You Will Learn:
- Elevate your data-wrangling jobs by utilizing the power of both CPU and GPU computing, and process data using Pandas 2.0, Polars, and cuDF at unprecedented speeds
- Design data validation pipelines, construct efficient data service APIs, develop real-time streaming pipelines, and master the art of workflow orchestration to streamline your engineering projects
- Leverage concurrent programming to develop machine learning pipelines, and get hands-on experience in the development and deployment of machine learning pipelines across AWS, GCP, and Azure

Who This Book Is For: Data analysts, data engineers, data scientists, machine learning engineers, and MLOps specialists.
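Pandera, one of the validation tools named above, makes the data-quality idea easy to show. A minimal sketch follows; the schema, columns, and value ranges are hypothetical.

```python
# A minimal data-validation sketch with Pandera: declare a schema and
# validate a DataFrame against it. Columns and ranges are hypothetical.
import pandas as pd
import pandera as pa

schema = pa.DataFrameSchema({
    "user_id": pa.Column(int, pa.Check.ge(0)),
    "amount": pa.Column(float, pa.Check.in_range(0.0, 10_000.0)),
    "country": pa.Column(str, pa.Check.isin(["US", "DE", "IN"])),
})

df = pd.DataFrame({
    "user_id": [1, 2],
    "amount": [19.99, 250.0],
    "country": ["US", "DE"],
})
validated = schema.validate(df)  # raises SchemaError on violations
print(validated)
```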

Google Machine Learning and Generative AI for Solutions Architects

This book teaches solutions architects how to effectively design and implement AI/ML solutions using Google Cloud services. Through detailed explanations, examples, and hands-on exercises, you will understand essential AI/ML concepts, tools, and best practices while building advanced applications.

What this book will help me do:
- Build robust AI/ML solutions using Google Cloud tools such as TensorFlow, BigQuery, and Vertex AI
- Prepare and process data efficiently for machine learning workloads
- Establish and apply an MLOps framework for automating ML model lifecycle management
- Implement cutting-edge generative AI solutions using best practices
- Address common challenges in AI/ML projects with insights from expert solutions

Author: Kieran Kavanagh is a seasoned principal architect with nearly twenty years of experience in the tech industry. He has successfully led teams in designing, planning, and governing enterprise cloud strategies, and his wealth of experience is distilled into the practical approaches and insights in this book.

Who is it for? This book is ideal for IT professionals aspiring to design AI/ML solutions, particularly in the role of solutions architect. It assumes basic knowledge of Python and foundational AI/ML concepts but is suitable for both beginners and seasoned practitioners. If you're looking to deepen your understanding of state-of-the-art AI/ML applications on Google Cloud, this resource will guide you.
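As one concrete Vertex AI touchpoint, here is a minimal sketch, assuming the google-cloud-aiplatform SDK, of uploading a model artifact and deploying it to an online endpoint. The project, bucket path, container image, and machine type are placeholders, not the book's worked example.

```python
# A minimal Vertex AI sketch: upload a trained model artifact and deploy it
# to an online endpoint. Project, region, artifact URI, and container image
# are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

model = aiplatform.Model.upload(
    display_name="tabular-model",
    artifact_uri="gs://my-bucket/model/",  # hypothetical GCS path
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
    ),
)
endpoint = model.deploy(machine_type="n1-standard-4")
print(endpoint.resource_name)  # use the endpoint for online predictions
```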

Architecting Data and Machine Learning Platforms

All cloud architects need to know how to build data platforms that enable businesses to make data-driven decisions and deliver enterprise-wide intelligence in a fast and efficient way. This handbook shows you how to design, build, and modernize cloud native data and machine learning platforms using AWS, Azure, Google Cloud, and multicloud tools like Snowflake and Databricks. Authors Marco Tranquillin, Valliappa Lakshmanan, and Firat Tekiner cover the entire data lifecycle, from ingestion to activation, in a cloud environment using real-world enterprise architectures. You'll learn how to transform, secure, and modernize familiar solutions like data warehouses and data lakes, and you'll be able to leverage recent AI/ML patterns to get accurate and quicker insights to drive competitive advantage.

You'll learn how to:
- Design a modern and secure cloud native or hybrid data analytics and machine learning platform
- Accelerate data-led innovation by consolidating enterprise data in a governed, scalable, and resilient data platform
- Democratize access to enterprise data and govern how business teams extract insights and build AI/ML capabilities
- Enable your business to make decisions in real time using streaming pipelines
- Build an MLOps platform to move to a predictive and prescriptive analytics approach

Streaming Data Mesh

Data lakes and warehouses have become increasingly fragile, costly, and difficult to maintain as data gets bigger and moves faster. Data meshes can help your organization decentralize data, giving ownership back to the engineers who produced it. This book provides a concise yet comprehensive overview of data mesh patterns for streaming and real-time data services. Authors Hubert Dulay and Stephen Mooney examine the vast differences between streaming and batch data meshes. Data engineers, architects, data product owners, and those in DevOps and MLOps roles will learn steps for implementing a streaming data mesh, from defining a data domain to building a good data product. Through the course of the book, you'll create a complete self-service data platform and devise a data governance system that enables your mesh to work seamlessly.

With this book, you will:
- Design a streaming data mesh using Kafka
- Learn how to identify a domain
- Build your first data product using self-service tools
- Apply data governance to the data products you create
- Learn the differences between synchronous and asynchronous data services
- Implement self-services that support decentralized data
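To ground the Kafka item, here is a minimal producer sketch assuming the confluent-kafka client; the broker address, topic name, and payload are hypothetical, not the authors' data product.

```python
# A minimal sketch of publishing a data product record to a Kafka topic
# with confluent-kafka. Broker, topic, and payload are hypothetical.
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def delivery_report(err, msg):
    # Called once per message to confirm delivery or report failure.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()}[{msg.partition()}]")

event = {"order_id": 42, "status": "shipped"}  # illustrative record
producer.produce(
    "orders.data-product.v1",  # hypothetical domain topic
    value=json.dumps(event).encode("utf-8"),
    callback=delivery_report,
)
producer.flush()
```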

Data Fabric and Data Mesh Approaches with AI: A Guide to AI-based Data Cataloging, Governance, Integration, Orchestration, and Consumption

Understand modern data fabric and data mesh concepts using AI-based self-service data discovery and delivery capabilities, a range of intelligent data integration styles, and automated unified data governance, all designed to deliver "data as a product" within hybrid cloud landscapes. This book teaches you how to successfully deploy state-of-the-art data mesh solutions and gives you a comprehensive overview of how a data fabric architecture uses artificial intelligence (AI) and machine learning (ML) for automated metadata management and self-service data discovery and consumption. You will learn how data fabric and data mesh relate to other concepts such as DataOps, MLOps, AIDevOps, and more. Many examples are included to demonstrate how to modernize the consumption of data to enable a shopping-for-data (data as a product) experience. By the end of this book, you will understand the data fabric concept and architecture as it relates to themes such as automated unified data governance and compliance, enterprise information architecture, AI and hybrid cloud landscapes, and intelligent cataloging and metadata management.

What You Will Learn:
- Discover best practices and methods to successfully implement a data fabric architecture and data mesh solution
- Understand key data fabric capabilities, e.g., self-service data discovery, intelligent data integration techniques, intelligent cataloging and metadata management, and trustworthy AI
- Recognize the importance of data fabric to accelerate digital transformation and democratize data access
- Dive into important data fabric topics, addressing current data fabric challenges
- Conceive data fabric and data mesh concepts holistically within an enterprise context
- Become acquainted with the business benefits of data fabric and data mesh

Who This Book Is For: Anyone who is interested in deploying modern data fabric architectures and data mesh solutions within an enterprise, including IT and business leaders, data governance and data office professionals, data stewards and engineers, data scientists, and information and data architects. Readers should have a basic understanding of enterprise information architecture.

Effective Data Science Infrastructure

Simplify data science infrastructure to give data scientists an efficient path from prototype to production.

In Effective Data Science Infrastructure you will learn how to:
- Design data science infrastructure that boosts productivity
- Handle compute and orchestration in the cloud
- Deploy machine learning to production
- Monitor and manage performance and results
- Combine cloud-based tools into a cohesive data science environment
- Develop reproducible data science projects using Metaflow, Conda, and Docker
- Architect complex applications for multiple teams and large datasets
- Customize and grow data science infrastructure

Effective Data Science Infrastructure: How to make data scientists more productive is a hands-on guide to assembling infrastructure for data science and machine learning applications. It reveals the processes used at Netflix and other data-driven companies to manage their cutting-edge data infrastructure. In it, you'll master scalable techniques for data storage, computation, experiment tracking, and orchestration that are relevant to companies of all shapes and sizes. You'll learn how to make data scientists more productive with your existing cloud infrastructure, a stack of open source software, and idiomatic Python. The author is donating proceeds from this book to charities that support women and underrepresented groups in data science.

About the Technology: Growing data science projects from prototype to production requires reliable infrastructure. Using the powerful techniques and tooling in this book, you can stand up an infrastructure stack that will scale with any organization, from startups to the largest enterprises.

About the Book: Effective Data Science Infrastructure teaches you to build data pipelines and project workflows that will supercharge data scientists and their projects. Based on state-of-the-art tools and concepts that power the data operations of Netflix, this book introduces a customizable cloud-based approach to model development and MLOps that you can easily adapt to your company's specific needs. As you roll out these practical processes, your teams will produce better and faster results when applying data science and machine learning to a wide array of business problems.

What's Inside:
- Handle compute and orchestration in the cloud
- Combine cloud-based tools into a cohesive data science environment
- Develop reproducible data science projects using Metaflow, AWS, and the Python data ecosystem
- Architect complex applications that require large datasets and models, and a team of data scientists

About the Reader: For infrastructure engineers and engineering-minded data scientists who are familiar with Python.

About the Author: At Netflix, Ville Tuulos designed and built Metaflow, a full-stack framework for data science. Currently, he is the CEO of a startup focusing on data science infrastructure.

Quotes:
"By reading and referring to this book, I'm confident you will learn how to make your machine learning operations much more efficient and productive." (From the foreword by Travis Oliphant, author of NumPy, founder of Anaconda, PyData, and NumFOCUS)
"Effective Data Science Infrastructure is a brilliant book. It's a must-have for every data science team." (Ninoslav Cerkez, Logit)
"More data science. Less headaches." (Dr. Abel Alejandro Coronado Iruegas, National Institute of Statistics and Geography of Mexico)
"Indispensable. A copy should be on every data engineer's bookshelf." (Matthew Copple, Grand River Analytics)
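Since Metaflow is the book's core tool, here is a minimal flow sketch showing its step-based structure; the steps and values are illustrative, not an example from the book.

```python
# A minimal Metaflow flow: linear start -> train -> end steps, with
# artifacts assigned to self automatically versioned between steps.
# Run locally with: python training_flow.py run
from metaflow import FlowSpec, step

class TrainingFlow(FlowSpec):

    @step
    def start(self):
        # Load or generate data; illustrative values here.
        self.numbers = [1, 2, 3, 4]
        self.next(self.train)

    @step
    def train(self):
        # Stand-in for actual model training.
        self.total = sum(self.numbers)
        self.next(self.end)

    @step
    def end(self):
        print(f"total = {self.total}")

if __name__ == "__main__":
    TrainingFlow()
```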

ML Ops: Operationalizing Data Science

More than half of the analytics and machine learning (ML) models created by organizations today never make it into production. Instead, many of these ML models do nothing more than provide static insights in a slideshow. If they aren't truly operational, these models can't possibly do what you've trained them to do. This report introduces practical concepts to help data scientists and application engineers operationalize ML models to drive real business change. Through lessons based on numerous projects around the world, six experts in data analytics provide an applied four-step approach (Build, Manage, Deploy and Integrate, and Monitor) for creating ML-infused applications within your organization.

You'll learn how to:
- Fulfill data science value by reducing friction throughout ML pipelines and workflows
- Constantly refine ML models through retraining, periodic tuning, and even complete remodeling to ensure long-term accuracy
- Design the ML Ops lifecycle to ensure that people-facing models are unbiased, fair, and explainable
- Operationalize ML models not only for pipeline deployment but also for external business systems that are more complex and less standardized
- Put the four-step Build, Manage, Deploy and Integrate, and Monitor approach into action
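The Monitor step invites a small illustration. Here is a minimal drift-check sketch, assuming NumPy and SciPy, that compares a live feature distribution against training data with a two-sample Kolmogorov-Smirnov test; the data, feature, and threshold are illustrative, and the KS test is one common choice rather than the report's prescribed method.

```python
# A minimal monitoring sketch: flag distribution drift in one feature with
# a two-sample Kolmogorov-Smirnov test. Data and threshold are illustrative.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # training data
live_feature = rng.normal(loc=0.3, scale=1.0, size=1_000)   # shifted live data

stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:  # illustrative significance threshold
    print(f"Drift detected (KS={stat:.3f}, p={p_value:.4f}); consider retraining.")
else:
    print("No significant drift detected.")
```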