data

Head First Statistics for Data Analysis

2027-07-25 · O'Reilly Data Science Books O'Reilly Amazon

book

by Dawn Griffiths

data-science data-science-tasks statistics

What will you learn from this book? Do you need to analyze data but feel lost in a sea of numbers? Your guide is here, without the dry, academic jargon. This hands-on, visually rich book introduces key statistical concepts and shows you how to apply them using Excel. Whether you're a data analyst, a business professional, or just someone who wants to make better decisions with data, you'll gain the practical skills needed to extract meaningful insights. From probability and confidence intervals to regression and forecasting, this book makes statistics approachable, relevant, and, even better, understandable.

What's so special about this book? If you've read a Head First book before, you know what to expect: a uniquely engaging, brain-friendly approach that helps you truly learn instead of struggling through dense theory. Through clear explanations, hands-on exercises, and interactive visuals, you'll develop the skills to confidently analyze data and make informed decisions. No more guesswork, just real statistical insights at your fingertips.

Snowflake: The Definitive Guide, 2nd Edition

2027-05-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Joyce Kaye Avila

AI/ML Analytics Cloud Computing Data Governance Data Management GenAI Iceberg Cyber Security Snowflake SQL data-engineering

Snowflake is reshaping data management by integrating AI, analytics, and enterprise workloads into a single cloud platform. Snowflake: The Definitive Guide is a comprehensive resource for data architects, engineers, and business professionals looking to harness Snowflake's evolving capabilities, including Cortex AI, Snowpark, and Polaris Catalog for Apache Iceberg. This updated edition provides real-world strategies and hands-on activities for optimizing performance, securing data, and building AI-driven applications.

With hands-on SQL examples and best practices, this book helps readers process structured and unstructured data, implement scalable architectures, and integrate Snowflake's AI tools seamlessly. Whether you're setting up accounts, managing access controls, or leveraging generative AI, this guide equips you with the expertise to maximize Snowflake's potential.

- Implement AI-powered workloads with Snowflake Cortex
- Explore Snowsight and Streamlit for no-code development
- Ensure security with access control and data governance
- Optimize storage, queries, and computing costs
- Design scalable data architectures for analytics and machine learning

The Data Engineer's Guide to Microsoft Fabric

2027-05-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Christian Henrik Reich (twoday Data & AI)

Data Engineering Data Lakehouse Databricks ETL/ELT Microsoft Fabric Python Spark SQL Data Streaming analytics-platforms data-science

Modern data engineering is evolving, and with Microsoft Fabric, the entire data platform experience is being redefined. This essential book offers a fresh, hands-on approach to navigating this shift. Rather than being an introduction to features, this guide explains how Fabric's key components (Lakehouse, Warehouse, and Real-Time Intelligence) work under the hood and how to put them to use in realistic workflows.

Written by Christian Henrik Reich, a data engineering expert with experience that extends from Databricks to Fabric, this book is a blend of foundational theory and practical implementation of lakehouse solutions in Fabric. You'll explore how engines like Apache Spark and Fabric Warehouse collaborate with Fabric's Real-Time Intelligence solution in an integrated platform, and how to build ETL/ELT pipelines that deliver on speed, accuracy, and scale. Ideal for both new and practicing data engineers, this is your entry point into the fabric of the modern data platform.

- Acquire a working knowledge of lakehouses, warehouses, and streaming in Fabric
- Build resilient data pipelines across real-time and batch workloads
- Apply Python, Spark SQL, T-SQL, and KQL within a unified platform
- Gain insight into architectural decisions that scale with data needs
- Learn actionable best practices for engineering clean, efficient, governed solutions

Universal Data Modeling

2027-05-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jun Shan

AI/ML Analytics Data Modelling Data Quality NoSQL data-engineering data-models

Most data professionals work with multiple datasets scattered across teams, systems, and formats. But without a clear modeling strategy, the result is often chaos: mismatched schemas, fragile pipelines, and a constant fight to make sense of the noise. This essential guide offers a better way by introducing a practical framework for designing high-quality data models that work across platforms while supporting the growing demands of AI, analytics, and real-time systems.

Author Jun Shan bridges the gap between disconnected modeling approaches and the need for a unified, system-agnostic methodology. Whether you're building a new data platform or rethinking legacy infrastructure, Universal Data Modeling gives you the clarity, patterns, and tools to model data that's consistent, resilient, and ready to scale.

- Connect conceptual, logical, and physical modeling phases with confidence
- Apply best-fit techniques across relational, semistructured, and NoSQL formats
- Improve data quality, clarity, and maintainability across your organization
- Support modern design paradigms like data mesh and data products
- Translate domain knowledge into models that empower teams
- Build flexible, scalable models that stand the test of technology change

PostgreSQL: Up and Running, 4th Edition

2027-02-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Leo S. Hsu, Regina Obe

JSON SQL data-engineering postgresql relational-databases

Thinking of migrating to PostgreSQL? This concise introduction helps you understand and use this open source database system. Not only will you learn about the new enterprise-class features in versions 16 to 18, but you'll also discover all that PostgreSQL has to offer: much more than a relational database system. As an open source product, it has hundreds of plug-ins, expanding the capability of PostgreSQL beyond all other database systems.

With examples throughout, this book shows you how to perform tasks that are difficult or impossible in other databases. The revised fourth edition covers the latest features of Postgres, such as ISO-SQL constructs rarely found in other databases, foreign data wrapper (FDW) enhancements, JSON constructs, multirange data types, query parallelization, and replication. If you're an experienced PostgreSQL user, you'll pick up gems you may have missed before.

- Learn basic administration tasks such as role management, database creation, backup, and restore
- Use the psql command-line utility and the pgAdmin graphical administration tool
- Explore PostgreSQL tables, constraints, and indexes
- Learn powerful SQL constructs not generally found in other databases
- Use several different languages to write database functions and stored procedures
- Tune your queries to run as fast as your hardware will allow
- Query external and variegated data sources with foreign data wrappers
- Learn how to use built-in replication to replicate data

AI Engineering Interviews

2026-12-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mina Ghashami, Ali Torkamani

AI/ML GenAI ai-ml artificial-intelligence-ai generative-ai

Generative AI is rapidly spreading across industries, and companies are actively hiring people who can design, build, and deploy these systems. But to land one of these roles, you'll have to get through the interview first. Generative AI Interviews walks you through every stage of the interview process, giving you an insider's perspective that will help you build confidence and stand out.

This handy guide features 300 real-world interview questions organized by difficulty level, each with a clear outline of what makes a good answer, common pitfalls to avoid, and key points you shouldn't miss. What sets this book apart from others is Mina Ghashami and Ali Torkamani's knack for simplifying complex concepts into intuitive explanations, accompanied by compelling illustrations that make learning engaging. If you're looking for a guide to cracking GenAI interviews, this is it.

- Master GenAI interviews for roles from fundamental to advanced
- Explore 300 real industry interview questions with model answers and breakdowns
- Learn a step-by-step approach to explaining architecture, training, inference, and evaluation
- Get actionable insights that will help you stand out in even the most competitive hiring process

An Illustrated Guide to AI Agents

2026-12-25 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Jay Alammar (Cohere), Maarten Grootendorst

AI/ML LLM ai-ml artificial-intelligence-ai generative-ai

Artificial intelligence is entering a new phase. No longer limited to answering prompts or completing simple writing tasks, AI agents can now reason, plan, and act with increasing independence. From accelerating scientific breakthroughs to supporting creative work, these systems are quickly reshaping industries and everyday life. This book provides the conceptual foundation and practical insights you need to understand, and effectively work with, this emerging technology.

Through hundreds of clear graphic illustrations, Maarten Grootendorst and Jay Alammar explain how AI agents are built, how they think, and where they're heading. Designed for professionals, students, and curious learners alike, this guide goes beyond the buzz to reveal what's actually happening inside these systems, why it matters, and how to apply the knowledge in real-world contexts. With its visual storytelling and accessible explanations, An Illustrated Guide to AI Agents is your essential reference for navigating the next frontier of artificial intelligence.

- Explore the core architecture of AI agents: tools, memory, and planning
- Understand reasoning LLMs, multimodal models, and multi-agent collaboration
- Learn advanced methods, including distillation, quantization, and reinforcement learning
- Evaluate real-world applications, strengths, and limitations of AI agents

Context Engineering with DSPy

2026-12-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mike Taylor (Hopkins Architects)

AI/ML LLM RAG ai-ml artificial-intelligence-ai generative-ai

AI agents need the right context at the right time to do a good job. Too much input increases cost and harms accuracy, while too little causes instability and hallucinations. Context Engineering with DSPy introduces a practical, evaluation-driven way to design AI systems that remain reliable, predictable, and easy to maintain as they grow.

AI engineer and educator Mike Taylor explains DSPy in a clear, approachable style, showing how its modular structure, portable programs, and built-in optimizers help teams move beyond guesswork. Through real examples and step-by-step guidance, you'll learn how DSPy's signatures, modules, datasets, and metrics work together to solve context engineering problems that evolve as models change and workloads scale. This book supports AI engineers, data scientists, machine learning practitioners, and software developers building AI agents, retrieval-augmented generation (RAG) systems, and multistep reasoning workflows that hold up in production.

- Understand the core ideas behind context engineering and why they matter
- Structure LLM pipelines with DSPy's maintainable, reusable components
- Apply evaluation-driven optimizers like GEPA and MIPROv2 for measurable improvements
- Create reproducible RAG and agentic workflows with clear metrics
- Develop AI systems that stay robust across providers, model updates, and real-world constraints

Building Data Products

2026-11-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jean-Georges Perrin (Actian)

AI/ML API CI/CD Data Contracts DevOps Cyber Security data-engineering

As organizations grapple with fragmented data, siloed teams, and inconsistent pipelines, data products have emerged as a practical solution for delivering trusted, scalable, and reusable data assets. In Building Data Products, Jean-Georges Perrin provides a comprehensive, standards-driven playbook for designing, implementing, and scaling data products that fuel innovation and cross-functional collaboration, whether or not your organization adopts a full data mesh strategy.

Drawing on extensive industry experience and practitioner interviews, Perrin shows readers how to build metadata-rich, governed data products aligned to business domains. Covering foundational concepts, real-world use cases, and emerging standards like Bitol ODPS and ODCS, this guide offers step-by-step implementation advice and practical code examples for key stages: ownership, observability, active metadata, compliance, and integration.

- Design data products for modular reuse, discoverability, and trust
- Implement standards-driven architectures with rich metadata and security
- Incorporate AI-driven automation, SBOMs, and data contracts
- Scale product-driven data strategies across teams and platforms
- Integrate data products into APIs, CI/CD pipelines, and DevOps practices

Evals for AI Engineers

2026-10-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Hamel Husain, Shreya Shankar

AI/ML LLM ai-ml artificial-intelligence-ai artificial intelligence (ai)

Stop using guesswork to find out how your AI applications are performing. Evals for AI Engineers equips you with the proven tools and processes required to systematically test, measure, and enhance the reliability of AI applications, especially those using LLMs. Written by AI engineers with extensive experience in real-world consulting (across 35+ AI products) and cutting-edge research, this practical resource will help you move from assumptions to robust, data-driven evaluation.

Ideal for software engineers, technical product managers, and technical leads, this hands-on guide dives into techniques like error analysis, synthetic data generation, automated LLM-as-a-judge systems, production monitoring, and cost optimization. You'll learn how to debug LLM behavior, design test suites based on synthetic and real data, and build data flywheels that improve over time. Whether you're starting without user data or scaling a production system, you'll gain the skills to build AI you can trust, with processes that are repeatable, measurable, and aligned with real-world outcomes.

- Run systematic error analyses to uncover, categorize, and prioritize failure modes
- Build, implement, and automate evaluation pipelines using code-based and LLM-based metrics
- Optimize AI performance and costs through smart evaluation and feedback loops
- Apply key principles and techniques for monitoring AI applications in production

Analytics Engineering with Microsoft Fabric and Power BI

2026-09-25 · O'Reilly Data Science Books O'Reilly Amazon

book

by Nikola Ilic, Shabnam Watson

Analytics Analytics Engineering BI Data Analytics Microsoft Fabric Power BI business-intelligence data-science microsoft-power-platform power-bi

While Microsoft Power BI has dominated the business intelligence market for years and is a go-to tool for creating visually appealing, interactive reports and dashboards, it's now an integral part of Microsoft Fabric, the end-to-end analytics platform that offers unprecedented flexibility and scalability for building enterprise-grade data analytics solutions.

This book covers everything analytics engineers need to know to design and implement robust and efficient analytics solutions using Microsoft Fabric and Power BI. You'll learn the core components of Fabric, such as lakehouses, warehouses, and eventhouses, and how to work with semantic models, ensuring that data is structured and ready for analysis. You'll also discover essential techniques in both Microsoft Fabric and Power BI that you can apply in your day-to-day work.

- Explore the core components of Microsoft Fabric
- Implement, manage, and optimize Power BI semantic models
- Discover numerous architectural solutions with Microsoft Fabric and Power BI
- Build Fabric items such as lakehouses, warehouses, semantic models, and more, and share them within your organization
- Identify when to use a particular Fabric item or implement a particular design pattern
- Implement the analytics development lifecycle
- Optimize and fine-tune existing analytics solutions

Causal Inference with Bayesian Networks

2026-09-18 · O'Reilly Data Science Books O'Reilly Amazon

book

by Yousri El Fattah, Reza Bagheri

AI/ML Python bayesian-statistics data-science data-science-tasks statistics

Leverage the power of graphical models for probabilistic and causal inference to build knowledge-based system applications and to address causal effect queries with observational data for decision aiding and policy making.

Key Features

- Gain a firm understanding of Bayesian networks and structured algorithms for probabilistic inference
- Acquire a comprehensive understanding of graphical models and their applications in causal inference
- Gain insights into real-world applications of causal models in multiple domains
- Enhance your coding skills in R and Python through hands-on examples of causal inference

Book Description

This is a practical guide that explores the theory and application of Bayesian networks (BN) for probabilistic and causal inference. The book provides step-by-step explanations of graphical models of BN and their structural properties; the causal interpretations of BN and the notion of conditioning by intervention; and the mathematical model of structural equations and the representation in structural causal models (SCM).

For probabilistic inference in Bayesian networks, you will learn methods of variable elimination and tree clustering. For causal inference, you will learn the computational framework of Pearl's do-calculus for the identification and estimation of causal effects with causal models. In the context of causal inference with observational data, you will be introduced to the potential outcomes framework and explore various classes of meta-learning algorithms that are used to estimate the conditional average treatment effect in causal inference.

The book includes practical exercises using R and Python for you to engage in and solidify your understanding of different approaches to probabilistic and causal inference. By the end of this book, you will be able to build and deploy your own causal inference application. You will learn from causal inference sample use cases for diagnosis, epidemiology, social sciences, economics, and finance.

What you will learn

- Representation of knowledge with Bayesian networks
- Interpretation of conditional independence assumptions
- Interpretation of causality assumptions in graphical models
- Probabilistic inference with Bayesian networks
- Causal effect identification and estimation
- Machine learning methods for causal inference
- Coding in R and Python for probabilistic and causal inference

Who this book is for

This book will serve as a valuable resource for a wide range of professionals including data scientists, software engineers, policy analysts, decision-makers, information technology professionals involved in developing expert systems or knowledge-based applications that deal with uncertainty, as well as researchers across diverse disciplines seeking insights into causal analysis and estimating treatment effects in randomized studies. The book will enable readers to leverage libraries in R and Python and build software prototypes for their own applications.

AI Agents with MCP

2026-08-25 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Kyle Stratis (Stratis Data Labs)

AI/ML GenAI Python ai-agents ai-ml artificial-intelligence-ai

Since its release in late 2024, Anthropic's Model Context Protocol (MCP) has redefined how developers build and connect AI agents to tools, data, and each other. AI Agents with MCP is the first comprehensive guide to this rapidly emerging standard, helping engineers unlock its full potential with hands-on projects. Whether you're developing agentic workflows, bridging tools across platforms, or creating robust multiagent systems, this book walks you through every layer of MCP, from protocol structure to server and client implementation.

Author Kyle Stratis provides the practical expertise needed to build fully functional MCP servers, clients, and more. Unlike high-level overviews or fragmented documentation, this book gives you a deep systems-level understanding of MCP's capabilities and limitations. With its flexible, model-agnostic design, MCP continues to gain traction across the generative AI community; this book ensures you're ready to build with it confidently and effectively.

- Understand the structure and core concepts of the Model Context Protocol
- Build complete MCP servers, clients, and transport layers in Python
- Consume tools, prompts, and data via MCP-based agent workflows
- Extend agent capabilities with MCP for large-scale and AI-native systems

Practical Statistics for Data Scientists, 3rd Edition

2026-08-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Andrew Bruce, Peter Bruce, Peter Gedeck

Data Science Python data-science data-science-tasks statistics

Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. And many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.

Elasticsearch Query Language the Definitive Guide

2026-06-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Bahaaldine Azarmi, Alexis Charveriat, Stephen Brown, Farbod Shirzadian, Alejandro Sanchez

Analytics BI Data Analytics Data Management ELK Cyber Security data-engineering elasticsearch search

Streamline your workflow with ESQL, enhance data analysis with real-time insights, and speed up aggregations and visualizations.

Key Features

- Apply ESQL efficiently in analytics, observability, and cybersecurity
- Optimize performance and scalability for high-demand environments
- Discover how to visualize and debug ESQL queries
- Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Built to simplify high-scale data analytics in Elasticsearch, this practical guide will take you from foundational concepts to advanced applications across search, observability, and security. It will help you overcome common challenges such as efficiently querying large datasets, applying advanced analytics without deep prior knowledge, and addressing the need for a single, consolidated query language. Written by senior experts at Elastic with extensive field experience, this book delivers actionable guidance rooted in solving today's data challenges at scale.

After introducing ESQL and its architecture, the chapters explore real-world applications across various domains, including analytics, raw log analysis, observability, and cybersecurity. Advanced topics such as scaling, optimization, and future developments are also covered to help you maximize your ESQL capabilities. By the end of this book, you'll be able to leverage ESQL for comprehensive data management and analysis, optimizing your workflows and enhancing your productivity with Elasticsearch.

What you will learn

- Gain a solid understanding of ESQL and its architecture
- Use ESQL for data analysis and performance monitoring
- Apply ESQL in cybersecurity for threat detection and incident response
- Find out how to perform advanced searches using ESQL
- Prepare for future ESQL developments
- Showcase ESQL in action through real-world, persona-driven use cases

Who this book is for

If you're an Elasticsearch user, this book is essential for your growth. Whether you're a data analyst looking to build analytics on top of Elasticsearch, an SRE monitoring the health of your IT system, or a cybersecurity analyst, this book will give you a complete understanding of how ESQL is built and used. Additionally, database administrators, business intelligence professionals, and operational intelligence professionals will find this book invaluable. Even with a beginner-level knowledge of Elasticsearch, you'll be able to get started and make the most of this comprehensive guide.

Learn D3.js - Second Edition

2026-06-26 · O'Reilly Data Science Books O'Reilly Amazon

book

by Helder Da Rocha

DataViz HTML JavaScript React d3 data-science data-science-tasks data-visualization

Master data visualization with D3.js v7 using modern web standards and real-world projects to build interactive charts, maps, and visual narratives.

Key Features

- Build dynamic, data-driven visualizations using D3.js v7 and ES2015+
- Create bar, scatter, and network charts, geographic maps, and more
- Learn through step-by-step tutorials backed by hundreds of downloadable examples
- Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Learn D3.js, Second Edition, is a fully updated guide to building interactive, standards-compliant web visualizations using D3.js v7 and modern JavaScript. Whether you're a developer, designer, data journalist, or analyst, this book will help you master the core techniques for transforming data into compelling, meaningful visuals. Starting with fundamentals like selections, data binding, and SVG, the book progressively covers scales, axes, animations, hierarchical data, and geographical maps. Each chapter includes short examples and a full hands-on project with downloadable code you can run, modify, and use in your own work.

This new edition introduces improved chapter structure, updated code samples using ES2015 standards, and better formatting for readability. There's also a dedicated chapter that focuses on integrating D3 with modern frameworks like React and Vue, along with performance, accessibility, and deployment strategies. For those migrating from older versions of D3, a detailed appendix is included at the end. With thoughtful pedagogy and a practical approach, this book remains one of the most thorough and respected resources for learning D3.js and helps you truly leverage data visualization.

What you will learn

- Bind data to DOM elements and apply transitions and styles
- Build bar, line, pie, scatter, tree, and network charts
- Create animated, interactive behaviors with zoom, drag, and tooltips
- Visualize hierarchical data, flows, and maps using D3 layouts and projections
- Use D3 with HTML5 Canvas for high-performance rendering
- Develop accessible and responsive D3 apps for all screen sizes
- Integrate D3 with frameworks like React and Vue
- Migrate older D3 codebases to version 7

Who this book is for

This book is for web developers, data journalists, designers, analysts, and anyone who wants to create interactive, web-based data visualizations. A basic understanding of HTML, CSS, and JavaScript is recommended. No prior knowledge of SVG or D3 is required.

Data Engineering for Multimodal AI

2026-05-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Vasundra Srinivasan

AI/ML Cloud Computing Data Engineering Data Governance ETL/ELT MLOps Cyber Security data-engineering

A shift is underway in how organizations approach data infrastructure for AI-driven transformation. As multimodal AI systems and applications become increasingly sophisticated and data hungry, data systems must evolve to meet these complex demands. Data Engineering for Multimodal AI is one of the first practical guides for data engineers, machine learning engineers, and MLOps specialists looking to rapidly master the skills needed to build robust, scalable data infrastructures for multimodal AI systems and applications.

You'll follow the entire lifecycle of AI-driven data engineering, from conceptualizing data architectures to implementing data pipelines optimized for multimodal learning in both cloud native and on-premises environments. And each chapter includes step-by-step guides and best practices for implementing key concepts.

- Design and implement cloud native data architectures optimized for multimodal AI workloads
- Build efficient and scalable ETL processes for preparing diverse AI training data
- Implement real-time data processing pipelines for multimodal AI inference
- Develop and manage feature stores that support multiple data modalities
- Apply data governance and security practices specific to multimodal AI projects
- Optimize data storage and retrieval for various types of multimodal ML models
- Integrate data versioning and lineage tracking in multimodal AI workflows
- Implement data-quality frameworks to ensure reliable outcomes across data types
- Design data pipelines that support responsible AI practices in a multimodal context

Generative AI on Microsoft Azure

2026-05-25 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Jorge Garcia Ximenez, Jaime De Mora, Adrian Gonzalez Sanchez (Microsoft)

AI/ML Azure Databricks GenAI GitHub Microsoft RAG Snowflake ai-ml artificial-intelligence-ai generative-ai

Companies are now moving generative AI projects from the lab to production environments. To support these increasingly sophisticated applications, they're turning to advanced practices such as multiagent architectures and complex code-based frameworks. This practical handbook shows you how to leverage cutting-edge techniques using Microsoft's powerful ecosystem of tools to deploy trustworthy AI systems tailored to your organization's needs.

Written for and by AI professionals, Generative AI on Microsoft Azure goes beyond the technical core aspects, examining underlying principles, tools, and practices in depth, from the art of prompt engineering to strategies for fine-tuning models to advanced techniques like retrieval-augmented generation (RAG) and agentic AI. Through real-world case studies and insights from top experts, you'll learn how to harness AI's full potential on Azure, paving the way for groundbreaking solutions and sustainable success in today's AI-driven landscape.

- Understand the technical foundations of generative AI and how the technology has evolved over the last few years
- Implement advanced GenAI applications using Microsoft services like Azure AI Foundry, Copilot, GitHub Models, Azure Databricks, and Snowflake on Azure
- Leverage patterns, tools, frameworks, and platforms to customize AI projects
- Manage, govern, and secure your AI-enabled systems with responsible AI practices
- Build upon expert guidance to avoid common pitfalls, future-proof your applications, and more

High Performance Spark, 2nd Edition

2026-05-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rachel Warren, Holden Karau (Fight Health Insurance), Adi Polak (Treeverse)

AI/ML Data Science Kubernetes PySpark PyTorch Spark apache-spark data-engineering

Apache Spark is amazing when everything clicks. But if you haven't seen the performance improvements you expected or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau, Rachel Warren, and Anya Bida walk you through the secrets of the Spark code base, and demonstrate performance optimizations that will help your data pipelines run faster, scale to larger datasets, and avoid costly antipatterns.

Ideal for data engineers, software engineers, data scientists, and system administrators, the second edition of High Performance Spark presents new use cases, code examples, and best practices for Spark 3.x and beyond. This book gives you a fresh perspective on this continually evolving framework and shows you how to work around bumps on your Spark and PySpark journey.

With this book, you'll learn how to:

- Accelerate your ML workflows with integrations including PyTorch
- Handle key skew and take advantage of Spark's new dynamic partitioning
- Make your code reliable with scalable testing and validation techniques
- Make Spark high performance
- Deploy Spark on Kubernetes and similar environments
- Take advantage of GPU acceleration with RAPIDS and resource profiles
- Get your Spark jobs to run faster
- Use Spark to productionize exploratory data science projects
- Handle even larger datasets with Spark
- Gain faster insights by reducing pipeline running times

Designing AI Interfaces

2026-04-25 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Louise Macfadyen

AI/ML LLM ai-ml artificial-intelligence-ai generative-ai

As artificial intelligence becomes central to modern product design, UX professionals must adapt their toolkits to meet new demands. In Designing AI Interfaces, senior product designer Louise Macfadyen offers a timely, practice-oriented guide for building intuitive, ethical, and effective user experiences with large language models (LLMs) and autonomous AI systems. From content moderation to interruptibility, this book presents actionable design patterns for today's most advanced AI interactions, with clear technical insights to help designers understand how AI systems process inputs, generate outputs, and make decisions on users' behalf.

Written specifically for product designers navigating the AI transition, this book provides concrete strategies for managing risk, enabling transparency, and fostering user trust in increasingly agentic systems. Readers will learn how to enable users to steer and shape AI responses in real time, incorporate ethical and UX principles into actionable design strategies, and navigate trade-offs in autonomy and control, all while gaining fluency in key AI concepts to collaborate more effectively with engineering teams.

- Design effective and ethical interfaces for LLMs and AI agents
- Apply best-practice patterns for content warnings, permissions, and oversight
- Gain a mental model for how AI systems reason and act
- Collaborate confidently with engineering and product teams
- Evaluate your org's AI maturity and advocate for responsible implementation

talk-data.com
