O'Reilly Data Engineering Books

Hands-On Software Engineering with Python - Second Edition

2025-12-23 O'Reilly Amazon

book

Brian Allbee

software-development programming-languages Python Agile/Scrum CI/CD Cloud Computing

Grow your software engineering discipline by incorporating and mastering design, development, testing, and deployment best practices through examples in a realistic Python project structure. Key Features Understand what makes Software Engineering a discipline, distinct from basic programming Gain practical insight into updating, refactoring, and scaling an existing Python system Implement robust testing, CI/CD pipelines, and cloud-ready architecture decisions Book Description Software engineering is more than coding; it’s the strategic design and continuous improvement of systems that serve real-world needs. This newly updated second edition of Hands-On Software Engineering with Python expands on its foundational approach to help you grow into a senior or staff-level engineering role. Fully revised for today’s Python ecosystem, this edition includes updated tooling, practices, and architectural patterns. You’ll explore key changes across five minor Python versions, examine new features like dataclasses and type hinting, and evaluate modern tools such as Poetry, pytest, and GitHub Actions. A new chapter introduces high-performance computing in Python, and the entire development process is enhanced with cloud-readiness in mind. You’ll follow a complete redesign and refactor of a multi-tier system from the first edition, gaining insight into how software evolves and what it takes to do that responsibly. From system modeling and SDLC phases to data persistence, testing, and CI/CD automation, each chapter builds your engineering mindset while updating your hands-on skills. By the end of this book, you'll have mastered modern Python software engineering practices and be equipped to revise and future-proof complex systems with confidence.
What you will learn Distinguish software engineering from general programming Break down and apply each phase of the SDLC to Python systems Create system models to plan architecture before writing code Apply Agile, Scrum, and other modern development methodologies Use dataclasses, pydantic, and schemas for robust data modeling Set up CI/CD pipelines with GitHub Actions and cloud build tools Write and structure unit, integration, and end-to-end tests Evaluate and integrate tools like Poetry, pytest, and Docker Who this book is for This book is for Python developers with a basic grasp of software development who want to grow into senior or staff-level engineering roles. It’s ideal for professionals looking to deepen their understanding of software architecture, system modeling, testing strategies, and cloud-aware development. Familiarity with core Python programming is required, as the book focuses on applying engineering principles to maintain, extend, and modernize real-world systems.

Building a Data and AI Platform with PostgreSQL

2025-12-12 O'Reilly Amazon

book

Jozef de Vries, Tom Taulli, Benjamin Anderson

data data-engineering relational-databases postgresql AI/ML Data Management

In a world where data sovereignty, scalability, and AI innovation are at the forefront of enterprise strategy, PostgreSQL is emerging as the key to unlocking transformative business value. This new guide serves as your beacon for navigating the convergence of AI, open source technologies, and intelligent data platforms. Authors Tom Taulli, Benjamin Anderson, and Jozef de Vries offer a strategic and practical approach to building AI and data platforms that balance innovation with governance, empowering organizations to take control of their data future. Whether you're designing frameworks for advanced AI applications, modernizing legacy infrastructures, or solving data challenges at scale, you can use this guide to bridge the gap between technical complexity and actionable strategy. Written for IT executives, data leaders, and practitioners alike, it will equip you with the tools and insights to harness PostgreSQL's unique capabilities (extensibility, unstructured data management, and hybrid workloads) for long-term success in an AI-driven world. Learn how to build an AI and data platform using PostgreSQL Overcome data challenges like modernization, integration, and governance Optimize AI performance with model fine-tuning and retrieval-augmented generation (RAG) best practices Discover use cases that align data strategy with business goals Take charge of your data and AI future with this comprehensive and accessible roadmap

Just Use Postgres!

2025-11-19 O'Reilly Amazon

book

Denis Magda

data data-engineering relational-databases postgresql AI/ML GenAI

You probably don’t need a collection of specialty databases. Just use Postgres instead! Written for application developers and database pros, Just Use Postgres! shows you how to get the most out of the powerful Postgres database. In Just Use Postgres! you’ll learn how to: Use Postgres as an RDBMS for transactional workloads Develop generative AI, geospatial, and time-series applications Take advantage of modern SQL including window functions and CTEs Perform full-text search and process JSON documents Use Postgres as a message queue Optimize performance with various index types including B-trees, GIN, GiST, HNSW, and more Over the decades, PostgreSQL, aka Postgres, has grown into the most powerful general-purpose database and has become the de facto standard for developers worldwide. Just Use Postgres! takes a modern look at Postgres, exploring the database’s most up-to-date features for AI, time-series, full-text search, geospatial, and other application workloads. About the Technology You know that PostgreSQL is a fast, reliable, SQL compliant RDBMS. You may not know that it’s also great for geospatial systems, time series, full-text search, JSON documents, AI vector embeddings, and many other specialty database functions. For almost any data task you can imagine, you can use Postgres. About the Book Just Use Postgres! covers recipes for using Postgres in dozens of applications normally reserved for single-purpose databases. Written for busy application developers, each chapter explores a different use case illuminating the breadth and depth of Postgres’s capabilities. Along the way, you’ll also meet an incredible ecosystem of Postgres extensions like pgvector, PostGIS, pgmq, and TimescaleDB. You’ll be amazed at everything you can accomplish with Postgres! 
What's Inside Generative AI, geospatial, and time-series applications Modern SQL including window functions and CTEs Full-text search and JSON B-trees, GIN, GiST, HNSW, and more About the Reader For application developers, software engineers, and architects who know the basics of SQL. About the Author Denis Magda is a recognized Postgres expert and software engineer who worked on Java at Sun Microsystems and Oracle before focusing on databases and large-scale distributed systems. Quotes I was pleasantly surprised to learn many new things from this book. - From the Afterword by Vlad Mihalcea An excellent guide covering everything from basics to cutting-edge features. - Dave Cramer, PostgreSQL JDBC Maintainer Pleasant, easy to read with tonnes of great code. - Mike McQuillan, McQTech Ltd Well-organized and easy to search. - Edward Pollack, Microsoft Data Platform MVP The missing guide to understanding and using Postgres. - Mehboob Alam, POSTGRESNX, Inc.

Context Engineering for Multi-Agent Systems

2025-11-18 O'Reilly Amazon

book

Denis Rothman

software-development software-architecture architectural-patterns service-oriented-architecture-soa AI/ML

Build AI that thinks in context using semantic blueprints, multi-agent orchestration, memory, RAG pipelines, and safeguards to create your own Context Engine. Key Features Design semantic blueprints to give AI structured, goal-driven contextual awareness Orchestrate multi-agent workflows with MCP for adaptable, context-rich reasoning Engineer a glass-box Context Engine with high-fidelity RAG, trust, and safeguards Book Description Generative AI is powerful, yet often unpredictable. This guide shows you how to turn that unpredictability into reliability by thinking beyond prompts and approaching AI like an architect. At its core is the Context Engine, a glass-box, multi-agent system you’ll learn to design and apply across real-world scenarios. Written by an AI guru and author of various cutting-edge AI books, this book takes you on a hands-on journey from the foundations of context design to building a fully operational Context Engine. Instead of relying on brittle prompts that give only simple instructions, you’ll begin with semantic blueprints that map goals and roles with precision, then orchestrate specialized agents using the Model Context Protocol. As the engine evolves, you’ll integrate memory and high-fidelity retrieval with citations, implement safeguards against data poisoning and prompt injection, and enforce moderation to keep outputs aligned with policy. You’ll also harden the system into a resilient architecture, then see it pivot across domains, from legal compliance to strategic marketing, proving its domain independence. By the end of this book, you’ll be equipped with the skills to engineer an adaptable, verifiable architecture you can repurpose across domains and deploy with confidence.
What you will learn Develop memory models to retain short-term and cross-session context Craft semantic blueprints and drive multi-agent orchestration with MCP Implement high-fidelity RAG pipelines with verifiable citations Apply safeguards against prompt injection and data poisoning Enforce moderation and policy-driven control in AI workflows Repurpose the Context Engine across legal, marketing, and beyond Deploy a scalable, observable Context Engine in production Who this book is for This book is for AI engineers, software developers, system architects, and data scientists who want to move beyond ad hoc prompting and learn how to design structured, transparent, and context-aware AI systems. It will also appeal to ML engineers and solutions architects with basic familiarity with LLMs who are eager to understand how to orchestrate agents, integrate memory and retrieval, and enforce safeguards.

Pro Oracle GoldenGate 23ai for the DBA: Powering the Foundation of Data Integration and AI

2025-11-17 O'Reilly Amazon

book

Bobby Curtis

data data-engineering oracle-database-solutions AI/ML API AWS

Transform your data replication strategy into a competitive advantage with Oracle GoldenGate 23ai. This comprehensive guide delivers the practical knowledge DBAs and architects need to implement, optimize, and scale Oracle GoldenGate 23ai in production environments. Written by Oracle ACE Director Bobby Curtis, it blends deep technical expertise with real-world business insights from hundreds of implementations across manufacturing, financial services, and technology sectors. Beyond traditional replication, this book explores the groundbreaking capabilities that make GoldenGate 23ai essential for modern AI initiatives. Learn how to implement real-time vector replication for RAG systems, integrate with cloud platforms like GCP and Snowflake, and automate deployments using REST APIs and Python. Each chapter offers proven strategies to deliver measurable ROI while reducing operational risk. Whether you're upgrading from Classic GoldenGate, deploying your first cloud data pipeline, or building AI-ready data architectures, this book provides the strategic guidance and technical depth to succeed. With Bobby's signature direct approach, you'll avoid common pitfalls and implement best practices that scale with your business.
What You Will Learn Master the microservices architecture and new capabilities of Oracle GoldenGate 23ai Implement secure, high-performance data replication across Oracle, PostgreSQL, and cloud databases Configure vector replication for AI and machine learning workloads, including RAG systems Design and build multi-master replication models with automatic conflict resolution Automate deployments and management using RESTful APIs and Python Optimize performance for sub-second replication lag in production environments Secure your replication environment with enterprise-grade features and compliance Upgrade from Classic to Microservices architecture with zero downtime Integrate with cloud platforms including OCI, GCP, AWS, and Azure Implement real-time data pipelines to BigQuery, Snowflake, and other cloud targets Navigate Oracle licensing models and optimize costs Who This Book Is For Database administrators, architects, and IT leaders working with Oracle GoldenGate, whether deploying for the first time, migrating from Classic architecture, or enabling AI-driven replication, will find actionable guidance on implementation, performance tuning, automation, and cloud integration. The book covers unidirectional and multi-master replication and is packed with real-world use cases.

AI Systems Performance Engineering

2025-11-12 O'Reilly Amazon

book

Chris Fregly

data ai-ml artificial-intelligence-ai AI/ML PyTorch

Elevate your AI system performance capabilities with this definitive guide to maximizing efficiency across every layer of your AI infrastructure. In today's era of ever-growing generative models, AI Systems Performance Engineering provides engineers, researchers, and developers with a hands-on set of actionable optimization strategies. Learn to co-optimize hardware, software, and algorithms to build resilient, scalable, and cost-effective AI systems that excel in both training and inference. Authored by Chris Fregly, a performance-focused engineering and product leader, this resource transforms complex AI systems into streamlined, high-impact AI solutions. Inside, you'll discover step-by-step methodologies for fine-tuning GPU CUDA kernels, PyTorch-based algorithms, and multinode training and inference systems. You'll also master the art of scaling GPU clusters for high performance, distributed model training jobs, and inference servers. The book ends with a 175+-item checklist of proven, ready-to-use optimizations. Codesign and optimize hardware, software, and algorithms to achieve maximum throughput and cost savings Implement cutting-edge inference strategies that reduce latency and boost throughput in real-world settings Utilize industry-leading scalability tools and frameworks Profile, diagnose, and eliminate performance bottlenecks across complex AI pipelines Integrate full stack optimization techniques for robust, reliable AI system performance

Keep Safe Using Mobile Tech, 2nd Edition

2025-11-12 O'Reilly Amazon

book

Glenn Fleishman

data data-engineering data-security-privacy Cloud Computing Microsoft

Leverage your smartphone and smartwatch for improved personal safety! Version 2.0, updated November 12, 2025 The digital and “real” worlds can both be scary places. The smartphone (and often smartwatch) you already carry with you can help reduce risks, deter theft, and mitigate violence. This book teaches you to secure your hardware, block abuse, automatically call emergency services, connect with others to ensure you arrive where and when you intended, detect stalking by compact trackers, and keep your ecosystem accounts from Apple, Google, and Microsoft secure. You don’t have to be reminded of the virtual and physical risks you face every day. Some of us are targeted more than others. Modern digital features built into mobile operating systems (and some computer operating systems) can help reduce our anxiety by putting more power in our hands to deter, deflect, block, and respond to abuse, threats, and emergencies. Keep Safe Using Mobile Tech looks at both digital threats, like online abuse and account hijacking, and ones in the physical world, like being stalked through Bluetooth trackers, facing domestic violence, or being in a car crash. The book principally covers the iPhone, Apple Watch, Android devices, and Wear OS watches. It also covers more limited but useful features available on the iPad and on computers running macOS or Windows. This second edition incorporates the massive number of new safety features Google added since October 2024 to the Android operating system, some particular to Google Pixel phones and smartwatches, and improved blocking, filtering, and screening added to Apple’s iOS 26 and related operating system updates in fall 2025. This book explores many techniques to help:

Learn how to harden your Apple Account, Google Account, and Microsoft Account beyond just a password or a text-message token. Discover filtering and blocking tools from Apple and Google that can prevent abusive, fraudulent, and phishing messages and calls from reaching you. Block seeing unwanted sensitive images on your iPhone, iPad, Mac, Apple Watch, or Android phone—and help your kids receive advice on how not to send them. Turn on tracking on your Apple, Google, and Microsoft devices, and use it to recover or erase stolen hardware. Keep your cloud-archived messages from leaking to attackers. Screen calls with an automated assistant so that you know who wants you before picking up and without sending to voicemail. Lock down your devices to keep thieves and other personal invaders from accessing them. Prepare for emergencies by setting up medical information on your mobile devices. Let a supported smartphone or smartwatch recognize when you’re in a car crash or have taken a hard fall and call emergency services for you (and text your emergency contacts) if you can’t respond. Keep track of heart anomalies through smartwatch alerts and tests on your Apple Watch and many Android Wear smartwatches. Tell others where or when you expect to check in with them again, and let your smartphone alert them if you don’t with your Apple iPhone or Android phone. Deter stalking from tiny Bluetooth trackers. Protect your devices and accounts against access from domestic assailants. Block thieves who steal your phone—potentially threatening you or attacking you in person—from gaining access to the rest of your digital life.

Data Engineering for Beginners

2025-11-11 O'Reilly Amazon

book

Chisom Nwokwu

data data-engineering Big Data Cloud Computing Data Engineering Data Governance

A hands-on technical and industry roadmap for aspiring data engineers. In Data Engineering for Beginners, big data expert Chisom Nwokwu delivers a beginner-friendly handbook for everyone interested in the fundamentals of data engineering. Whether you're interested in starting a rewarding new career as a data analyst, data engineer, or data scientist, or seeking to expand your skill set in an existing engineering role, Nwokwu offers the technical and industry knowledge you need to succeed. The book explains: Database fundamentals, including relational and NoSQL databases Data warehouses and data lakes Data pipelines, including info about batch and stream processing Data quality dimensions Data security principles, including data encryption Data governance principles and data framework Big data and distributed systems concepts Data engineering on the cloud Essential skills and tools for data engineering interviews and jobs Data Engineering for Beginners offers an easy-to-read roadmap on a seemingly complicated and intimidating subject. It addresses the topics most likely to cause a beginning data engineer to stumble, clearly explaining key concepts in an accessible way. You'll also find: A comprehensive glossary of data engineering terms Common and practical career paths in the data engineering industry An introduction to key cloud technologies and services you may encounter early in your data engineering career Perfect for practicing and aspiring data analysts, data scientists, and data engineers, Data Engineering for Beginners is an effective and reliable starting point for learning an in-demand skill. It's a powerful resource for everyone hoping to expand their data engineering skill set and upskill in the big data era.

Fundamentals of Software Engineering

2025-11-06 O'Reilly Amazon

book

Dan Vega, Nathaniel Schutta

software-development coding-practices

What do you need to know to be a successful software engineer? Undergraduate curricula and bootcamps may teach the fundamentals of algorithms and writing code, but they rarely cover topics vital to your career advancement. With this practical book, you'll learn the skills you need to succeed and thrive. Authors Nathaniel Schutta and Dan Vega guide your journey with everything from pointers to deep dives into specific topic areas that will help you build the skills that really matter as a software engineer. Understand what software engineering is—and why communication and other soft skills matter Learn the basics of software architecture and architectural drivers Use common and proven techniques to read and refactor code bases Understand the importance of testing and how to implement an effective test suite Learn how to reliably and repeatedly deploy software Know how to evaluate and choose the right solution or tool for a given problem

Mastering Snowflake DataOps with DataOps.live: An End-to-End Guide to Modern Data Management

2025-10-30 O'Reilly Amazon

book

Ronald L. Steelman Jr.

data data-engineering Snowflake Data Management DataOps dbt

This practical, in-depth guide shows you how to build modern, sophisticated data processes using the Snowflake platform and DataOps.live, the only platform that enables seamless DataOps integration with Snowflake. Designed for data engineers, architects, and technical leaders, it bridges the gap between DataOps theory and real-world implementation, helping you take control of your data pipelines to deliver more efficient, automated solutions. You’ll explore the core principles of DataOps and how they differ from traditional DevOps, while gaining a solid foundation in the tools and technologies that power modern data management, including Git, dbt, and Snowflake. Through hands-on examples and detailed walkthroughs, you’ll learn how to implement your own DataOps strategy within Snowflake and maximize the power of DataOps.live to scale and refine your DataOps processes. Whether you're just starting with DataOps or looking to refine and scale your existing strategies, this book, complete with practical code examples and starter projects, provides the knowledge and tools you need to streamline data operations, integrate DataOps into your Snowflake infrastructure, and stay ahead of the curve in the rapidly evolving world of data management. What You Will Learn Explore the fundamentals of DataOps, its differences from DevOps, and its significance in modern data management Understand Git’s role in DataOps and how to use it effectively Know why dbt is preferred for DataOps and how to apply it Set up and manage DataOps.live within the Snowflake ecosystem Apply advanced techniques to scale and evolve your DataOps strategy Who This Book Is For Snowflake practitioners, including data engineers, platform architects, and technical managers, who are ready to implement DataOps principles and streamline complex data workflows using DataOps.live.

Building Data Integration Solutions

2025-10-29 O'Reilly Amazon

book

Jay Borthen

data data-engineering integration-solutions

Are you struggling to manage and make sense of the vast streams of data flowing into your organization? In today's data-driven world, the ability to effectively unify and organize disparate data sources is not just an advantage—it's a necessity. The challenge lies in navigating the complexities of data diversity, volume, and regulatory demands, which can overwhelm even the most seasoned data professionals. In this essential book, Jay Borthen offers a comprehensive guide to understanding the art of data integration. This book dives deep into the processes and strategies necessary for creating effective data pipelines that ensure consistency, accuracy, and accessibility of your data. Whether you're a novice looking to understand the basics or an experienced professional aiming to refine your skills, Borthen's insights and practical advice, grounded in real-world case studies, will empower you to transform your organization's data handling capabilities. Understand various data integration solutions and how different technologies can be employed Gain insights into the relationship between data integration and the overall data life cycle Learn to effectively design, set up, and manage data integration components within pipelines Acquire the knowledge to configure pipelines, perform data migrations, transformations, and more

Apache Hudi: The Definitive Guide

2025-10-27 O'Reilly Amazon

book

Rebecca Bilbro, Prashant Wason, Bhavani Sudha Saktheeswaran, Shiyan Xu

data data-engineering Hadoop apache-hive Analytics Data Lakehouse

Overcome challenges in building transactional guarantees on rapidly changing data by using Apache Hudi. With this practical guide, data engineers, data architects, and software architects will discover how to seamlessly build an interoperable lakehouse from disparate data sources and deliver faster insights using your query engine of choice. Authors Shiyan Xu, Prashant Wason, Bhavani Sudha Saktheeswaran, and Rebecca Bilbro provide practical examples and insights to help you unlock the full potential of data lakehouses for different levels of analytics, from batch to interactive to streaming. You'll also learn how to evaluate storage choices and leverage built-in automated table optimizations to build, maintain, and operate production data applications. Understand the need for transactional data lakehouses and the challenges associated with building them Explore data ecosystem support provided by Apache Hudi for popular data sources and query engines Perform different write and read operations on Apache Hudi tables and effectively use them for various use cases, including batch and stream applications Apply different storage techniques and considerations such as indexing and clustering to maximize your lakehouse performance Build end-to-end incremental data pipelines using Apache Hudi for faster ingestion and fresher analytics

Crafting Engineering Strategy

2025-10-21 O'Reilly Amazon

book

Will Larson

math-science-engineering engineering API

Many engineers assume their organization doesn't have an engineering strategy—when in fact, they often do. It just may not be working. In Crafting Engineering Strategy, Will Larson (author of An Elegant Puzzle, Staff Engineer, and The Engineering Executive's Primer) offers a practical, example-rich guide to navigating technical and organizational complexity through structured, intentional strategy. Written for senior engineers, engineering leaders, architects, and curious collaborators, this book lays out a repeatable process for building effective, actionable strategies—from early diagnosis to rollout. With lessons drawn from real-world case studies at companies like Stripe, Uber, and Calm, Larson provides a framework for shaping critical decisions around system migrations, API deprecations, platform investments, and more. Along the way, you'll learn to augment technical planning with communication, governance, and systems thinking. Whether you're shaping your team's direction or leading a company-wide initiative, Crafting Engineering Strategy will help you make thoughtful decisions that stick. Build durable engineering strategies from first principles Apply methods like Wardley mapping and systems modeling Lead strategy as a staff+ engineer or executive Learn from detailed case studies across industries Improve your strategic fluency and influence over time

FinOps for Snowflake: A Guide to Cloud Financial Optimization

2025-10-21 O'Reilly Amazon

book

Velu Natarajan, Y V Ravi Kumar, Parag Bhardwaj

data data-engineering Snowflake AI/ML Cloud Computing FinOps

Unlock the full financial potential of your Snowflake environment. Learn how to cut costs, boost performance, and take control of your cloud data spend with FinOps for Snowflake, your essential guide to implementing a smart, automated, and Snowflake-optimized FinOps strategy. In today’s data-driven world, financial optimization on platforms like Snowflake is more critical than ever. Whether you're just beginning your FinOps journey or refining mature practices, this book provides a practical roadmap to align Snowflake usage with business goals, reduce costs, and improve performance without compromising agility. Grounded in real-world case studies and packed with actionable strategies, FinOps for Snowflake shows how leading organizations are transforming their environments through automation, governance, and cost intelligence. You'll learn how to apply proven techniques for architecture tuning, workload and storage efficiency, and performance optimization, empowering you to make smarter, data-driven decisions. What You Will Learn Master FinOps principles tailored for Snowflake’s architecture and pricing model Enable collaboration across finance, engineering, and business teams Deliver real-time cost insights for smarter decision-making Optimize compute, storage, and Snowflake AI and ML services for efficiency Leverage Snowflake Cortex AI and Adaptive Warehouse/Compute for intelligent cost governance Apply proven strategies to achieve operational excellence and measurable savings Who This Book Is For Data professionals, cloud engineers, FinOps practitioners, and finance teams seeking to improve cost visibility, operational efficiency, and financial accountability in Snowflake environments.

The SAP Fiori Handbook: A Step-By-Step Guide to SAP Fiori Essentials

2025-10-21 O'Reilly Amazon

book

Manpreet S. Brara, Subba Rao M.V. Parvathaneni

data data-engineering SAP

The SAP Fiori Handbook is your one-stop shop for turbocharging your UX skills so that your enterprise applications are more user-friendly and accessible. This handbook is broadly divided into four sections and provides an in-depth exploration of the SAP Fiori system. Its chapters offer theoretical context as well as detailed, step-by-step explanations of the key concepts, giving you a systematic approach to deepening your understanding of the SAP Fiori environment. The book covers everything from introductory concepts and installation to the key elements of the SAP Fiori system, including the Fiori App, Launchpad Content Manager, and SAP Fiori UI. It also covers important topics such as app support and troubleshooting, and dives into SAP Fiori reports. You Will: Explore the entire SAP Fiori ecosystem Learn to configure and manage SAP Fiori Launchpad content See how to create custom apps and technical catalogs Explore how to implement Spaces and Pages Understand how to use App Support functionality for troubleshooting Explore how to configure and manage Catalogs and Groups using SAP Fiori Launchpad Designer Understand how to convert existing Groups to Pages Get to grips with the Fiori Apps recommendation report as well as the SAP Fiori Upgrade Impact Analysis Report Who Is This Book For: SAP Fiori administrators, consultants, and business analysts, as well as anyone responsible for configuring and maintaining the SAP Fiori launchpad experience for their company.

Advanced Snowflake

2025-10-02 O'Reilly Amazon

book

Muhammad Fasih Ullah

data data-engineering Snowflake AI/ML Analytics Iceberg

As Snowflake's capabilities expand, staying updated with its latest features and functionalities can be overwhelming. The platform's rapid development gave rise to advanced tools like Snowpark and the Native App Framework, which are crucial for optimizing data operations but may seem complex to navigate. In this essential book, author Muhammad Fasih Ullah offers a detailed guide to understanding these sophisticated tools, ensuring you can leverage the full potential of Snowflake for data processing, application development, and deploying machine learning models at scale. You'll gain actionable insights and structured examples to transform your understanding and skills in handling advanced data scenarios within Snowflake. By the end of this book, you will: Grasp advanced features such as Snowpark, Snowflake Native App Framework, and Iceberg tables Enhance your projects with geospatial functions for comprehensive geospatial analytics Interact with Snowflake using a variety of programming languages through Snowpark Implement and manage machine learning models effectively using Snowpark ML Develop and deploy applications within the Snowflake environment

Mastering PostgreSQL Administration: Internals, Operations, Monitoring, and Oracle Migration Strategies

2025-09-30 O'Reilly Amazon

book

Arun Kumar Samayam , Y V Ravi Kumar , Phani Kadambari

data data-engineering relational-databases postgresql Analytics Grafana

This book is your one-stop resource on PostgreSQL system architecture, installation, management, maintenance, and migration. It will help you address the critical needs driving successful database management today: reliability and availability, performance and scalability, security and compliance, cost-effectiveness and flexibility, disaster recovery, and real-time analytics—all in one volume. Each topic in the book is thoroughly explained by industry experts and includes step-by-step instructions for configuring the features, a discussion of common issues and their solutions, and an exploration of real-world scenarios and case studies that illustrate how concepts work in practice. You won't find the book's comprehensive coverage of advanced topics, including migration from Oracle to PostgreSQL, heterogeneous replication, and backup & recovery, in one place—online or anywhere else. What You Will Learn Install PostgreSQL using source code and yum installation Back up and recover Migrate from Oracle database to PostgreSQL using ora2pg utility Replicate from PostgreSQL to Oracle database and vice versa using Oracle GoldenGate Monitor using Grafana, PGAdmin, and command line tools Maintain with VACUUM, REINDEX, etc. Who This Book Is For Intermediate and advanced PostgreSQL users, including PostgreSQL administrators, architects, developers, analysts, disaster recovery system engineers, high availability engineers, and migration engineers

Unlocking dbt: Design and Deploy Transformations in Your Cloud Data Warehouse

2025-09-30 O'Reilly Amazon

book

Dustin Dorsey , Cameron Cyr

data data-engineering storage-repositories data-warehouse CI/CD Cloud Computing

Master the art of data transformation with the second edition of this trusted guide to dbt. Building on the foundation of the first edition, this updated volume offers a deeper, more comprehensive exploration of dbt’s capabilities—whether you're new to the tool or looking to sharpen your skills. It dives into the latest features and techniques, equipping you with the tools to create scalable, maintainable, and production-ready data transformation pipelines. Unlocking dbt, Second Edition introduces key advancements, including the semantic layer, which allows you to define and manage metrics at scale, and dbt Mesh, empowering organizations to orchestrate decentralized data workflows with confidence. You’ll also explore more advanced testing capabilities, expanded CI/CD and deployment strategies, and enhancements in documentation—such as the newly introduced dbt Catalog. As in the first edition, you’ll learn how to harness dbt’s power to transform raw data into actionable insights, while incorporating software engineering best practices like code reusability, version control, and automated testing. From configuring projects with the dbt Platform or open source dbt to mastering advanced transformations using SQL and Jinja, this book provides everything you need to tackle real-world challenges effectively. 
What You Will Learn Understand dbt and its role in the modern data stack Set up projects using both the cloud-hosted dbt Platform and open source project Connect dbt projects to cloud data warehouses Build scalable models in SQL and Python Configure development, testing, and production environments Capture reusable logic with Jinja macros Incorporate version control with your data transformation code Seamlessly connect your projects using dbt Mesh Build and manage a semantic layer using dbt Deploy dbt using CI/CD best practices Who This Book Is For Current and aspiring data professionals, including architects, developers, analysts, engineers, data scientists, and consultants who are beginning the journey of using dbt as part of their data pipeline’s transformation layer. Readers should have a foundational knowledge of writing basic SQL statements, development best practices, and working with data in an analytical context such as a data warehouse.

Modernizing SAP with AWS: A Comprehensive Journey to Cloud Migration, Architecture, and Innovation Strategies

2025-09-26 O'Reilly Amazon

book

Tushar Srivastava

data data-engineering SAP AWS Cloud Computing Cyber Security

Follow the cloud journey of a fictional company, Nimbus Airlines, and the process it goes through to modernize its SAP systems. This book provides a detailed guide for those looking to transition their SAP systems to the cloud using Amazon Web Services (AWS). Told through the lens of various characters, the book is structured in three parts: an introduction to SAP and AWS fundamentals, followed by technical architecture insights, and concluding with migration strategies and case studies. You’ll review the partnership between SAP and AWS, highlighted by their long-standing collaboration and shared innovations. You’ll then design an AWS architecture tailored for SAP workloads, including high availability, disaster recovery, and operations automation. The book concludes with a tour of the migration process, offering various strategies, tools, and frameworks reinforced with real-world customer case studies that showcase successful SAP migrations to AWS. Modernizing SAP with AWS equips business leaders and technical architects with the knowledge to leverage AWS for their SAP systems, ensuring a smooth transition and unlocking new opportunities for innovation. What You Will Learn Understand the fundamentals of AWS and its key components, including computing, storage, networking, and microservices, for SAP systems. Explore the technical partnership between SAP and AWS, learning how their collaboration drives innovation and delivers business value. Design an optimized AWS architecture for SAP workloads, focusing on high availability, disaster recovery, and operations automation. Discover innovative ways to enhance and extend SAP functionality using AWS tools for better system performance and automation. Who This Book Is For SAP professionals and consultants interested in learning how AWS can enhance SAP performance, security, and automation. 
Cloud engineers and developers involved in SAP migration projects, looking for best practices and real-world case studies for successful implementation. Enterprise architects seeking to design optimized, scalable, and secure SAP infrastructure on AWS. CIOs, CTOs, and IT managers aiming to modernize SAP systems and unlock innovation through cloud technology.

Understanding ETL (Updated Edition)

2025-09-25 O'Reilly Amazon

book

Matt Palmer

data data-engineering etl AI/ML BI Data Lakehouse

"Extract, transform, load" (ETL) is at the center of every application of data, from business intelligence to AI. Constant shifts in the data landscape—including the implementations of lakehouse architectures and the importance of high-scale real-time data—mean that today's data practitioners must approach ETL a bit differently. This updated technical guide offers data engineers, engineering managers, and architects an overview of the modern ETL process, along with the challenges you're likely to face and the strategic patterns that will help you overcome them. You'll come away equipped to make informed decisions when implementing ETL and confident about choosing the technology stack that will help you succeed. Discover what ETL looks like in the new world of data lakehouses Learn how to deal with real-time data Explore low-code ETL tools Understand how to best achieve scale, performance, and observability

Apache Polaris: The Definitive Guide

2025-09-17 O'Reilly Amazon

book

Tomer Shiran , Alex Merced , Andrew Madson

data data-engineering storage-repositories data-lake apache-iceberg Data Lakehouse

Revolutionize your understanding of modern data management with Apache Polaris (incubating), the open source catalog designed for Apache Iceberg, the data lakehouse industry standard. This comprehensive guide takes you on a journey through the intricacies of Apache Iceberg data lakehouses, highlighting the pivotal role of Iceberg catalogs. Authors Alex Merced, Andrew Madson, and Tomer Shiran explore Apache Polaris's architecture and features in detail, equipping you with the knowledge needed to leverage its full potential. Data engineers, data architects, data scientists, and data analysts will learn how to seamlessly integrate Apache Polaris with popular data tools like Apache Spark, Snowflake, and Dremio to enhance data management capabilities, optimize workflows, and secure datasets. Get a comprehensive introduction to Iceberg data lakehouses Understand how catalogs facilitate efficient data management and querying in Iceberg Explore Apache Polaris's unique architecture and its powerful features Deploy Apache Polaris locally, and deploy managed Apache Polaris from Snowflake and Dremio Perform basic table operations on Apache Spark, Snowflake, and Dremio

High Performance with MongoDB

2025-09-05 O'Reilly Amazon

book

Ger Hartnett , Asya Kamsky , Alex Bevilacqua

data data-engineering nosql-databases MongoDB DevOps

Practical strategies to help you design, optimize, and operate MongoDB deployments for performance, resilience, and growth Key Features Identify and fix performance bottlenecks with practical diagnostic and optimization strategies Optimize schema design, indexing, storage, and system resources for real-world workloads Scale confidently with in-depth coverage of replication, sharding, and cluster management techniques Purchase of the print or Kindle book includes a free PDF eBook Book Description With data as the new competitive edge, performance has become the need of the hour. As applications handle exponentially growing data and user demand for speed and reliability rises, three industry experts distill their decades of experience to offer you guidance on designing, building, and operating databases that deliver fast, scalable, and resilient experiences. MongoDB’s document model and distributed architecture provide powerful tools for modern applications, but unlocking their full potential requires a deep understanding of architecture, operational patterns, and tuning best practices. This MongoDB book takes a hands-on approach to diagnosing common performance issues and applying proven optimization strategies from schema design and indexing to storage engine tuning and resource management. Whether you’re optimizing a single replica set or scaling a sharded cluster, this book provides the tools to maximize deployment performance. Its modular chapters let you explore query optimization, connection management, and monitoring or follow a complete learning path to build a rock-solid performance foundation. With real-world case studies, code examples, and proven best practices, you’ll be ready to troubleshoot bottlenecks, scale efficiently, and keep MongoDB running at peak performance in even the most demanding production environments. 
What you will learn Diagnose and resolve common performance bottlenecks in deployments Design schemas and indexes that maximize throughput and efficiency Tune the WiredTiger storage engine and manage system resources for peak performance Leverage sharding and replication to scale and ensure uptime Monitor, debug, and maintain deployments proactively to prevent issues Improve application responsiveness through client driver configuration Who this book is for This book is for developers, database administrators, system architects, and DevOps engineers focused on performance optimization of MongoDB. Whether you’re building high-throughput applications, managing deployments in production, or scaling distributed systems, you’ll gain actionable insights. Basic knowledge of MongoDB is assumed, with chapters designed progressively to support learners at all levels.

MongoDB Essentials

2025-09-05 O'Reilly Amazon

book

The MongoDB Team

data data-engineering nosql-databases MongoDB AI/ML Data Modelling

Get started fast with MongoDB architecture, core operations, and AI-powered tools for building intelligent applications Free with your book: DRM-free PDF version + access to Packt's next-gen Reader Key Features Quickly grasp the MongoDB architecture and distributed design principles Learn practical data modeling, CRUD operations, and aggregation techniques Explore AI-enabled tools for building intelligent applications with MongoDB Purchase of the print or Kindle book includes a free PDF eBook Book Description Modern applications demand flexibility, speed, and intelligence, and MongoDB delivers all three. This mini guide wastes no time, offering a concise, practical introduction to handling data flexibly and efficiently with MongoDB. MongoDB Essentials helps developers, architects, database administrators, and decision makers get started quickly and confidently. The book introduces MongoDB’s core principles, from the document data model to its distributed architecture, including replica sets and sharding. It then helps you build hands-on skills such as installing MongoDB, designing effective data schemas, performing CRUD operations, and working with the aggregation pipeline. You’ll discover performance tips along the way and learn how AI-enhanced tools like Atlas Search and Atlas Vector Search power intelligent application development. With clear explanations and a practical approach, this book gives you the foundation and skills you need to start working with MongoDB right away. 
Email sign-up and proof of purchase required What you will learn Understand MongoDB's document model and architecture Set up local MongoDB deployments quickly Design schemas tailored to application access patterns Perform CRUD and aggregation operations efficiently Use tools to optimize query performance and scalability Explore AI-powered features such as Atlas Search and Atlas Vector Search Who this book is for This book is for anyone looking to explore MongoDB, including students, developers, system architects, managers, database administrators, and decision makers who want to familiarize themselves with what a modern database can offer. Whether you're building your first application or exploring what MongoDB can do for you, this book is the ideal starting point for your MongoDB journey.

The Official MongoDB Guide

2025-09-05 O'Reilly Amazon

book

Alison Huh , Jeffrey Allen , Maya Raman , Parker Faucher , Lauren Tran , Lander Kerbey , Rachelle Palmer

data data-engineering nosql-databases MongoDB AI/ML API

The official guide to MongoDB architecture, tools, and cloud features, written by leading MongoDB subject matter experts to help you build secure, scalable, high-performance applications Key Features Design resilient, secure solutions with high performance and scalability Streamline development with modern tooling, indexing, and AI-powered workflows Deploy and optimize in the cloud using advanced MongoDB Atlas features Purchase of the print or Kindle book includes a free PDF eBook Book Description Delivering secure, scalable, and high-performance applications is never easy, especially when systems must handle growth, protect sensitive data, and perform reliably under pressure. The Official MongoDB Guide addresses these challenges with guidance from MongoDB’s top subject matter experts, so you learn proven best practices directly from those who know the technology inside out. This book takes you from core concepts and architecture through to advanced techniques for data modeling, indexing, and query optimization, supported by real-world patterns that improve performance and resilience. It offers practical coverage of developer tooling, IDE integrations, and AI-assisted workflows that will help you work faster and more effectively. Security-focused chapters walk you through authentication, authorization, encryption, and compliance, while chapters dedicated to MongoDB Atlas showcase its robust security features and demonstrate how to deploy, scale, and leverage platform-native capabilities such as Atlas Search and Atlas Vector Search. By the end of this book, you’ll be able to design, build, and manage MongoDB applications with the confidence that comes from learning directly from the experts shaping the technology. 
What you will learn Build secure, scalable, and high-performance applications Design efficient data models and indexes for real workloads Write powerful queries to sort, filter, and project data Protect applications with authentication and encryption Accelerate coding with AI-powered and IDE-based tools Launch, scale, and manage MongoDB Atlas with confidence Unlock advanced features like Atlas Search and Atlas Vector Search Apply proven techniques from MongoDB's own engineering leaders Who this book is for This book is for developers, database professionals, architects, and platform teams who want to get the most out of MongoDB. Whether you’re building web apps, APIs, mobile services, or backend systems, the concepts covered here will help you structure data, improve performance, and deliver value to your users. No prior experience with MongoDB is required, but familiarity with databases and programming will be helpful.

Data Modeling with Snowflake - Second Edition

2025-09-02 O'Reilly Amazon

book

Serge Gershkovich

data data-engineering Snowflake Cloud Computing Data Management Data Modelling

Data Modeling with Snowflake provides a clear and practical guide to mastering data modeling tailored to the Snowflake Data Cloud. By integrating foundational principles of database modeling with Snowflake's unique features and functionality, this book empowers you to create scalable, cost-effective, and high-performing data solutions. What this Book will help me do Apply universal data modeling concepts within the Snowflake platform effectively. Leverage Snowflake's features such as Time Travel and Zero-Copy Cloning for optimized data solutions. Understand and utilize advanced techniques like Data Vault and Data Mesh for scalable data architecture. Master handling semi-structured data in Snowflake using practical recipes and examples. Achieve cost efficiency and resource optimization by aligning modeling principles with Snowflake's architecture. Author(s) Serge Gershkovich is an accomplished data engineer and seasoned professional in data architecture and modeling. With a passion for simplifying complex concepts, Serge's work leverages his years of hands-on experience to guide readers in mastering both foundational and advanced data management practices. His clear and practical approach ensures accessibility for all levels. Who is it for? This book is ideal for data developers and engineers seeking practical modeling guidance within Snowflake. It's suitable for data analysts looking to broaden their database design expertise, and for database beginners aiming to get a head start in structuring data. Professionals new to Snowflake will also find its clear explanations of key features aligned with modeling techniques invaluable.
