talk-data.com

Event

O'Reilly Data Engineering Books

2001-10-19 – 2027-05-25 O'Reilly

Activities tracked

407

Collection of O'Reilly books on Data Engineering.

Filtering by: Microsoft

Sessions & talks

Showing 1–25 of 407 · Newest first

The Data Engineer's Guide to Microsoft Fabric

Modern data engineering is evolving, and with Microsoft Fabric, the entire data platform experience is being redefined. This essential book offers a fresh, hands-on approach to navigating this shift. Rather than being an introduction to features, this guide explains how Fabric's key components (Lakehouse, Warehouse, and Real-Time Intelligence) work under the hood and how to put them to use in realistic workflows. Written by Christian Henrik Reich, a data engineering expert with experience that extends from Databricks to Fabric, this book is a blend of foundational theory and practical implementation of lakehouse solutions in Fabric. You'll explore how engines like Apache Spark and Fabric Warehouse collaborate with Fabric's Real-Time Intelligence solution in an integrated platform, and how to build ETL/ELT pipelines that deliver on speed, accuracy, and scale. Ideal for both new and practicing data engineers, this is your entry point into the fabric of the modern data platform.

Acquire a working knowledge of lakehouses, warehouses, and streaming in Fabric
Build resilient data pipelines across real-time and batch workloads
Apply Python, Spark SQL, T-SQL, and KQL within a unified platform
Gain insight into architectural decisions that scale with data needs
Learn actionable best practices for engineering clean, efficient, governed solutions

Just Use Postgres!

You probably don’t need a collection of specialty databases. Just use Postgres instead! Written for application developers and database pros, Just Use Postgres! shows you how to get the most out of the powerful Postgres database.

In Just Use Postgres! you’ll learn how to:

Use Postgres as an RDBMS for transactional workloads
Develop generative AI, geospatial, and time-series applications
Take advantage of modern SQL including window functions and CTEs
Perform full-text search and process JSON documents
Use Postgres as a message queue
Optimize performance with various index types including B-trees, GIN, GiST, HNSW, and more

Over the decades, PostgreSQL, aka Postgres, has grown into the most powerful general-purpose database and has become the de facto standard for developers worldwide. Just Use Postgres! takes a modern look at Postgres, exploring the database’s most up-to-date features for AI, time-series, full-text search, geospatial, and other application workloads.

About the Technology

You know that PostgreSQL is a fast, reliable, SQL-compliant RDBMS. You may not know that it’s also great for geospatial systems, time series, full-text search, JSON documents, AI vector embeddings, and many other specialty database functions. For almost any data task you can imagine, you can use Postgres.

About the Book

Just Use Postgres! covers recipes for using Postgres in dozens of applications normally reserved for single-purpose databases. Written for busy application developers, each chapter explores a different use case illuminating the breadth and depth of Postgres’s capabilities. Along the way, you’ll also meet an incredible ecosystem of Postgres extensions like pgvector, PostGIS, pgmq, and TimescaleDB. You’ll be amazed at everything you can accomplish with Postgres!

What's Inside

Generative AI, geospatial, and time-series applications
Modern SQL including window functions and CTEs
Full-text search and JSON
B-trees, GIN, GiST, HNSW, and more

About the Reader

For application developers, software engineers, and architects who know the basics of SQL.

About the Author

Denis Magda is a recognized Postgres expert and software engineer who worked on Java at Sun Microsystems and Oracle before focusing on databases and large-scale distributed systems.

Quotes

I was pleasantly surprised to learn many new things from this book. - From the Afterword by Vlad Mihalcea
An excellent guide covering everything from basics to cutting-edge features. - Dave Cramer, PostgreSQL JDBC Maintainer
Pleasant, easy to read with tonnes of great code. - Mike McQuillan, McQTech Ltd
Well-organized and easy to search. - Edward Pollack, Microsoft Data Platform MVP
The missing guide to understanding and using Postgres. - Mehboob Alam, POSTGRESNX, Inc.

Keep Safe Using Mobile Tech, 2nd Edition

Leverage your smartphone and smartwatch for improved personal safety!

Version 2.0, updated November 12, 2025

The digital and “real” worlds can both be scary places. The smartphone (and often smartwatch) you already carry with you can help reduce risks, deter theft, and mitigate violence. This book teaches you to secure your hardware, block abuse, automatically call emergency services, connect with others to ensure you arrive where and when you intended, detect stalking by compact trackers, and keep your ecosystem accounts from Apple, Google, and Microsoft secure.

You don’t have to be reminded of the virtual and physical risks you face every day. Some of us are targeted more than others. Modern digital features built into mobile operating systems (and some computer operating systems) can help reduce our anxiety by putting more power in our hands to deter, deflect, block, and respond to abuse, threats, and emergencies.

Keep Safe Using Mobile Tech looks at both digital threats, like online abuse and account hijacking, and ones in the physical world, like being stalked through Bluetooth trackers, facing domestic violence, or being in a car crash. The book principally covers the iPhone, Apple Watch, Android devices, and Wear OS watches. It also covers more limited but useful features available on the iPad and on computers running macOS or Windows.

This second edition incorporates the massive number of new safety features Google added since October 2024 to the Android operating system, some particular to Google Pixel phones and smartwatches, and improved blocking, filtering, and screening added to Apple’s iOS 26 and related operating system updates in fall 2025.

This book explores many techniques to help:

Learn how to harden your Apple Account, Google Account, and Microsoft Account beyond just a password or a text-message token.
Discover filtering and blocking tools from Apple and Google that can prevent abusive, fraudulent, and phishing messages and calls from reaching you.
Block unwanted sensitive images from appearing on your iPhone, iPad, Mac, Apple Watch, or Android phone, and help your kids receive advice on how not to send them.
Turn on tracking on your Apple, Google, and Microsoft devices, and use it to recover or erase stolen hardware.
Keep your cloud-archived messages from leaking to attackers.
Screen calls with an automated assistant so that you know who wants you before picking up and without sending to voicemail.
Lock down your devices to keep thieves and other personal invaders from accessing them.
Prepare for emergencies by setting up medical information on your mobile devices.
Let a supported smartphone or smartwatch recognize when you’re in a car crash or have taken a hard fall and call emergency services for you (and text your emergency contacts) if you can’t respond.
Keep track of heart anomalies through smartwatch alerts and tests on your Apple Watch and many Wear OS smartwatches.
With your iPhone or Android phone, tell others where or when you expect to check in with them again, and let your smartphone alert them if you don’t.
Deter stalking from tiny Bluetooth trackers.
Protect your devices and accounts against access from domestic assailants.
Block thieves who steal your phone (potentially threatening you or attacking you in person) from gaining access to the rest of your digital life.

Apache Kafka in Action

Apache Kafka, start to finish. Apache Kafka in Action: From basics to production guides you through the concepts and skills you’ll need to deploy and administer Kafka for data pipelines, event-driven applications, and other systems that process data streams from multiple sources. Authors Anatoly Zelenin and Alexander Kropp have spent years using Kafka in real-world production environments. In this guide, they reveal their hard-won expert insights to help you avoid common Kafka pitfalls and challenges.

Inside Apache Kafka in Action you’ll discover:

Apache Kafka from the ground up
Achieving reliability and performance
Troubleshooting Kafka systems
Operations, governance, and monitoring
Kafka use cases, patterns, and anti-patterns

Clear, concise, and practical, Apache Kafka in Action is written for IT operators, software engineers, and IT architects working with Kafka every day. Chapter by chapter, it guides you through the skills you need to deliver and maintain reliable and fault-tolerant data-driven applications.

About the Technology

Apache Kafka is the gold standard streaming data platform for real-time analytics, event sourcing, and stream processing. Acting as a central hub for distributed data, it enables seamless flow between producers and consumers via a publish-subscribe model. Kafka easily handles millions of events per second, and its rock-solid design ensures high fault tolerance and smooth scalability.

About the Book

Apache Kafka in Action is a practical guide for IT professionals who are integrating Kafka into data-intensive applications and infrastructures. The book covers everything from Kafka fundamentals to advanced operations, with interesting visuals and real-world examples. Readers will learn to set up Kafka clusters, produce and consume messages, handle real-time streaming, and integrate Kafka into enterprise systems. This easy-to-follow book emphasizes building reliable Kafka applications and taking advantage of its distributed architecture for scalability and resilience.

What's Inside

Master Kafka’s distributed streaming capabilities
Implement real-time data solutions
Integrate Kafka into enterprise environments
Build and manage Kafka applications
Achieve fault tolerance and scalability

About the Reader

For IT operators, software architects and developers. No experience with Kafka required.

About the Authors

Anatoly Zelenin is a Kafka expert known for workshops across Europe, especially in banking and manufacturing. Alexander Kropp specializes in Kafka and Kubernetes, contributing to cloud platform design and monitoring.

Quotes

A great introduction. Even experienced users will go back to it again and again. - Jakub Scholz, Red Hat
Approachable, practical, well-illustrated, and easy to follow. A must-read. - Olena Kutsenko, Confluent
A zero to hero journey to understanding and using Kafka! - Anthony Nandaa, Microsoft
Thoughtfully explores a wide range of topics. A wealth of valuable information seamlessly presented and easily accessible. - Olena Babenko, Aiven Oy

Microsoft 365 Access For Dummies, 2nd Edition

Join the millions of people already using Microsoft Access and become a database power-user in no time! In the newly revised edition of Microsoft Access For Dummies, professional database developer and Access extraordinaire Laurie Ulrich-Fuller walks you through the ins and outs of one of the world's most popular database platforms. This is the perfect beginner's guide to Microsoft Access, showing you how to create databases, extract data, create reports, and more. The author demonstrates a ton of tips, tricks, and best practices you can use immediately to create, maintain, and improve your databases.

You'll also find:

Updates outlining Edge browser controls in forms
Step-by-step guides explaining how to import, export, and edit data
Easy-to-follow query-writing tutorials to help you find the exact data you're looking for when you need it

Whether you're a database novice or a data science whiz, Microsoft Access For Dummies has the info you need to supercharge your database skills. It's the perfect how-to guide to get you up to speed on everything you need to know to get started with Microsoft's world-famous database app.

Snowflake Recipes: A Problem-Solution Approach to Implementing Modern Data Pipelines

Explore Snowflake’s core concepts and the unique features that differentiate it from industry competitors such as Azure Synapse and Google BigQuery. This book provides recipes for architecting and developing modern data pipelines on the Snowflake data platform by employing progressive techniques, agile practices, and repeatable strategies. You’ll walk through step-by-step instructions on ready-to-use recipes covering a wide range of the latest development topics. Then build scalable development pipelines and solve specific scenarios common to all modern data platforms, such as data masking, object tagging, data monetization, and security best practices. Throughout the book you’ll work with code samples for Amazon Web Services, Microsoft Azure, and Google Cloud Platform. There’s also a chapter devoted to solving machine learning problems with Snowflake.

Authors Dillon Dayton and John Eipe are both Snowflake SnowPro Core certified, specializing in data and digital services, and understand the challenges of finding the right solution to complex problems. The recipes in this book are based on real-world use cases and examples designed to help you provide quality, performant, and secured data to solve business initiatives.

What You’ll Learn

Handle structured and unstructured data in Snowflake.
Apply best practices and different options for data transformation.
Understand data application development.
Implement data sharing, data governance and security.

Who This Book Is For

Data engineers, scientists and analysts moving into Snowflake, looking to build data apps. This book expects basic knowledge of cloud platforms (AWS, Azure, or GCP), SQL, and Python.

Snowflake Data Engineering

A practical introduction to data engineering on the powerful Snowflake cloud data platform. Data engineers create the pipelines that ingest raw data, transform it, and funnel it to the analysts and professionals who need it. The Snowflake cloud data platform provides a suite of productivity-focused tools and features that simplify building and maintaining data pipelines. In Snowflake Data Engineering, Snowflake Data Superhero Maja Ferle shows you how to get started.

In Snowflake Data Engineering you will learn how to:

Ingest data into Snowflake from both cloud and local file systems
Transform data using functions, stored procedures, and SQL
Orchestrate data pipelines with streams and tasks, and monitor their execution
Use Snowpark to run Python code in your pipelines
Deploy Snowflake objects and code using continuous integration principles
Optimize performance and costs when ingesting data into Snowflake

Snowflake Data Engineering reveals how Snowflake makes it easy to work with unstructured data, set up continuous ingestion with Snowpipe, and keep your data safe and secure with best-in-class data governance features. Along the way, you’ll practice the most important data engineering tasks as you work through relevant hands-on examples. Throughout, author Maja Ferle shares design tips drawn from her years of experience to ensure your pipeline follows the best practices of software engineering, security, and data governance.

About the Technology

Pipelines that ingest and transform raw data are the lifeblood of business analytics, and data engineers rely on Snowflake to help them deliver those pipelines efficiently. Snowflake is a full-service cloud-based platform that handles everything from near-infinite storage and fast elastic compute services to inbuilt AI/ML capabilities like vector search, text-to-SQL, code generation, and more. This book gives you what you need to create effective data pipelines on the Snowflake platform.

About the Book

Snowflake Data Engineering guides you skill-by-skill through accomplishing on-the-job data engineering tasks using Snowflake. You’ll start by building your first simple pipeline and then expand it by adding increasingly powerful features, including data governance and security, adding CI/CD into your pipelines, and even augmenting data with generative AI. You’ll be amazed how far you can go in just a few short chapters!

What's Inside

Ingest data from the cloud, APIs, or Snowflake Marketplace
Orchestrate data pipelines with streams and tasks
Optimize performance and cost

About the Reader

For software developers and data analysts. Readers should know the basics of SQL and the Cloud.

About the Author

Maja Ferle is a Snowflake Subject Matter Expert and a Snowflake Data Superhero who holds the SnowPro Advanced Data Engineer and the SnowPro Advanced Data Analyst certifications.

Quotes

An incredible guide for going from zero to production with Snowflake. - Doyle Turner, Microsoft
A must-have if you’re looking to excel in the field of data engineering. - Isabella Renzetti, Data Analytics Consultant & Trainer
Masterful! Unlocks the true potential of Snowflake for modern data engineers. - Shankar Narayanan, Microsoft
Valuable insights will enhance your data engineering skills and lead to cost-effective solutions. A must read! - Frédéric L’Anglais, Maxa
Comprehensive, up-to-date and packed with real-life code examples. - Albert Nogués, Danone

Azure SQL Revealed: The Next-Generation Cloud Database with AI and Microsoft Fabric

Access detailed content and examples on Azure SQL, a set of cloud services that allows for SQL Server to be deployed in the cloud. This book teaches the fundamentals of deployment, configuration, security, performance, and availability of Azure SQL from the perspective of these same tasks and capabilities in SQL Server. This distinct approach makes this book an ideal learning platform for readers familiar with SQL Server on-premises who want to migrate their skills toward providing cloud solutions to an enterprise market that is increasingly cloud-focused.

If you know SQL Server, you will love this book. You will be able to take your existing knowledge of SQL Server and translate that knowledge into the world of cloud services from the Microsoft Azure platform, and in particular into Azure SQL. This book provides information never seen before about the history and architecture of Azure SQL. Author Bob Ward is a leading expert with access to and support from the Microsoft engineering team that built Azure SQL and related database cloud services. He presents powerful, behind-the-scenes insights into the workings of one of the most popular database cloud services in the industry. This book also brings you the latest innovations for Azure SQL including Azure Arc, Hyperscale, generative AI applications, Microsoft Copilots, and integration with Microsoft Fabric.

What You Will Learn

Know the history of Azure SQL
Deploy, configure, and connect to Azure SQL
Choose the correct way to deploy SQL Server in Azure
Migrate existing SQL Server instances to Azure SQL
Monitor and tune Azure SQL’s performance to meet your needs
Ensure your data and application are highly available
Secure your data from attack and theft
Learn the latest innovations for Azure SQL including Hyperscale
Learn how to harness the power of AI for generative data-driven applications and Microsoft Copilots for assistance
Learn how to integrate Azure SQL with the unified data platform, Microsoft Fabric

Who This Book Is For

This book is designed to teach SQL Server in the Azure cloud to the SQL Server professional. Anyone who operates, manages, or develops applications for SQL Server will benefit from this book. Readers will be able to translate their current knowledge of SQL Server, especially of SQL Server 2019 and 2022, directly to Azure. This book is ideal for database professionals looking to remain relevant as their customer base moves into the cloud.

Data Engineering for Machine Learning Pipelines: From Python Libraries to ML Pipelines and Cloud Platforms

This book covers modern data engineering functions and important Python libraries, to help you develop state-of-the-art ML pipelines and integration code. The book begins by explaining data analytics and transformation, delving into the Pandas library, its capabilities, and nuances. It then explores emerging libraries such as Polars and CuDF, providing insights into GPU-based computing and cutting-edge data manipulation techniques. The text discusses the importance of data validation in engineering processes, introducing tools such as Great Expectations and Pandera to ensure data quality and reliability. The book delves into API design and development, with a specific focus on leveraging the power of FastAPI. It covers authentication, authorization, and real-world applications, enabling you to construct efficient and secure APIs using FastAPI. Also explored is concurrency in data engineering, examining Dask's capabilities from basic setup to crafting advanced machine learning pipelines. The book includes development and delivery of data engineering pipelines using leading cloud platforms such as AWS, Google Cloud, and Microsoft Azure. The concluding chapters concentrate on real-time and streaming data engineering pipelines, emphasizing Apache Kafka and workflow orchestration in data engineering. Workflow tools such as Airflow and Prefect are introduced to seamlessly manage and automate complex data workflows. What sets this book apart is its blend of theoretical knowledge and practical application, a structured path from basic to advanced concepts, and insights into using state-of-the-art tools. With this book, you gain access to cutting-edge techniques and insights that are reshaping the industry. This book is not just an educational tool. It is a career catalyst, and an investment in your future as a data engineering expert, poised to meet the challenges of today's data-driven world. 
What You Will Learn

Elevate your data wrangling jobs by utilizing the power of both CPU and GPU computing, and learn to process data using Pandas 2.0, Polars, and CuDF at unprecedented speeds
Design data validation pipelines, construct efficient data service APIs, develop real-time streaming pipelines and master the art of workflow orchestration to streamline your engineering projects
Leverage concurrent programming to develop machine learning pipelines and get hands-on experience in development and deployment of machine learning pipelines across AWS, GCP, and Azure

Who This Book Is For

Data analysts, data engineers, data scientists, machine learning engineers, and MLOps specialists

Beginning MongoDB Atlas with .NET: Flexible and Scalable Document Data Storage for .NET Developers

This book is a tutorial on MongoDB customized for developers working in Microsoft .NET 6, .NET 7, and beyond. It explains the differences between relational database systems and the document model supported by MongoDB, and shows how to build .NET applications that run against a MongoDB database, especially one in the cloud.

Author Luce Carter kicks things off by teaching you how to determine when to use a document database versus a relational engine. After that, she walks you through building a Microsoft .NET project combining the MongoDB Atlas cloud database-as-a-service solution with a .NET application. In the process, you will learn how to create, read, update, and delete data in MongoDB from any .NET project.

You will come away from this book with a solid understanding of MongoDB’s Developer Data Platform and how to use it from your .NET applications. You’ll be able to connect to MongoDB in the cloud and take advantage of the flexibility and scalability that MongoDB’s document storage model provides, and you’ll understand how to craft your applications to run using document storage and the MongoDB database engine.

What You Will Learn

Know when to use the MongoDB document model
Build .NET applications that connect to MongoDB for data storage
Create MongoDB clusters on the MongoDB Atlas cloud platform
Store data in MongoDB Atlas
Create, Read, Update, and Delete (CRUD) data from .NET Web API projects
Test your CRUD endpoints using RESTful operations
Validate schemas to help protect against breaking changes

Who This Book Is For

.NET developers who are looking for an alternative to relational databases, and those looking for a flexible and scalable document storage solution for use from .NET applications. Additionally, anyone wanting to learn MongoDB in the context of .NET and C# will benefit from this book.

Engineering Data Mesh in Azure Cloud

Discover how to implement a modern data mesh architecture using Microsoft Azure's Cloud Adoption Framework. In this book, you'll learn the strategies to decentralize data while maintaining strong governance, turning your current analytics struggles into scalable and streamlined processes. Unlock the potential of data mesh to achieve advanced and democratized analytics platforms.

What this Book will help me do

Learn to decentralize data governance and integrate data domains effectively.
Master strategies for building and implementing data contracts suited to your organization's needs.
Explore how to design a landing zone for a data mesh using Azure's Cloud Adoption Framework.
Understand how to apply key architecture patterns for analytics, including AI and machine learning.
Gain the knowledge to scale analytics frameworks using modern cloud-based platforms.

Author(s)

Deswandikar is a seasoned data architect with extensive experience in implementing cutting-edge data solutions in the cloud. With a passion for simplifying complex data strategies, Deswandikar brings real-world customer experiences into practical guidance. This book reflects Deswandikar's dedication to helping organizations achieve their data goals with clarity and effectiveness.

Who is it for?

This book is ideal for chief data officers, data architects, and engineers seeking to transform data analytics frameworks to accommodate advanced workloads. Especially useful for professionals aiming to implement cloud-based data mesh solutions, it assumes familiarity with centralized data systems, data lakes, and data integration techniques. If modernizing your organization's data strategy appeals to you, this book is for you.

Azure Data Factory by Example: Practical Implementation for Data Engineers

Data engineers who need to hit the ground running will use this book to build skills in Azure Data Factory v2 (ADF). The tutorial-first approach to ADF taken in this book gets you working from the first chapter, explaining key ideas naturally as you encounter them. From creating your first data factory to building complex, metadata-driven nested pipelines, the book guides you through essential concepts in Microsoft’s cloud-based ETL/ELT platform. It introduces components indispensable for the movement and transformation of data in the cloud. Then it demonstrates the tools necessary to orchestrate, monitor, and manage those components.

This edition, updated for 2024, includes the latest developments to the Azure Data Factory service:

Enhancements to existing pipeline activities such as Execute Pipeline, along with the introduction of new activities such as Script, and activities designed specifically to interact with Azure Synapse Analytics.
Improvements to flow control provided by activity deactivation and the Fail activity.
The introduction of reusable data flow components such as user-defined functions and flowlets.
Extensions to integration runtime capabilities including Managed VNet support.
The ability to trigger pipelines in response to custom events.
Tools for implementing boilerplate processes such as change data capture and metadata-driven data copying.

What You Will Learn

Create pipelines, activities, datasets, and linked services
Build reusable components using variables, parameters, and expressions
Move data into and around Azure services automatically
Transform data natively using ADF data flows and Power Query data wrangling
Master flow-of-control and triggers for tightly orchestrated pipeline execution
Publish and monitor pipelines easily and with confidence

Who This Book Is For

Data engineers and ETL developers taking their first steps in Azure Data Factory, SQL Server Integration Services users making the transition toward doing ETL in Microsoft’s Azure cloud, and SQL Server database administrators involved in data warehousing and ETL operations

Learn T-SQL Querying - Second Edition

Troubleshoot query performance issues, identify anti-patterns in your code, and write efficient T-SQL queries with this guide for T-SQL developers

Key Features

A definitive guide to mastering the techniques of writing efficient T-SQL code
Learn query optimization fundamentals, query analysis, and how query structure impacts performance
Discover insightful solutions to detect, analyze, and tune query performance issues
Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Data professionals seeking to excel in Transact-SQL for Microsoft SQL Server and Azure SQL Database often lack comprehensive resources. Learn T-SQL Querying, Second Edition focuses on indexing queries and crafting elegant T-SQL code, enabling data professionals to gain mastery in modern SQL Server versions (2022) and Azure SQL Database. The book covers new topics like logical statement processing flow, data access using indexes, and best practices for tuning T-SQL queries. Starting with query processing fundamentals, the book lays a foundation for writing performant T-SQL queries. You’ll explore the mechanics of the Query Optimizer and Query Execution Plans, learning to analyze execution plans for insights into current performance and scalability. Using dynamic management views (DMVs) and dynamic management functions (DMFs), you’ll build diagnostic queries. The book covers indexing and delves into SQL Server’s built-in tools to expedite resolution of T-SQL query performance and scalability issues. Hands-on examples will guide you to avoid UDF pitfalls and understand features like predicate SARGability, Query Store, and Query Tuning Assistant. By the end of this book, you’ll have developed the ability to identify query performance bottlenecks, recognize anti-patterns, and avoid pitfalls.

What you will learn

Identify opportunities to write well-formed T-SQL statements
Familiarize yourself with the Cardinality Estimator for query optimization
Create efficient indexes for your existing workloads
Implement best practices for T-SQL querying
Explore Query Execution Dynamic Management Views
Utilize the latest performance optimization features in SQL Server 2017, 2019, and 2022
Safeguard query performance during upgrades to newer versions of SQL Server

Who this book is for

This book is for database administrators, database developers, data analysts, data scientists and T-SQL practitioners who want to master the art of writing efficient T-SQL code and troubleshooting query performance issues through practical examples. A basic understanding of T-SQL syntax, writing queries in SQL Server, and using the SQL Server Management Studio tool will be helpful to get started.

Azure Data Factory Cookbook - Second Edition

This comprehensive guide to Azure Data Factory shows you how to create robust data pipelines and workflows to handle both cloud and on-premises data solutions. Through practical recipes, you will learn to build, manage, and optimize ETL, hybrid ETL, and ELT processes. The book offers detailed explanations to help you integrate technologies like Azure Synapse, Data Lake, and Databricks into your projects.

What this Book will help me do

Master building and managing data pipelines using Azure Data Factory's latest versions and features.
Leverage Azure Synapse and Azure Data Lake for streamlined data integration and analytics workflows.
Enhance your ETL/ELT solutions with Microsoft Fabric, Databricks, and Delta tables.
Employ debugging tools and workflows in Azure Data Factory to identify and solve data processing issues efficiently.
Implement industry-grade best practices for reliable and efficient data orchestration and integration pipelines.

Author(s)

Dmitry Foshin, Tonya Chernyshova, Dmitry Anoshin, and Xenia Ireton collectively bring years of expertise in data engineering and cloud-based solutions. They are recognized professionals in the Azure ecosystem, dedicated to sharing their knowledge through detailed and actionable content. Their collaborative approach ensures that this book provides practical insights for technical audiences.

Who is it for?

This book is ideal for data engineers, ETL developers, and professional architects who work with cloud and hybrid environments. If you're looking to upskill in Azure Data Factory or expand your knowledge into related technologies like Synapse Analytics or Databricks, this is for you. Readers should have a foundational understanding of data warehousing concepts to fully benefit from the material.

Deciphering Data Architectures

Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of these architectures to help data professionals understand the pros and cons of each. James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You'll learn what data lakehouses can help you achieve, as well as how to distinguish data mesh hype from reality. Best of all, you'll be able to determine the most appropriate data architecture for your needs. With this book, you'll: Gain a working understanding of several data architectures Learn the strengths and weaknesses of each approach Distinguish data architecture theory from reality Pick the best architecture for your use case Understand the differences between data warehouses and data lakes Learn common data architecture concepts to help you build better solutions Explore the historical evolution and characteristics of data architectures Learn essentials of running an architecture design session, team organization, and project success factors Free from product discussions, this book will serve as a timeless resource for years to come.

MCA Microsoft Certified Associate Azure Data Engineer Study Guide

Prepare for the Azure Data Engineering certification—and an exciting new career in analytics—with this must-have study aid In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, practical guide to preparing for the challenging Azure Data Engineer certification and for a new career in an exciting and growing field of tech. In the book, you’ll explore all the objectives covered on the DP-203 exam while learning the job roles and responsibilities of a newly minted Azure data engineer. From integrating, transforming, and consolidating data from various structured and unstructured data systems into a structure that is suitable for building analytics solutions, you’ll get up to speed quickly and efficiently with Sybex’s easy-to-use study aids and tools. This Study Guide also offers: Career-ready advice for anyone hoping to ace their first data engineering job interview and excel in their first day in the field Indispensable tips and tricks to familiarize yourself with the DP-203 exam structure and help reduce test anxiety Complimentary access to Sybex’s expansive online study tools, accessible across multiple devices, and offering access to hundreds of bonus practice questions, electronic flashcards, and a searchable, digital glossary of key terms A one-of-a-kind study aid designed to help you get straight to the crucial material you need to succeed on the exam and on the job, the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203 belongs on the bookshelves of anyone hoping to increase their data analytics skills, advance their data engineering career with an in-demand certification, or hoping to make a career change into a popular new area of tech.

Azure SQL Hyperscale Revealed: High-performance Scalable Solutions for Critical Data Workloads

Take a deep dive into the Azure SQL Database Hyperscale Service Tier and discover a new form of cloud architecture from Microsoft that supports massive databases. The new horizontally scalable architecture, formerly code-named Socrates, allows you to decouple compute nodes from storage layers. This radically different approach dramatically increases the scalability of the service. This book shows you how to leverage Hyperscale to provide next-level scalability, high throughput, and fast performance from large databases in your environment. The book begins by showing how Hyperscale helps you eliminate many of the problems of traditional high-availability and disaster recovery architecture. You’ll learn how Hyperscale overcomes storage capacity limitations and issues with scale-up times and costs. With Hyperscale, your costs do not increase linearly with database size and you can manage more data than ever at a lower cost. The book teaches you how to deploy, configure, and monitor an Azure SQL Hyperscale database in a production environment. The book also covers migrating your current workloads from traditional architecture to Azure SQL Hyperscale. What You Will Learn Understand the advantages of Hyperscale over traditional architecture Deploy a Hyperscale database on the Azure cloud (interactively and with code) Configure the advanced features of the Hyperscale database tier Monitor and scale database performance to suit your needs Back up and restore your Azure SQL Hyperscale databases Implement disaster recovery and failover capability Compare performance of Hyperscale vs traditional architecture Migrate existing databases to the Hyperscale service tier Who This Book Is For SQL architects, data engineers, and DBAs who want the most efficient and cost-effective cloud technologies to run their critical data workloads, and those seeking rapid scalability and high performance and throughput while utilizing large databases

Cyber Resiliency with Splunk Enterprise and IBM FlashSystem Storage Safeguarded Copy with IBM Copy Services Manager

The focus of this document is to highlight early threat detection by using Splunk Enterprise and proactively start a cyber resilience workflow in response to a cyberattack or malicious user action. The workflow uses IBM® Copy Services Manager (CSM) as orchestration software to invoke the IBM FlashSystem® storage Safeguarded Copy function, which creates an immutable copy of the data in an air-gapped form on the same IBM FlashSystem Storage for isolation and eventual quick recovery. This document explains the steps that are required to enable and forward IBM FlashSystem audit logs and set a Splunk forwarder configuration to forward local event logs to Splunk Enterprise. This document also describes how to create various alerts in Splunk Enterprise to determine a threat, and configure and invoke an appropriate response to the detected threat in Splunk Enterprise. This document explains the lab setup configuration steps that are involved in configuring various components like Splunk Enterprise, Splunk Enterprise config files for custom apps, IBM CSM, and IBM FlashSystem Storage. The last steps in the lab setup section demonstrate the automated Safeguarded Copy creation and validation steps. This document also describes brief steps for configuring various components and integrating them. This document demonstrates a use case for protecting a Microsoft SQL database (DB) volume that is created on IBM FlashSystem Storage. When a threat is detected on the Microsoft SQL DB volume, Safeguarded Copy starts on an IBM FlashSystem Storage volume. The Safeguarded Copy creates an immutable copy of the data, and the same data volume can be recovered or restored by using IBM CSM. This publication does not describe the installation procedures for Splunk Enterprise, Splunk Forwarder for IBM CSM, the Microsoft SQL Server, or the IBM FlashSystem Storage setup.
It is assumed that the reader of the book has a basic understanding of system, Windows, and DB administration; storage administration; and has access to the required software and documentation that is used in this document.

SQL Server 2022 Revealed: A Hybrid Data Platform Powered by Security, Performance, and Availability

Know how to use the new capabilities and cloud integrations in SQL Server 2022. This book covers the many innovative integrations with the Azure Cloud that make SQL Server 2022 the most cloud-connected edition ever. The book covers cutting-edge features such as the blockchain-based Ledger for creating a tamper-evident record of changes to data over time that you can rely on to be correct and reliable. You'll learn about built-in Query Intelligence capabilities to help you to upgrade with confidence that your applications will perform at least as fast after the upgrade as before. In fact, you'll probably see an increase in performance from the upgrade, with no code changes needed. Also covered are innovations such as contained availability groups and data virtualization with S3 object storage. New cloud integrations covered in this book include Microsoft Azure Purview and the use of Azure SQL for high availability and disaster recovery. The book covers Azure Synapse Link with its built-in capabilities to take changes and put them into Synapse automatically. Anyone building their career around SQL Server will want this book for the valuable information it provides on building SQL skills from edge to the cloud.
What You Will Learn Know how to use all of the new capabilities and cloud integrations in SQL Server 2022 Connect to Azure for disaster recovery, near real-time analytics, and security Leverage the Ledger to create a tamper-evident record of data changes over time Upgrade from prior releases and achieve faster and more consistent performance with no code changes Access data and storage in different and new formats, such as Parquet and S3, without moving the data and using your existing T-SQL skills Explore new application scenarios using innovations with T-SQL in areas such as JSON and time series Who This Book Is For SQL Server professionals who want to upgrade their skills to the latest edition of SQL Server; those wishing to take advantage of new integrations with Microsoft Azure Purview (governance), Azure Synapse (analytics), and Azure SQL (HA and DR); and those in need of the increased performance and security offered by Query Intelligence and the new Ledger

Azure Data Engineering Cookbook - Second Edition

Azure Data Engineering Cookbook is your ultimate guide to mastering data engineering on Microsoft's Azure platform. Through an engaging collection of recipes, this book breaks down procedures to build sophisticated data pipelines, leveraging tools like Azure Data Factory, Data Lake, Databricks, and Synapse Analytics. What this Book will help me do Efficiently process large datasets using Azure Synapse Analytics and Azure Databricks pipelines. Transform and shape data within systems by leveraging Azure Synapse data flows. Implement and manage relational databases in Azure with performance tuning and administration. Configure data pipeline solutions integrated with Power BI for insightful reporting. Monitor, optimize, and ensure lineage tracking for your data systems efficiently with Purview and Log Analytics. Author(s) Nagaraj Venkatesan is an experienced cloud architect specializing in Microsoft Azure, with years of hands-on data engineering expertise. Ahmad Osama is a seasoned data professional. The authors share an emphasis on practical learning that translates into actionable skills. Who is it for? This book is essential for data engineers seeking expertise in Azure's rich engineering capabilities. It's tailored for professionals with a foundational knowledge of cloud services, looking to achieve advanced proficiency in Azure data engineering pipelines.

Practical Database Auditing for Microsoft SQL Server and Azure SQL: Troubleshooting, Regulatory Compliance, and Governance

Know how to track changes and key events in your SQL Server databases in support of application troubleshooting, regulatory compliance, and governance. This book shows how to use key features in SQL Server, such as SQL Server Audit and Extended Events, to track schema changes, permission changes, and changes to your data. You’ll even learn how to track queries run against specific tables in a database. Not all changes and events can be captured and tracked using SQL Server Audit and Extended Events, and the book goes beyond those features to also show what can be captured using common criteria compliance, change data capture, temporal tables, or querying the SQL Server log. You will learn how to audit just what you need to audit, and how to audit pretty much anything that happens on a SQL Server instance. This book will also help you set up cloud auditing with an emphasis on Azure SQL Database, Azure SQL Managed Instance, and AWS RDS SQL Server. You don’t need expensive, third-party auditing tools to make auditing work for you, and to demonstrate and provide value back to your business. This book will help you set up an auditing solution that works for you and your needs. It shows how to collect the audit data that you need, centralize that data for easy reporting, and generate audit reports using built-in SQL Server functionality for use by your own team, developers, and organization’s auditors.
What You Will Learn Understand why auditing is important for troubleshooting, compliance, and governance Track changes and key events using SQL Server Audit and Extended Events Track SQL Server configuration changes for governance and troubleshooting Utilize change data capture and temporal tables to track data changes in SQL Server tables Centralize auditing data from all your databases for easy querying and reporting Configure auditing on Azure SQL, Azure SQL Managed Instance, and AWS RDS SQL Server Who This Book Is For Database administrators who need to know what’s changing on their database servers, and those who are making the changes; database-savvy DevOps engineers and developers who are charged with troubleshooting processes and applications; developers and administrators who are responsible for generating reports in support of regulatory compliance reporting and auditing

Pro Database Migration to Azure: Data Modernization for the Enterprise

Migrate your existing, on-premises applications into the Microsoft Azure cloud platform. This book covers the best practices to plan, implement, and operationalize the migration of a database application from your organization’s data center to Microsoft’s Azure cloud platform. Data modernization and migration is a technologically complex endeavor that can also be taxing from a leadership and operational standpoint. This book covers not only the technology, but also the most important aspects of organization culture, communication, and politics that so frequently derail such projects. You will learn the most important steps to ensuring a successful migration and see battle-tested wisdom from industry veterans. From executive sponsorship, to executing the migration, to the important steps following migration, you will learn how to effectively conduct future migrations and ensure that your team and your database application deliver on the expected business value of the project. This book is unlike any other currently in the market. It takes you through the most critical business and technical considerations and workflows for moving your data and databases into the cloud, with special attention paid to those who are deploying to the Microsoft Data Platform in Azure, especially SQL Server. Although this book focuses on migrating on-premises SQL Server enterprises to hybrid or fully cloud-based Azure SQL Database and Azure SQL Managed Instances, it also covers topics involving migrating non-SQL Server database platforms such as Oracle, MySQL, and PostgreSQL applications to Microsoft Azure.
What You Will Learn Plan a database migration that ensures smooth project progress, optimal performance, low operating cost, and minimal downtime Properly analyze and manage non-technical considerations, such as legal compliance, privacy, and team execution Perform a thorough architectural analysis to select the best Azure services, performance tiers, and cost-containment features Avoid pitfalls and common reasons for failure relating to corporate culture, intra-office politics, and poor communications Secure the proper executive champions who can execute the business planning needed for success Apply proven criteria to determine your future-state architecture and your migration method Execute your migration using a process proven by the authors over years of successful projects Who This Book Is For IT leadership, strategic IT decision makers, project owners and managers, and enterprise and application architects. For anyone looking toward cloud migration projects as the next stage of growth in their careers. Also useful for enterprise DBAs and consultants who might be involved in such projects. Readers should have experience and be competent in designing, coding, implementing, and supporting database applications in an on-premises environment.

Learn dbatools in a Month of Lunches

If you work with SQL Server, dbatools is a lifesaver. This book will show you how to use this free and open source PowerShell module to automate just about every SQL Server task you can imagine—all in just one month! In Learn dbatools in a Month of Lunches you will learn how to: Perform instance-to-instance and customized migrations Automate security audits, tempdb configuration, alerting, and reporting Schedule and monitor PowerShell tasks in SQL Server Agent Bulk-import any type of data into SQL Server Install dbatools in secure environments Written by a group of expert authors including dbatools creator Chrissy LeMaire, Learn dbatools in a Month of Lunches teaches you techniques that will make you more effective—and efficient—than you ever thought possible. In twenty-eight lunchbreak lessons, you’ll learn the most important use cases of dbatools and the favorite functions of its core developers. Stabilize and standardize your SQL Server environment, and simplify your tasks by building automation, alerting, and reporting with this powerful tool. About the Technology For SQL Server DBAs, automation is the key to efficiency. Using the open-source dbatools PowerShell module, you can easily execute tasks on thousands of database servers at once—all from the command line. dbatools gives you over 500 pre-built commands, with countless new options for managing SQL Server at scale. There’s nothing else like it. About the Book Learn dbatools in a Month of Lunches teaches you how to automate SQL Server using the dbatools PowerShell module. Each 30-minute lesson introduces a new automation that will make your daily duties easier. Following the expert advice of dbatools creator Chrissy LeMaire and other top community contributors, you’ll learn to script everything from backups to disaster recovery.
What's Inside Performing instance-to-instance and customized migrations Automating security audits, best practices, and standardized configurations Administering SQL Server Agent including running PowerShell scripts effectively Bulk-importing many types of data into SQL Server Executing advanced tasks and increasing efficiency for everyday administration About the Reader For DBAs, accidental DBAs, and systems engineers who manage SQL Server. About the Authors Chrissy LeMaire is a GitHub Star and the creator of dbatools. Rob Sewell is a data engineer and a passionate automator. Jess Pomfret and Cláudio Silva are data platform architects. All are Microsoft MVPs. Quotes All SQL Server professionals should learn dbatools. With its combination of knowledge transfer, anecdotes, and hands-on labs, this book is the perfect way. - From the Foreword by Anna Hoffman, Databases Product Management, Microsoft Excellent guide for dbatools with lots of practical tips! Required reading for anyone interested in dbatools. - Ruben Vandeginste, PeopleWare A must-have for any SQL server developer. - Raushan Kumar Jha, Microsoft If you want to automate all vital aspects of SQL Server, wait no more! Learn dbatools in a month, with guidance from the best minds in the business. - Ranjit Sahai, RAM Consulting

The Azure Data Lakehouse Toolkit: Building and Scaling Data Lakehouses on Azure with Delta Lake, Apache Spark, Databricks, Synapse Analytics, and Snowflake

Design and implement a modern data lakehouse on the Azure Data Platform using Delta Lake, Apache Spark, Azure Databricks, Azure Synapse Analytics, and Snowflake. This book teaches you the intricate details of the Data Lakehouse Paradigm and how to efficiently design a cloud-based data lakehouse using highly performant and cutting-edge Apache Spark capabilities using Azure Databricks, Azure Synapse Analytics, and Snowflake. You will learn to write efficient PySpark code for batch and streaming ELT jobs on Azure. And you will follow along with practical, scenario-based examples showing how to apply the capabilities of Delta Lake and Apache Spark to optimize performance, and secure, share, and manage a high volume, high velocity, and high variety of data in your lakehouse with ease. The patterns of success that you acquire from reading this book will help you hone your skills to build high-performing and scalable ACID-compliant lakehouses using flexible and cost-efficient decoupled storage and compute capabilities. Extensive coverage of Delta Lake ensures that you are aware of and can benefit from all that this new, open source storage layer can offer. In addition to the deep examples on Databricks in the book, there is coverage of alternative platforms such as Synapse Analytics and Snowflake so that you can make the right platform choice for your needs. After reading this book, you will be able to implement Delta Lake capabilities, including Schema Evolution, Change Feed, Live Tables, Sharing, and Clones to enable better business intelligence and advanced analytics on your data within the Azure Data Platform. 
What You Will Learn Implement the Data Lakehouse Paradigm on Microsoft’s Azure cloud platform Benefit from the new Delta Lake open-source storage layer for data lakehouses Take advantage of schema evolution, change feeds, live tables, and more Write functional PySpark code for data lakehouse ELT jobs Optimize Apache Spark performance through partitioning, indexing, and other tuning options Choose between alternatives such as Databricks, Synapse Analytics, and Snowflake Who This Book Is For Data, analytics, and AI professionals at all levels, including data architect and data engineer practitioners. Also for data professionals seeking patterns of success by which to remain relevant as they learn to build scalable data lakehouses for their organizations and customers who are migrating into the modern Azure Data Platform.

SAP S/4HANA Systems in Hyperscaler Clouds: Deploying SAP S/4HANA in AWS, Google Cloud, and Azure

This book helps SAP architects and SAP Basis administrators deploy and operate SAP S/4HANA systems on the most common public cloud platforms. Market-leading cloud offerings are covered, including Amazon Web Services, Microsoft Azure, and Google Cloud. You will gain an end-to-end understanding of the initial implementation of SAP S/4HANA systems on those platforms. You will learn how to move away from the big monolithic SAP ERP systems and arrive at an environment with a central SAP S/4HANA system as the digital core surrounded by cloud-native services. The book begins by introducing the core concepts of Hyperscaler cloud platforms that are relevant to SAP. You will learn about the architecture of SAP S/4HANA systems on public cloud platforms, with specific content provided for each of the major platforms. The book simplifies the deployment of SAP S/4HANA systems in public clouds by providing step-by-step instructions and helping you deal with the complexity of such a deployment. Content in the book is based on best practices, industry lessons learned, and architectural blueprints, helping you develop deep insights into the operations of SAP S/4HANA systems on public cloud platforms. Reading this book enables you to build and operate your own SAP S/4HANA system in the public cloud with a minimum of effort.
What You Will Learn Choose the right Hyperscaler platform for your future SAP S/4HANA workloads Start deploying your first SAP S/4HANA system in the public cloud Avoid typical pitfalls during your implementation Apply and leverage cloud-native services for your SAP S/4HANA system Save costs by choosing the right architecture and build a robust architecture for your most critical SAP systems Meet your business’ criteria for availability and performance by having the right sizing in place Identify further use cases when operating SAP S/4HANA in the public cloud Who This Book Is For SAP architects looking for an answer on how to move SAP S/4HANA systems from on-premises into the cloud; those planning to deploy to one of the three major platforms from Amazon Web Services, Microsoft Azure, and Google Cloud Platform; and SAP Basis administrators seeking a detailed and realistic description of how to get started on a migration to the cloud and how to drive that cloud implementation to completion