talk-data.com talk-data.com

Topic

DWH

Data Warehouse

analytics business_intelligence data_storage

143

tagged

Activity Trend

35 peak/qtr
2020-Q1 2026-Q1

Activities

Showing filtered results

Filtering by: O'Reilly Data Engineering Books ×
Google Cloud Certified Professional Data Engineer Certification Guide

A guide to pass the GCP Professional Data Engineer exam on your first attempt and upgrade your data engineering skills on GCP. Key Features Fully understand the certification exam content and exam objectives Consolidate your knowledge of all essential exam topics and key concepts Get realistic experience of answering exam-style questions Develop practical skills for everyday use Purchase of this book unlocks access to web-based exam prep resources including mock exams, flashcards, exam tips Book Description The GCP Professional Data Engineer certification validates the fundamental knowledge required to perform data engineering tasks and use GCP services to enhance data engineering processes and further your career in the data engineering/architecting field. This book is a best-in-class study guide that fully covers the GCP Professional Data Engineer exam objectives and helps you pass the exam first time. Complete with clear explanations, chapter review questions, realistic mock exams, and pragmatic solutions, this guide will help you master the core exam concepts and build the understanding you need to go into the exam with the skills and confidence to get the best result you can. With the help of relevant examples, you'll learn fundamental data engineering concepts such as data warehousing and data security. As you progress, you'll delve into the important domains of the exam, including data pipelining, data migration, and data processing. Unlike other study guides, this book contains logical reasoning behind the choice of correct answers based in scenarios and provide you with excellent tips regarding the optimal use of each service, and gives you everything you need to pass the exam and enhance your prospects in the data engineering field. What you will learn Create data solutions and pipelines in GCP Analyze and transform data into useful information Apply data engineering concepts to real scenarios Create secure, cost-effective, valuable GCP workloads Work in the GCP environment with industry best practices Who this book is for This book is for data engineers who want a reliable source for the key concepts and terms present in the most prestigious and highly-sought-after cloud-based data engineering certification. This book will help you improve your data engineering in GCP skills to give you a better chance at earning the GCP Professional Data Engineer Certification. You will already be familiar with the Google Cloud Platform, having either explored it (professionally or personally) for at least a year. You should also have some familiarity with basic data concepts (such as types of data and basic SQL knowledge).

Unlocking dbt: Design and Deploy Transformations in Your Cloud Data Warehouse

Master the art of data transformation with the second edition of this trusted guide to dbt. Building on the foundation of the first edition, this updated volume offers a deeper, more comprehensive exploration of dbt’s capabilities—whether you're new to the tool or looking to sharpen your skills. It dives into the latest features and techniques, equipping you with the tools to create scalable, maintainable, and production-ready data transformation pipelines. Unlocking dbt, Second Edition introduces key advancements, including the semantic layer, which allows you to define and manage metrics at scale, and dbt Mesh, empowering organizations to orchestrate decentralized data workflows with confidence. You’ll also explore more advanced testing capabilities, expanded CI/CD and deployment strategies, and enhancements in documentation—such as the newly introduced dbt Catalog. As in the first edition, you’ll learn how to harness dbt’s power to transform raw data into actionable insights, while incorporating software engineering best practices like code reusability, version control, and automated testing. From configuring projects with the dbt Platform or open source dbt to mastering advanced transformations using SQL and Jinja, this book provides everything you need to tackle real-world challenges effectively. What You Will Learn Understand dbt and its role in the modern data stack Set up projects using both the cloud-hosted dbt Platform and open source project Connect dbt projects to cloud data warehouses Build scalable models in SQL and Python Configure development, testing, and production environments Capture reusable logic with Jinja macros Incorporate version control with your data transformation code Seamlessly connect your projects using dbt Mesh Build and manage a semantic layer using dbt Deploy dbt using CI/CD best practices Who This Book Is For Current and aspiring data professionals, including architects, developers, analysts, engineers, data scientists, and consultants who are beginning the journey of using dbt as part of their data pipeline’s transformation layer. Readers should have a foundational knowledge of writing basic SQL statements, development best practices, and working with data in an analytical context such as a data warehouse.

Jumpstart Snowflake: A Step-by-Step Guide to Modern Cloud Analytics

This book is your guide to the modern market of data analytics platforms and the benefits of using Snowflake, the data warehouse built for the cloud. As organizations increasingly rely on modern cloud data platforms, the core of any analytics framework—the data warehouse—is more important than ever. This updated 2nd edition ensures you are ready to make the most of the industry’s leading data warehouse. This book will onboard you to Snowflake and present best practices for deploying and using the Snowflake data warehouse. The book also covers modern analytics architecture, integration with leading analytics software such as Matillion ETL, Tableau, and Databricks, and migration scenarios for on-premises legacy data warehouses. This new edition includes expanded coverage of SnowPark for developing complex data applications, an introduction to managing large datasets with Apache Iceberg tables, and instructions for creating interactive data applications using Streamlit, ensuring readers are equipped with the latest advancements in Snowflake's capabilities. What You Will Learn Master key functionalities of Snowflake Set up security and access with cluster Bulk load data into Snowflake using the COPY command Migrate from a legacy data warehouse to Snowflake Integrate the Snowflake data platform with modern business intelligence (BI) and data integration tools Manage large datasets with Apache Iceberg Tables Implement continuous data loading with Snowpipe and Dynamic Tables Who This Book Is For Data professionals, business analysts, IT administrators, and existing or potential Snowflake users

Amazon Redshift Cookbook - Second Edition

Amazon Redshift Cookbook provides practical techniques for utilizing AWS's managed data warehousing service effectively. With this book, you'll learn to create scalable and secure data analytics solutions, tackle data integration challenges, and leverage Redshift's advanced features like data sharing and generative AI capabilities. What this Book will help me do Create end-to-end data analytics solutions from ingestion to reporting using Amazon Redshift. Optimize the performance and security of Redshift implementations to meet enterprise standards. Leverage Amazon Redshift for zero-ETL ingestion and advanced concurrency scaling. Integrate Redshift with data lakes for enhanced data processing versatility. Implement generative AI and machine learning solutions directly within Redshift environments. Author(s) Shruti Worlikar, Harshida Patel, and Anusha Challa are seasoned data experts who bring together years of experience with Amazon Web Services and data analytics. Their combined expertise enables them to offer actionable insights, hands-on recipes, and proven strategies for implementing and optimizing Amazon Redshift-based solutions. Who is it for? This book is best suited for data analysts, data engineers, and architects who are keen on mastering modern data warehouse solutions using Redshift. Readers should have some knowledge of data warehousing and familiarity with cloud concepts. Ideal for professionals looking to migrate on-premises systems or build cloud-native analytics pipelines leveraging Redshift.

Accelerating Data Pipeline Development

Today's data engineering teams are overwhelmed—juggling fire drills and endless requests while relying on manual, repetitive processes for building data pipelines. This much-needed tech guide from author Josh Hall introduces a practical approach to streamlining pipeline development, empowering teams to work smarter, not harder. Using Coalesce, a modern development platform, you'll learn to standardize workflows, apply reusable design patterns, and build faster, more efficient pipelines—all without piling on tech debt. Ideal for data engineers, architects, and analysts of all experience levels, the book offers clear explanations of Coalesce's core functionality including configuring environments, defining nodes, and connecting to data warehouses. Packed with workflows and useful takeaways, it's your guide to delivering high-quality, actionable data while reducing pipeline development time. Set up Coalesce and integrate with a data warehouse Use reusable nodes and design patterns for faster development Accelerate pipeline delivery with reduced manual effort Leverage Coalesce Marketplace for advanced functionality

Database Management Systems by Pearson

Express Learning is a series of books designed as quick reference guides to important undergraduate computer courses. The organized and accessible format of these books allows students to learn important concepts in an easy-to-understand, question-and-answer format. These portable learning tools have been designed as one-stop references for students to understand and master the subjects by themselves.

Features –

• Designed as a student-friendly self-learning guide. The book is written in a clear, concise, and lucid manner. • Easy-to-understand question-and-answer format. • Includes previously asked as well as new questions organized in chapters. • All types of questions including MCQs, short and long questions are covered. • Solutions to numerical questions asked at examinations are provided. • All ideas and concepts are presented with clear examples. • Text is well structured and well supported with suitable diagrams. • Inter-chapter dependencies are kept to a minimum

Book Contents –

1: Database System 2: Conceptual Modelling 3: Relational Model 4: Relational Algebra and Calculus 5: Structured Query Language 6: Relational Database Design 7: Data Storage and Indexing 8: Query Processing and Optimization 9: Introduction to Transaction Processing 10: Concurrency Control Techniques 11: Database Recovery System 12: Database Security 13: Database System Architecture 14: Data Warehousing, OLAP, and Data Mining 15: Information Retrieval 16: Miscellaneous Questions

Express Learning - Data Warehousing and Data Mining, 1st Edition by Pearson

Express Learning is a series of books designed as quick reference guides to important undergraduate courses. The organized and accessible format of these books allows students to learn important concepts in an easy-to-understand, question-and-answer format. These portable learning tools have been designed as one-stop references for students to understand and master the subjects by themselves.

Book Contents –

Chapter 1: Introduction to Data Warehouse Chapter 2: Building a Data Warehouse Chapter 3: Data Warehouse: Architecture Chapter 4: OLAP Technology Chapter 5: Introduction to Data Mining Chapter 6: Data Preprocessing Chapter 7: Mining Association Rules Chapter 8: Classification and Prediction Chapter 9: Cluster Analysis Chapter 10: Advanced Techniques of Data Mining and Its Applications Index

Azure Data Factory by Example: Practical Implementation for Data Engineers

Data engineers who need to hit the ground running will use this book to build skills in Azure Data Factory v2 (ADF). The tutorial-first approach to ADF taken in this book gets you working from the first chapter, explaining key ideas naturally as you encounter them. From creating your first data factory to building complex, metadata-driven nested pipelines, the book guides you through essential concepts in Microsoft’s cloud-based ETL/ELT platform. It introduces components indispensable for the movement and transformation of data in the cloud. Then it demonstrates the tools necessary to orchestrate, monitor, and manage those components. This edition, updated for 2024, includes the latest developments to the Azure Data Factory service: Enhancements to existing pipeline activities such as Execute Pipeline, along with the introduction of new activities such as Script, and activities designed specifically to interact with Azure Synapse Analytics. Improvements to flow control provided by activity deactivation and the Fail activity. The introduction of reusable data flow components such as user-defined functions and flowlets. Extensions to integration runtime capabilities including Managed VNet support. The ability to trigger pipelines in response to custom events. Tools for implementing boilerplate processes such as change data capture and metadata-driven data copying. What You Will Learn Create pipelines, activities, datasets, and linked services Build reusable components using variables, parameters, and expressions Move data into and around Azure services automatically Transform data natively using ADF data flows and Power Query data wrangling Master flow-of-control and triggers for tightly orchestrated pipeline execution Publish and monitor pipelines easily and with confidence Who This Book Is For Data engineers and ETL developers taking their first steps in Azure Data Factory, SQL Server Integration Services users making the transition toward doing ETL in Microsoft’s Azure cloud, and SQL Server database administrators involved in data warehousing and ETL operations

Azure Data Factory Cookbook - Second Edition

This comprehensive guide to Azure Data Factory shows you how to create robust data pipelines and workflows to handle both cloud and on-premises data solutions. Through practical recipes, you will learn to build, manage, and optimize ETL, hybrid ETL, and ELT processes. The book offers detailed explanations to help you integrate technologies like Azure Synapse, Data Lake, and Databricks into your projects. What this Book will help me do Master building and managing data pipelines using Azure Data Factory's latest versions and features. Leverage Azure Synapse and Azure Data Lake for streamlined data integration and analytics workflows. Enhance your ETL/ELT solutions with Microsoft Fabric, Databricks, and Delta tables. Employ debugging tools and workflows in Azure Data Factory to identify and solve data processing issues efficiently. Implement industry-grade best practices for reliable and efficient data orchestration and integration pipelines. Author(s) Dmitry Foshin, Tonya Chernyshova, Dmitry Anoshin, and Xenia Ireton collectively bring years of expertise in data engineering and cloud-based solutions. They are recognized professionals in the Azure ecosystem, dedicated to sharing their knowledge through detailed and actionable content. Their collaborative approach ensures that this book provides practical insights for technical audiences. Who is it for? This book is ideal for data engineers, ETL developers, and professional architects who work with cloud and hybrid environments. If you're looking to upskill in Azure Data Factory or expand your knowledge into related technologies like Synapse Analytics or Databricks, this is for you. Readers should have a foundational understanding of data warehousing concepts to fully benefit from the material.

Deciphering Data Architectures

Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of these architectures to help data professionals understand the pros and cons of each. James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You'll learn what data lakehouses can help you achieve, as well as how to distinguish data mesh hype from reality. Best of all, you'll be able to determine the most appropriate data architecture for your needs. With this book, you'll: Gain a working understanding of several data architectures Learn the strengths and weaknesses of each approach Distinguish data architecture theory from reality Pick the best architecture for your use case Understand the differences between data warehouses and data lakes Learn common data architecture concepts to help you build better solutions Explore the historical evolution and characteristics of data architectures Learn essentials of running an architecture design session, team organization, and project success factors Free from product discussions, this book will serve as a timeless resource for years to come.

Architecting a Modern Data Warehouse for Large Enterprises: Build Multi-cloud Modern Distributed Data Warehouses with Azure and AWS

Design and architect new generation cloud-based data warehouses using Azure and AWS. This book provides an in-depth understanding of how to build modern cloud-native data warehouses, as well as their history and evolution. The book starts by covering foundational data warehouse concepts, and introduces modern features such as distributed processing, big data storage, data streaming, and processing data on the cloud. You will gain an understanding of the synergy, relevance, and usage data warehousing standard practices in the modern world of distributed data processing. The authors walk you through the essential concepts of Data Mesh, Data Lake, Lakehouse, and Delta Lake. And they demonstrate the services and offerings available on Azure and AWS that deal with data orchestration, data democratization, data governance, data security, and business intelligence. After completing this book, you will be ready to design and architect enterprise-grade, cloud-based modern data warehouses using industry best practices and guidelines. What You Will Learn Understand the core concepts underlying modern data warehouses Design and build cloud-native data warehousesGain a practical approach to architecting and building data warehouses on Azure and AWS Implement modern data warehousing components such as Data Mesh, Data Lake, Delta Lake, and Lakehouse Process data through pandas and evaluate your model’s performance using metrics such as F1-score, precision, and recall Apply deep learning to supervised, semi-supervised, and unsupervised anomaly detection tasks for tabular datasets and time series applications Who This Book Is For Experienced developers, cloud architects, and technology enthusiasts looking to build cloud-based modern data warehouses using Azure and AWS

Data Exploration and Preparation with BigQuery

In "Data Exploration and Preparation with BigQuery," Michael Kahn provides a hands-on guide to understanding and utilizing Google's powerful data warehouse solution, BigQuery. This comprehensive book equips you with the skills needed to clean, transform, and analyze large datasets for actionable business insights. What this Book will help me do Master the process of exploring and assessing the quality of datasets. Learn SQL for performing efficient and advanced data transformations in BigQuery. Optimize the performance of BigQuery queries for speed and cost-effectiveness. Discover best practices for setting up and managing BigQuery resources. Apply real-world case studies to analyze data and derive meaningful insights. Author(s) Michael Kahn is an experienced data engineer and author specializing in big data solutions and technologies. With years of hands-on experience working with Google Cloud Platform and BigQuery, he has assisted organizations in optimizing their data pipelines for effective decision-making. His accessible writing style ensures complex topics become approachable, enabling readers of various skill levels to succeed. Who is it for? This book is tailored for data analysts, data engineers, and data scientists who want to learn how to effectively use BigQuery for data exploration and preparation. Whether you're new to BigQuery or looking to deepen your expertise in working with large datasets, this book provides clear guidance and practical examples to achieve your goals.

Amazon Redshift: The Definitive Guide

Amazon Redshift powers analytic cloud data warehouses worldwide, from startups to some of the largest enterprise data warehouses available today. This practical guide thoroughly examines this managed service and demonstrates how you can use it to extract value from your data immediately, rather than go through the heavy lifting required to run a typical data warehouse. Analytic specialists Rajesh Francis, Rajiv Gupta, and Milind Oke detail Amazon Redshift's underlying mechanisms and options to help you explore out-of-the box automation. Whether you're a data engineer who wants to learn the art of the possible or a DBA looking to take advantage of machine learning-based auto-tuning, this book helps you get the most value from Amazon Redshift. By understanding Amazon Redshift features, you'll achieve excellent analytic performance at the best price, with the least effort. This book helps you: Build a cloud data strategy around Amazon Redshift as foundational data warehouse Get started with Amazon Redshift with simple-to-use data models and design best practices Understand how and when to use Redshift Serverless and Redshift provisioned clusters Take advantage of auto-tuning options inherent in Amazon Redshift and understand manual tuning options Transform your data platform for predictive analytics using Redshift ML and break silos using data sharing Learn best practices for security, monitoring, resilience, and disaster recovery Leverage Amazon Redshift integration with other AWS services to unlock additional value

Learning and Operating Presto

The Presto community has mushroomed since its origins at Facebook in 2012. But ramping up this open source distributed SQL query engine can be challenging even for the most experienced engineers. With this practical book, data engineers and architects, platform engineers, cloud engineers, and software engineers will learn how to use Presto operations at your organization to derive insights on datasets wherever they reside. Authors Angelica Lo Duca, Tim Meehan, Vivek Bharathan, and Ying Su explain what Presto is, where it came from, and how it differs from other data warehousing solutions. You'll discover why Facebook, Uber, Alibaba Cloud, Hewlett Packard Enterprise, IBM, Intel, and many more use Presto and how you can quickly deploy Presto in production. With this book, you will: Learn how to install and configure Presto Use Presto with business intelligence tools Understand how to connect Presto to a variety of data sources Extend Presto for real-time business insight Learn how to apply best practices and tuning Get troubleshooting tips for logs, error messages, and more Explore Presto's architectural concepts and usage patterns Understand Presto security and administration

Snowflake SnowPro™ Advanced Architect Certification Companion: Hands-on Preparation and Practice

Master the intricacies of Snowflake and prepare for the SnowPro Advanced Architect Certification exam with this comprehensive study companion. This book provides robust and effective study tools to help you prepare for the exam and is also designed for those who are interested in learning the advanced features of Snowflake. The practical examples and in-depth background on theory in this book help you unleash the power of Snowflake in building a high-performance system. The best practices demonstrated in the book help you use Snowflake more powerfully and effectively as a data warehousing and analytics platform. Reading this book and reviewing the concepts will help you gain the knowledge you need to take the exam. The book guides you through a study of the different domains covered on the exam: Accounts and Security, Snowflake Architecture, Data Engineering, and Performance Optimization. You’ll also be well positioned to apply your newly acquired practical skills to real-world Snowflake solutions. You will have a deep understanding of Snowflake to help you take full advantage of Snowflake’s architecture to deliver value analytics insight to your business. What You Will Learn Gain the knowledge you need to prepare for the exam Review in-depth theory on Snowflake to help you build high-performance systems Broaden your skills as a data warehouse designer to cover the Snowflake ecosystem Optimize performance and costs associated with your use of the Snowflake data platform Share data securely both inside your organization and with external partners Apply your practical skills to real-world Snowflake solutions Who This Book Is For Anyone who is planning to take the SnowPro Advanced Architect Certification exam, those who want to move beyond traditional database technologies and build their skills to design and architect solutions using Snowflake services, and veteran database professionals seeking an on-the-job reference to understand one of the newest and fastest-growing technologies in data

Snowflake SnowPro Core Certification Study Guide

Prepare smarter, faster, and better with the premier study guide for Snowflake SnowPro Core certification Snowflake, a cloud-based data warehousing platform, has steadily gained popularity since its 2014 launch. Snowflake offers several certification exams, of which the SnowPro Core certification is the foundational exam. The SnowPro Core Certification validates an individual's grasp of Snowflake as a cloud data warehouse, its architectural fundamentals, and the ability to design, implement, and maintain secure, scalable Snowflake systems. The Snowflake SnowPro Core Certification Study Guide delivers comprehensive coverage of every relevant exam topic on the Snowflake SnowPro Core Certification test. Prepare efficiently and effectively for the exam with online practice tests and flashcards, a digital glossary, and concise and easy-to-follow instruction from the subject-matter experts at Sybex. You'll gain the necessary knowledge to help you succeed in the exam and will be able to apply the acquired practical skills to real-world Snowflake solutions. This Study Guide includes: Comprehensive understanding of Snowflake's unique shared data, multi-cluster architecture Guidance on loading structured and semi-structured data into Snowflake Utilizing data sharing, cloning, and time travel features Managing performance through clustering keys, scaling compute up, down & across Steps to account management and security configuration including RBAC & MFA All the info you need to obtain a highly valued credential for a rapidly growing new database software solution Access to the Sybex online learning center, with chapter review questions, full-length practice exams, hundreds of electronic flashcards, and a glossary of key terms Perfect for anyone considering a new career in cloud-based data warehouse solutions and related fields, Snowflake SnowPro Core Certification Study Guide is also a must-read for veteran database professionals seeking an understanding of one of the newest and fastest-growing niches in data.

Building the Snowflake Data Cloud: Monetizing and Democratizing Your Data

Implement the Snowflake Data Cloud using best practices and reap the benefits of scalability and low-cost from the industry-leading, cloud-based, data warehousing platform. This book provides a detailed how-to explanation, and assumes familiarity with Snowflake core concepts and principles. It is a project-oriented book with a hands-on approach to designing, developing, and implementing your Data Cloud with security at the center. As you work through the examples, you will develop the skill, knowledge, and expertise to expand your capability by incorporating additional Snowflake features, tools, and techniques. Your Snowflake Data Cloud will be fit for purpose, extensible, and at the forefront of both Direct Share, Data Exchange, and Snowflake Marketplace. Building the Snowflake Data Cloud helps you transform your organization into monetizing the value locked up within your data. As the digital economy takes hold, with data volume, velocity, and variety growing at exponential rates, you need tools and techniques to quickly categorize, collate, summarize, and aggregate data. You also need the means to seamlessly distribute to release value. This book shows how Snowflake provides all these things and how to use them to your advantage. The book helps you succeed by delivering faster than you can deliver with legacy products and techniques. You will learn how to leverage what you already know, and what you don’t, all applied in a Snowflake Data Cloud context. After reading this book, you will discover and embrace the future where the Data Cloud is central. You will be able to position your organization to take advantage by identifying, adopting, and preparing your tooling for the coming wave of opportunity around sharing and monetizing valuable, corporate data. What You Will Learn Understand why Data Cloud is important tothe success of your organization Up-skill and adopt Snowflake, leveraging the benefits of cloud platforms Articulate the Snowflake Marketplace and identify opportunities to monetize data Identify tools and techniques to accelerate integration with Data Cloud Manage data consumption by monitoring and controlling access to datasets Develop data load and transform capabilities for use in future projects Who This Book Is For Solution architects seeking implementation patterns to integrate with a Data Cloud; data warehouse developers looking for tips, tools, and techniques to rapidly deliver data pipelines; sales managers who want to monetize their datasets and understand the opportunities that Data Cloud presents; and anyone who wishes to unlock value contained within their data silos

Snowflake: The Definitive Guide

Snowflake's ability to eliminate data silos and run workloads from a single platform creates opportunities to democratize data analytics, allowing users at all levels within an organization to make data-driven decisions. Whether you're an IT professional working in data warehousing or data science, a business analyst or technical manager, or an aspiring data professional wanting to get more hands-on experience with the Snowflake platform, this book is for you. You'll learn how Snowflake users can build modern integrated data applications and develop new revenue streams based on data. Using hands-on SQL examples, you'll also discover how the Snowflake Data Cloud helps you accelerate data science by avoiding replatforming or migrating data unnecessarily. You'll be able to: Efficiently capture, store, and process large amounts of data at an amazing speed Ingest and transform real-time data feeds in both structured and semistructured formats and deliver meaningful data insights within minutes Use Snowflake Time Travel and zero-copy cloning to produce a sensible data recovery strategy that balances system resilience with ongoing storage costs Securely share data and reduce or eliminate data integration costs by accessing ready-to-query datasets available in the Snowflake Marketplace

Unlock Complex and Streaming Data with Declarative Data Pipelines

Unlocking the value of modern data is critical for data-driven companies. This report provides a concise, practical guide to building a data architecture that efficiently delivers big, complex, and streaming data to both internal users and customers. Authors Ori Rafael, Roy Hasson, and Rick Bilodeau from Upsolver examine how modern data pipelines can improve business outcomes. Tech leaders and data engineers will explore the role these pipelines play in the data architecture and learn how to intelligently consider tradeoffs between different data architecture patterns and data pipeline development approaches. You will: Examine how recent changes in data, data management systems, and data consumption patterns have made data pipelines challenging to engineer Learn how three data architecture patterns (event sourcing, stateful streaming, and declarative data pipelines) can help you upgrade your practices to address modern data Compare five approaches for building modern data pipelines, including pure data replication, ELT over a data warehouse, Apache Spark over data lakes, declarative pipelines over data lakes, and declarative data lake staging to a data warehouse

Snowflake Access Control: Mastering the Features for Data Privacy and Regulatory Compliance

Understand the different access control paradigms available in the Snowflake Data Cloud and learn how to implement access control in support of data privacy and compliance with regulations such as GDPR, APPI, CCPA, and SOX. The information in this book will help you and your organization adhere to privacy requirements that are important to consumers and becoming codified in the law. You will learn to protect your valuable data from those who should not see it while making it accessible to the analysts whom you trust to mine the data and create business value for your organization. Snowflake is increasingly the choice for companies looking to move to a data warehousing solution, and security is an increasing concern due to recent high-profile attacks. This book shows how to use Snowflake's wide range of features that support access control, making it easier to protect data access from the data origination point all the way to the presentation and visualization layer.Reading this book helps you embrace the benefits of securing data and provide valuable support for data analysis while also protecting the rights and privacy of the consumers and customers with whom you do business. What You Will Learn Identify data that is sensitive and should be restricted Implement access control in the Snowflake Data Cloud Choose the right access control paradigm for your organization Comply with CCPA, GDPR, SOX, APPI, and similar privacy regulations Take advantage of recognized best practices for role-based access control Prevent upstream and downstream services from subverting your access control Benefit from access control features unique to the Snowflake Data Cloud Who This Book Is For Data engineers, database administrators, and engineering managers who wantto improve their access control model; those whose access control model is not meeting privacy and regulatory requirements; those new to Snowflake who want to benefit from access control features that are unique to the platform; technology leaders in organizations that have just gone public and are now required to conform to SOX reporting requirements