Snowflake: The Definitive Guide, 2nd Edition

2027-05-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Joyce Kaye Avila

AI/ML Analytics Cloud Computing Data Management GenAI Iceberg Cyber Security Snowflake SQL data data-engineering

Snowflake is reshaping data management by integrating AI, analytics, and enterprise workloads into a single cloud platform. Snowflake: The Definitive Guide is a comprehensive resource for data architects, engineers, and business professionals looking to harness Snowflake's evolving capabilities, including Cortex AI, Snowpark, and Polaris Catalog for Apache Iceberg. This updated edition provides real-world strategies and hands-on activities for optimizing performance, securing data, and building AI-driven applications. With hands-on SQL examples and best practices, this book helps readers process structured and unstructured data, implement scalable architectures, and integrate Snowflake's AI tools seamlessly. Whether you're setting up accounts, managing access controls, or leveraging generative AI, this guide equips you with the expertise to maximize Snowflake's potential. Implement AI-powered workloads with Snowflake Cortex Explore Snowsight and Streamlit for no-code development Ensure security with access control and data governance Optimize storage, queries, and computing costs Design scalable data architectures for analytics and machine learning

Data Engineering for Multimodal AI

2026-05-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Vasundra Srinivasan

AI/ML Cloud Computing Data Engineering ETL/ELT MLOps Cyber Security data data-engineering

A shift is underway in how organizations approach data infrastructure for AI-driven transformation. As multimodal AI systems and applications become increasingly sophisticated and data hungry, data systems must evolve to meet these complex demands. Data Engineering for Multimodal AI is one of the first practical guides for data engineers, machine learning engineers, and MLOps specialists looking to rapidly master the skills needed to build robust, scalable data infrastructures for multimodal AI systems and applications. You'll follow the entire lifecycle of AI-driven data engineering, from conceptualizing data architectures to implementing data pipelines optimized for multimodal learning in both cloud native and on-premises environments. And each chapter includes step-by-step guides and best practices for implementing key concepts. Design and implement cloud native data architectures optimized for multimodal AI workloads Build efficient and scalable ETL processes for preparing diverse AI training data Implement real-time data processing pipelines for multimodal AI inference Develop and manage feature stores that support multiple data modalities Apply data governance and security practices specific to multimodal AI projects Optimize data storage and retrieval for various types of multimodal ML models Integrate data versioning and lineage tracking in multimodal AI workflows Implement data-quality frameworks to ensure reliable outcomes across data types Design data pipelines that support responsible AI practices in a multimodal context

Data Engineering with Azure Databricks

2026-04-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Xenia Ireton , Tonya Chernyshova , Dmitry Foshin , Dmitry Anoshin

AI/ML Airflow Analytics Azure ADF Azure DevOps CI/CD Cloud Computing Data Engineering Data Lakehouse Databricks Delta +11 more

Master end-to-end data engineering on Azure Databricks. From data ingestion and Delta Lake to CI/CD and real-time streaming, build secure, scalable, and performant data solutions with Spark, Unity Catalog, and ML tools. Key Features Build scalable data pipelines using Apache Spark and Delta Lake Automate workflows and manage data governance with Unity Catalog Learn real-time processing and structured streaming with practical use cases Implement CI/CD, DevOps, and security for production-ready data solutions Explore Databricks-native ML, AutoML, and Generative AI integration Book Description "Data Engineering with Azure Databricks" is your essential guide to building scalable, secure, and high-performing data pipelines using the powerful Databricks platform on Azure. Designed for data engineers, architects, and developers, this book demystifies the complexities of Spark-based workloads, Delta Lake, Unity Catalog, and real-time data processing. Beginning with the foundational role of Azure Databricks in modern data engineering, you’ll explore how to set up robust environments, manage data ingestion with Auto Loader, optimize Spark performance, and orchestrate complex workflows using tools like Azure Data Factory and Airflow. The book offers deep dives into structured streaming, Delta Live Tables, and Delta Lake’s ACID features for data reliability and schema evolution. You’ll also learn how to manage security, compliance, and access controls using Unity Catalog, and gain insights into managing CI/CD pipelines with Azure DevOps and Terraform. With a special focus on machine learning and generative AI, the final chapters guide you in automating model workflows, leveraging MLflow, and fine-tuning large language models on Databricks. Whether you're building a modern data lakehouse or operationalizing analytics at scale, this book provides the tools and insights you need. What you will learn Set up a full-featured Azure Databricks environment Implement batch and streaming ingestion using Auto Loader Optimize Spark jobs with partitioning and caching Build real-time pipelines with structured streaming and DLT Manage data governance using Unity Catalog Orchestrate production workflows with jobs and ADF Apply CI/CD best practices with Azure DevOps and Git Secure data with RBAC, encryption, and compliance standards Use MLflow and Feature Store for ML pipelines Build generative AI applications in Databricks Who this book is for This book is for data engineers, solution architects, cloud professionals, and software engineers seeking to build robust and scalable data pipelines using Azure Databricks. Whether you're migrating legacy systems, implementing a modern lakehouse architecture, or optimizing data workflows for performance, this guide will help you leverage the full power of Databricks on Azure. A basic understanding of Python, Spark, and cloud infrastructure is recommended.

Streamline Data Quality, MDM, and Data Governance to Achieve More with Less

2026-03-10 · gartner-data-analytics-us-2026

talk

by Andrew White (Gartner)

AI/ML Analytics Data Quality

Organizations are charged with being more productive, and while AI is an answer to many such opportunities, organization and program structure can be far more impactful on productivity than using AI. This session will weave together data and analytics governance, MDM, and data quality into one organized initiative that will simplify complexity. Join this session to learn more.

AI for Data Governance: The Benefits and Risks of Adopting New Technologies

2026-03-10 · gartner-data-analytics-us-2026

talk

by Stephen Kennedy (Gartner)

AI/ML GenAI

Many organizations are wondering how GenAI will impact their D&A governance and management tools, and by extension, their D&A governance operations. This research provides an overview of how key D&A markets are incorporating GenAI capabilities and how these will change the nature of D&A governance.

Trust as the New Currency: A Paradigm Shift in Governance

2026-03-10 · gartner-data-analytics-us-2026

talk

by Guido De Simoni (Gartner)

AI/ML Analytics

Trust in data, strong risk management, robust trust models, and agentic AI are reshaping how enterprises handle data, analytics, compliance, and business value. This session explores the impact of trust on data governance and agentic AI.

Crossroads Debate: Data and Analytics Governance Versus AI Governance

2026-03-09 · gartner-data-analytics-us-2026

talk

by Andrew White (Gartner) , Lauren Kornutick (Gartner)

AI/ML Analytics

Data governance has traditionally encompassed analytics governance, managing most risks and value in traditional analytics. However, AI introduces new risks and considerations that D&A governance may not be equipped for. Should D&A governance evolve to govern AI or is it time for a separate discipline with a fresh mandate? This session explores conflicting accountabilities, leadership and operating models between these disciplines.

Practical Data Governance in 7 Easy Steps

2026-03-09 · gartner-data-analytics-us-2026

talk

by Nate Novosel (Gartner)

There are seven key reasons organizations struggle with effective data governance. This session examines what consistently goes wrong and offers practical solutions to help make data governance work for your organization.

5 Things That Cause Sleepless Nights for Heads of Governance

2026-03-09 · gartner-data-analytics-us-2026

talk

by Sue Waite (Gartner)

We will discuss the latest issues that heads of data governance must address and provide recommendations to manage and capitalize on emerging opportunities. By understanding key drivers and trends, leaders will be equipped to optimize their data governance strategies.

Data Contracts in Practice

2026-02-13 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ryan Collingwood

Data Contracts Data Quality JSON Python SQL YAML data data-engineering

In 'Data Contracts in Practice', Ryan Collingwood provides a detailed guide to managing and formalizing data responsibilities within organizations. Through practical examples and real-world use cases, you'll learn how to systematically address data quality, governance, and integration challenges using data contracts. What this Book will help me do Learn to identify and formalize expectations in data interactions, improving clarity among teams. Master implementation techniques to ensure data consistency and quality across critical business processes. Understand how to effectively document and deploy data contracts to bolster data governance. Explore solutions for proactively addressing and managing data changes and requirements. Gain real-world skills through practical examples using technologies like Python, SQL, JSON, and YAML. Author(s) Ryan Collingwood is a seasoned expert with over 20 years of experience in product management, data analysis, and software development. His holistic techno-social approach, designed to address both technical and organizational challenges, brings a unique perspective to improving data processes. Ryan's writing is informed by his extensive hands-on experience and commitment to enabling robust data ecosystems. Who is it for? This book is ideal for data engineers, software developers, and business analysts working to enhance organizational data integration. Professionals with a familiarity of system design, JSON, and YAML will find it particularly beneficial. Enterprise architects and leadership roles looking to understand data contract implementation and their business impacts will also greatly benefit. Basic understanding of Python and SQL is recommended to maximize learning.

What's on Santa's Sleigh this year?

2025-12-18 · Dec25 - CoPilot Security & What's on Santa's Sleigh this year?

talk

by Clare Edgson (Microsoft) , Laura Graham-Brown (Microsoft)

agents ai power platform

The Christmas elves have been busy packing their sleigh with shiny new features announced at Ignite this year. We will take you through the sparkling baubles of delight that already have your data users demanding more. As the users unwrap the delights of AI driven trumpets the parents in governance look on with dread. Laura and Claire will with a mix of entertaining Christmas themed demos will show off some of the "brilliant" new features including plenty of Agents and AI doing "magic" across your data. Do not fear though we will include how and where switches are to turn off, control and audit what chaos the new toys bring. Demos and Laughter, we will break things so you don't have to! There will be a prize for the best Christmas jumper worn to our session!

AWS re:Invent 2025 - Build an AI-ready data foundation (ANT304)

2025-12-05 · AWS re:Invent 2024 Watch

video

Agile/Scrum AI/ML AWS Cloud Computing GenAI

An unparalleled level of interest in generative AI and agentic AI is driving organizations to rethink their data strategy. While there is a need for data foundation constructs such as data pipelines, data architectures, data stores and data governance to evolve, there are business elements that need to stay constant like cost-efficiency and effectively collaborating across data estates. In this session we will cover how building your data foundation on AWS provides the tools and the building blocks to balance both needs, and empower organizations to grow their data strategy for building AI-ready applications.

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

AWSreInvent #AWSreInvent2025 #AWS

Data Quality and Governance

2025-11-28 · Pydata Bradford - November MeetUp

talk

by Boyang Chen, Ph.D. (Dexcom)

Data Quality

This talk explores how modern data governance software helps organizations manage, trust, and get more value from their data. As businesses collect and analyse information from an ever-growing number of sources, maintaining data quality, consistency, and compliance has become increasingly important. Data governance tools address these challenges by providing centralized platforms for cataloging data, tracking lineage, managing policies, and enforcing standards across the enterprise. They make it easier for teams to discover reliable data, understand how it is being used, and ensure it meets both internal and regulatory requirements.

Jovita Tam: Why Data Governance Is a Way of Thinking, Not a Tool You Purchase

2025-11-24 · Straight Data Talk Listen

podcast_episode

by Jovita Tam , Yuliia Tkachova (Masthead Data) , Dumky de Wilde (MotherDuck)

AI/ML

Jovita Tam, data and AI advisor with a background in engineering, law, and finance, joined Yuliia and Dumke to challenge how organizations approach governance. Jovita argues that data governance is a way of thinking, not a tool you purchase, explaining why culture eats strategy and why most governance programs fail at the checkbox exercise. Jovita shares her approach to helping executives understand that governance should be an enabler, not an obstacle, and why treating it as purely compliance or cost center misses the point entirely. Jovita's Linkedin - https://www.linkedin.com/in/jovitatam/

The Definitive Guide to Microsoft Fabric

2025-11-21 · O'Reilly Data Science Books O'Reilly Amazon

book

by Jean-Pierre Riehl , Romain Casteres (Microsoft) , Emilie BEAU (Microsoft) , Christopher Maneu (Microsoft) , Frederic Gisbert

AI/ML Analytics Data Analytics DataOps DevOps Microsoft Fabric Python Cyber Security SQL analytics-platforms data +2 more

Master Microsoft Fabric from basics to advanced architectures with expert guidance to unify, secure, and scale analytics on real-world data platforms Key Features Build a complete data analytics platform with Microsoft Fabric Apply proven architectures, governance, and security strategies Gain real-world insights from five seasoned data experts Purchase of the print or Kindle book includes a free PDF eBook Book Description Microsoft Fabric is reshaping how organizations manage, analyze, and act on data by unifying ingestion, storage, transformation, analytics, AI, and visualization in a single platform. The Definitive Guide to Microsoft Fabric takes you from your very first workspace to building a secure, scalable, and future-proof analytics environment. You’ll learn how to unify data in OneLake, design data meshes, transform and model data, implement real-time analytics, and integrate AI capabilities. The book also covers advanced topics, such as governance, security, cost optimization, and team collaboration using DevOps and DataOps principles. Drawing on the real-world expertise of five seasoned professionals who have built and advised on platforms for startups, SMEs, and Europe’s largest enterprises, this book blends strategic insight with practical guidance. By the end of this book, you’ll have gained the knowledge and skills to design, deploy, and operate a Microsoft Fabric platform that delivers sustainable business value. What you will learn Understand Microsoft Fabric architecture and concepts Unify data storage and data governance with OneLake Ingest and transform data using multiple Fabric tools Implement real-time analytics and event processing Design effective semantic models and reports Integrate AI and machine learning into data workflows Apply governance, security, and compliance controls Optimize performance and costs at scale Who this book is for This book is for data engineers, analytics engineers, architects, and data analysts moving into platform design roles. It’s also valuable for technical leaders seeking to unify analytics in their organizations. You’ll need only a basic grasp of databases, SQL, and Python.

#332 How to Build AI Your Users Can Trust with David Colwell, VP of AI & ML at Tricentis

2025-11-17 · DataFramed Listen

podcast_episode

by David Colwell (Tricentis) , Richie (DataCamp)

AI/ML LLM

The relationship between data governance and AI quality is more critical than ever. As organizations rush to implement AI solutions, many are discovering that without proper data hygiene and testing protocols, they're building on shaky foundations. How do you ensure your AI systems are making decisions based on accurate, appropriate information? What benchmarking strategies can help you measure real improvement rather than just increased output? With AI now touching everything from code generation to legal documents, the consequences of poor quality control extend far beyond simple errors—they can damage reputation, violate regulations, or even put licenses at risk. David Colwell is the Vice President of Artificial Intelligence and Machine Learning at Tricentis, a global leader in continuous testing and quality engineering. He founded the company’s AI division in 2018 with a mission to make quality assurance more effective and engaging through applied AI innovation. With over 15 years of experience in AI, software testing, and automation, David has played a key role in shaping Tricentis’ intelligent testing strategy. His team developed Vision AI, a patented computer vision–based automation capability within Tosca, and continues to pioneer work in large language model agents and AI-driven quality engineering. Before joining Tricentis, David led testing and innovation initiatives at DX Solutions and OnePath, building automation frameworks and leading teams to deliver scalable, AI-enabled testing solutions. Based in Sydney, he remains focused on advancing practical, trustworthy applications of AI in enterprise software development. In the episode, Richie and David explore AI disasters in legal settings, the balance between AI productivity and quality, the evolving role of data scientists, and the importance of benchmarks and data governance in AI development, and much more. Links Mentioned in the Show: Tricentis 2025 Quality Transformation ReportConnect with DavidCourse: Artificial Intelligence (AI) LeadershipRelated Episode: Building & Managing Human+Agent Hybrid Teams with Karen Ng, Head of Product at HubSpotRewatch RADAR AI New to DataCamp? Learn on the go using the DataCamp mobile appEmpower your business with world-class data and AI skills with DataCamp for business

Data Engineering for Beginners

2025-11-11 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Chisom Nwokwu

Big Data Cloud Computing Data Engineering Data Quality NoSQL Cyber Security data data-engineering

A hands-on technical and industry roadmap for aspiring data engineers In Data Engineering for Beginners, big data expert Chisom Nwokwu delivers a beginner-friendly handbook for everyone interested in the fundamentals of data engineering. Whether you're interested in starting a rewarding, new career as a data analyst, data engineer, or data scientist, or seeking to expand your skillset in an existing engineering role, Nwokwu offers the technical and industry knowledge you need to succeed. The book explains: Database fundamentals, including relational and noSQL databases Data warehouses and data lakes Data pipelines, including info about batch and stream processing Data quality dimensions Data security principles, including data encryption Data governance principles and data framework Big data and distributed systems concepts Data engineering on the cloud Essential skills and tools for data engineering interviews and jobs Data Engineering for Beginners offers an easy-to-read roadmap on a seemingly complicated and intimidating subject. It addresses the topics most likely to cause a beginning data engineer to stumble, clearly explaining key concepts in an accessible way. You'll also find: A comprehensive glossary of data engineering terms Common and practical career paths in the data engineering industry An introduction to key cloud technologies and services you may encounter early in your data engineering career Perfect for practicing and aspiring data analysts, data scientists, and data engineers, Data Engineering for Beginners is an effective and reliable starting point for learning an in-demand skill. It's a powerful resource for everyone hoping to expand their data engineering Skillset and upskill in the big data era.

The Great Data Engineering "Reset": From Pipelines to Agents and Beyond

2025-11-05 · Small Data SF 2025

talk

by Joe Reis (DeepLearning.AI)

AI/ML Analytics Data Engineering Data Management Data Modelling

For years, data engineering was a story of predictable "pipelines": move data from point A to point B. But AI just hit the reset button on our entire field. Now, we're all staring into the void, wondering what's next. While the fundamentals haven't changed, data remains challenging in the traditional areas of data governance, data management, and data modeling, which still present challenges. Everything else is up for grabs. This talk will cut through the noise and explore the future of data engineering in an AI-driven world. We'll examine how team structures will evolve, why agentic workflows and real-time systems are becoming non-negotiable, and how our focus must shift from building dashboards and analytics to architecting for automated action. The reset button has been pushed. It's time for us to invent the future of our industry.

CompTIA Data+ Study Guide, 2nd Edition

2025-11-04 · O'Reilly Data Science Books O'Reilly Amazon

book

by Sharif Nijim , Mike Chapple

Data Quality Data Science DataViz comptia-data comptia data+ data data-science

Prepare for the CompTIA Data+ exam, as well as a new career in data science, with this effective study guide In the newly revised second edition of CompTIA Data+ Study Guide: Exam DA0-002, veteran IT professionals Mike Chapple and Sharif Nijim provide a powerful, one-stop resource for anyone planning to pursue the CompTIA Data+ certification and go on to an exciting new career in data science. The authors walk you through the info you need to succeed on the exam and in your first day at a data science-focused job. Complete with two online practice tests, this book comprehensively covers every objective tested by the updated DA0-002 exam, including databases and data acquisition, data quality, data analysis and statistics, data visualization, and data governance. You'll also find: Efficient and comprehensive content, helping you get up-to-speed as quickly as possible Bite-size chapters that break down essential topics into manageable and accessible lessons Complimentary access to Sybex' famous online learning environment, with practice questions, a complete glossary of common industry terminology, hundreds of flashcards, and more A practical and hands-on pathway to the CompTIA Data+ certification, as well as a new career in data science, the CompTIA Data+ Study Guide, Second Edition, offers the foundational knowledge, skills, and abilities you need to get started in an exciting and rewarding new career.

Unlocking the New Era of Smart Travel with Booking.com and Immuta

2025-10-30 · Snowflake World Tour Amsterdam

session

AI/ML Snowflake

Booking.com has confidently accelerated their AI vision with a solid and smart data governance foundation, built on Snowflake and Immuta. We will discuss Booking.com's AI goals, how agentic AI is reshaping the traveler experience, and the importance of scalable data governance in making it happen. We’ll also discuss how Booking.com’s data marketplace allows them to find, share and provision instant data access, allowing data owners to share data at scale to achieve a truly connected customer experience.

talk-data.com

Data Governance

Activity Trend

Top Events

Top Speakers

Snowflake: The Definitive Guide, 2nd Edition

Data Engineering for Multimodal AI

Data Engineering with Azure Databricks

Streamline Data Quality, MDM, and Data Governance to Achieve More with Less

AI for Data Governance: The Benefits and Risks of Adopting New Technologies

Trust as the New Currency: A Paradigm Shift in Governance

Crossroads Debate: Data and Analytics Governance Versus AI Governance

Practical Data Governance in 7 Easy Steps

5 Things That Cause Sleepless Nights for Heads of Governance

Data Contracts in Practice

What's on Santa's Sleigh this year?

AWS re:Invent 2025 - Build an AI-ready data foundation (ANT304)

AWSreInvent #AWSreInvent2025 #AWS

Data Quality and Governance

Jovita Tam: Why Data Governance Is a Way of Thinking, Not a Tool You Purchase

The Definitive Guide to Microsoft Fabric

#332 How to Build AI Your Users Can Trust with David Colwell, VP of AI & ML at Tricentis

Data Engineering for Beginners

The Great Data Engineering "Reset": From Pipelines to Agents and Beyond

CompTIA Data+ Study Guide, 2nd Edition

Unlocking the New Era of Smart Travel with Booking.com and Immuta