D&A leaders must develop DataOps as an essential practice to redefine their data management operations. This involves establishing business value before pursuing significant data engineering initiatives, and preventing different teams from duplicating effort in managing the shared metadata, security, and observability of information assets within their data platforms.
Topic: DataOps (131 tagged)

Top Events
Master Microsoft Fabric from basics to advanced architectures with expert guidance to unify, secure, and scale analytics on real-world data platforms.

Key Features
- Build a complete data analytics platform with Microsoft Fabric
- Apply proven architectures, governance, and security strategies
- Gain real-world insights from five seasoned data experts
- Purchase of the print or Kindle book includes a free PDF eBook

Book Description
Microsoft Fabric is reshaping how organizations manage, analyze, and act on data by unifying ingestion, storage, transformation, analytics, AI, and visualization in a single platform. The Definitive Guide to Microsoft Fabric takes you from your very first workspace to building a secure, scalable, and future-proof analytics environment. You’ll learn how to unify data in OneLake, design data meshes, transform and model data, implement real-time analytics, and integrate AI capabilities. The book also covers advanced topics such as governance, security, cost optimization, and team collaboration using DevOps and DataOps principles. Drawing on the real-world expertise of five seasoned professionals who have built and advised on platforms for startups, SMEs, and Europe’s largest enterprises, this book blends strategic insight with practical guidance. By the end of this book, you’ll have gained the knowledge and skills to design, deploy, and operate a Microsoft Fabric platform that delivers sustainable business value.

What you will learn
- Understand Microsoft Fabric architecture and concepts
- Unify data storage and data governance with OneLake
- Ingest and transform data using multiple Fabric tools
- Implement real-time analytics and event processing
- Design effective semantic models and reports
- Integrate AI and machine learning into data workflows
- Apply governance, security, and compliance controls
- Optimize performance and costs at scale

Who this book is for
This book is for data engineers, analytics engineers, architects, and data analysts moving into platform design roles. It’s also valuable for technical leaders seeking to unify analytics in their organizations. You’ll need only a basic grasp of databases, SQL, and Python.
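As a flavor of the hands-on work the book describes, here is a minimal sketch of an ingest-and-transform step in a Fabric notebook: read a raw file from the Lakehouse, clean it lightly, and save it as a Delta table that downstream SQL endpoints, semantic models, and reports can consume. The file path, column names, and table name are hypothetical, and in a Fabric notebook a `spark` session is already provided.

```python
# Minimal sketch (assumed names): ingest a raw CSV landed in a Fabric Lakehouse,
# apply a light transformation, and persist it as a Delta table in OneLake.
# In a Fabric notebook `spark` already exists; the builder call below only
# matters if you run the same code outside Fabric.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical path to a file uploaded to the Lakehouse "Files" area.
raw = spark.read.option("header", True).csv("Files/raw/orders.csv")

cleaned = (
    raw.withColumn("order_date", F.to_date("order_date"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropDuplicates(["order_id"])
)

# Save as a managed Delta table so downstream Fabric items can consume it.
cleaned.write.mode("overwrite").format("delta").saveAsTable("silver_orders")
```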
This practical, in-depth guide shows you how to build modern, sophisticated data processes using the Snowflake platform and DataOps.live, the only platform that enables seamless DataOps integration with Snowflake. Designed for data engineers, architects, and technical leaders, it bridges the gap between DataOps theory and real-world implementation, helping you take control of your data pipelines to deliver more efficient, automated solutions. You’ll explore the core principles of DataOps and how they differ from traditional DevOps, while gaining a solid foundation in the tools and technologies that power modern data management, including Git, dbt, and Snowflake. Through hands-on examples and detailed walkthroughs, you’ll learn how to implement your own DataOps strategy within Snowflake and maximize the power of DataOps.live to scale and refine your DataOps processes. Whether you're just starting with DataOps or looking to refine and scale your existing strategies, this book, complete with practical code examples and starter projects, provides the knowledge and tools you need to streamline data operations, integrate DataOps into your Snowflake infrastructure, and stay ahead of the curve in the rapidly evolving world of data management.

What You Will Learn
- Explore the fundamentals of DataOps, its differences from DevOps, and its significance in modern data management
- Understand Git’s role in DataOps and how to use it effectively
- Know why dbt is preferred for DataOps and how to apply it
- Set up and manage DataOps.live within the Snowflake ecosystem
- Apply advanced techniques to scale and evolve your DataOps strategy

Who This Book Is For
Snowflake practitioners, including data engineers, platform architects, and technical managers, who are ready to implement DataOps principles and streamline complex data workflows using DataOps.live.
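To make the "DataOps on Snowflake" idea a little more concrete, the sketch below shows the kind of automated data check a CI pipeline might run using the standard snowflake-connector-python package. It is not the DataOps.live tooling itself, and the account settings, warehouse, database, and table name are placeholders; the point is simply that a data test lives in version control and can fail a pipeline.

```python
# Minimal sketch of a data test run from CI against Snowflake.
# Connection parameters and table names are placeholders; in a real
# pipeline they would come from secrets and pipeline configuration.
import os
import snowflake.connector

def row_count(conn, table: str) -> int:
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        return cur.fetchone()[0]
    finally:
        cur.close()

def main() -> None:
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse="CI_WH",      # hypothetical warehouse
        database="ANALYTICS",   # hypothetical database
        schema="PUBLIC",
    )
    try:
        count = row_count(conn, "ORDERS")
        # Fail the CI job if the freshly built table is unexpectedly empty.
        assert count > 0, "ORDERS is empty; failing the pipeline"
        print(f"ORDERS row count: {count}")
    finally:
        conn.close()

if __name__ == "__main__":
    main()
```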
F. Hoffmann-La Roche is the world’s leading provider of cancer treatments, a major biotech company, the fourth-largest pharmaceutical company, and currently Europe’s third-largest company by market cap.
This session will explore Roche’s Snowflake environment; its approach to Data Mesh, including object tagging as mandatory data governance; Cortex AI, including an MCP server; data observability supporting Data Mesh, with a use-case deep dive and success stories; and the roadmap with the Pharma Technical domain.
Summary
In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Nick Schrock, CTO and founder of Dagster Labs, to discuss Compass - a Slack-native, agentic analytics system designed to keep data teams connected with business stakeholders. Nick shares his journey from initial skepticism to embracing agentic AI as model and application advancements made it practical for governed workflows, and explores how Compass redefines the relationship between data teams and stakeholders by shifting analysts into steward roles, capturing and governing context, and integrating with Slack where collaboration already happens. The conversation covers organizational observability through Compass's conversational system of record, cost control strategies, and the implications of agentic collaboration on Conway's Law, as well as what's next for Compass and Nick's optimistic views on AI-accelerated software engineering.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management.
- Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed: flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI engineering, streaming: Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
- Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
- Your host is Tobias Macey and today I'm interviewing Nick Schrock about building an AI analyst that keeps data teams in the loop.

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Compass is and the story behind it?
- Context repository structure
- How to keep it relevant / avoid sprawl and duplication
- Providing guardrails
- How does a tool like Compass help provide feedback and insights back to the data teams?
- Preparing the data warehouse for effective introspection by the AI
- LLM selection
- Cost management
- Caching/materializing ad-hoc queries
- Why Slack and enterprise chat are important to B2B software
- How AI is changing stakeholder relationships
- How not to overpromise AI capabilities
- How does Compass relate to BI?
- How does Compass relate to Dagster and data infrastructure?
- What are the most interesting, innovative, or unexpected ways that you have seen Compass used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Compass?
- When is Compass the wrong choice?
- What do you have planned for the future of Compass?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it!
Email [email protected] with your story.

Links
- Dagster
- Dagster Labs
- Dagster Plus
- Dagster Compass
- Chris Bergh DataOps Episode
- Rise of Medium Code blog post
- Context Engineering
- Data Steward
- Information Architecture
- Conway's Law
- Temporal durable execution framework

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
AstraZeneca’s future digital ambitions require unlocking the potential of data and AI to accelerate and innovate drug development. However, to enable AI @ scale you need a solid data foundation. Central to this vision is the evolution from fragmented data silos to a robust, scalable approach for building and deploying data products. In this session, we explore what constitutes a data product for AstraZeneca, and how our journey began with significant challenges: a complex data mess exacerbated by regulatory pressures, isolated working models, and traditional waterfall development methods.
We will explore the solutions that drove transformation, including a new operating model, rapid prototyping methods, and the implementation of DataOps.live on Snowflake. These changes have drastically cut the time needed to deliver a data product, reducing it from 4-6 months to just 4-6 days. This swift progress is unlocking significant business value and revolutionizing team collaboration across the organization.
Looking ahead, AstraZeneca is poised to expand these foundations by embracing knowledge graphs and scalable AI solutions, further amplifying the impact of data products on drug development and operations.
AI is only as good as the data it runs on. Yet Gartner predicts that in 2026 over 60% of AI projects will fail to deliver value because the underlying data isn’t truly AI-ready. MIT is even more concerned. “Good enough” data simply isn’t enough.
At this World Tour launch event, DataOps.live reveals Momentum, the next generation of its DataOps automation platform, designed to operationalize trusted AI at enterprise scale on Snowflake. Based on experience from building over 9,000 Data Products to date, Momentum introduces breakthrough capabilities including AI-Ready Data Scoring to ensure data is fit for AI use cases, Data Product Lineage for end-to-end visibility, and a Data Engineering Agent that accelerates building reusable data products. Combined with automated CI/CD, continuous observability, and governance enforcement, Momentum closes the AI-readiness gap by embedding collaboration, metadata, and automation across the entire data lifecycle. Backed by Snowflake Ventures and trusted by leading enterprises including AstraZeneca, Disney, and AT&T, DataOps.live is the proven catalyst for scaling AI-ready data. In this session, you’ll unpack what AI-ready data really means, learn essential practices, and discover a faster, easier, and more impactful way to make your AI initiatives succeed. Be the first to see Momentum in action: the future of AI-ready data.
Skrub is an open source package that simplifies machine learning with dataframes by providing a variety of tools to explore, prepare, and feature-engineer dataframes so they can be integrated into scikit-learn pipelines. Skrub DataOps let you build extensive, multi-table wrangling plans, explore hyperparameter spaces, and export the resulting objects for deployment. The talk showcases various use cases where skrub can simplify the job of a data scientist, from data preparation to deployment, through code examples and demonstrations.
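For a flavor of what this looks like in code, here is a minimal sketch using skrub's long-standing TableVectorizer inside a scikit-learn pipeline; the dataframe, column names, and target values are invented, and the newer skrub DataOps plan-building API layers multi-table wrangling on top of the same basic idea.

```python
# Minimal sketch: feature-engineer a messy dataframe with skrub and
# fit a scikit-learn model on it. Data and column names are invented.
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.pipeline import make_pipeline
from skrub import TableVectorizer

df = pd.DataFrame(
    {
        "employee_position_title": ["Office Aide", "Master Police Officer", "Office Aide"],
        "date_first_hired": ["09/12/1988", "01/26/2007", "04/03/2015"],
        "department": ["POL", "POL", "HHS"],
    }
)
y = [25_000.0, 95_000.0, 32_000.0]

# TableVectorizer picks a sensible encoder per column type
# (dates, strings, categories, numbers), so the raw dataframe
# can be dropped straight into a scikit-learn pipeline.
model = make_pipeline(TableVectorizer(), HistGradientBoostingRegressor())
model.fit(df, y)
print(model.predict(df))
```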
AI is only as good as the data it runs on. Yet Gartner predicts that in 2026 over 60% of AI projects will fail to deliver value because the underlying data isn’t truly AI-ready. “Good enough” data isn’t enough.
In this exclusive BDL launch session, DataOps.live reveals Momentum, the next generation of its DataOps automation platform, designed to operationalize trusted AI at enterprise scale.
Based on experience from building over 9,000 Data Products to date, Momentum introduces breakthrough capabilities including AI-Ready Data Scoring to ensure data is fit for AI use cases, Data Product Lineage for end-to-end visibility, and a Data Engineering Agent that accelerates building reusable data products. Combined with automated CI/CD, continuous observability, and governance enforcement, Momentum closes the AI-readiness gap by embedding collaboration, metadata, and automation across the entire data lifecycle.
Backed by Snowflake Ventures and trusted by leading enterprises including AstraZeneca, Disney, and AT&T, DataOps.live is the proven catalyst for scaling AI-ready data. In this session, you’ll unpack what AI-ready data really means, learn essential practices, and discover a faster, easier, and more impactful way to make your AI initiatives succeed.
Be the first to see Momentum in action: the future of AI-ready data.
Get ready for a customer story that’s as bold as it is eye-opening. In this session, Eutelsat and DataOps.live pull back the curtain on what it really takes to deliver business-changing outcomes, with a specific focus on the use cases addressed, with Apache Iceberg at the core. And these use cases are big: think big, big numbers, and you still aren’t even close!
You’ll hear the inside story of how Eutelsat found itself with two “competing” cloud data platforms. What could have been an expensive headache turned out to be an advantage: Iceberg made it not only possible but cheaper and simpler to use both together, unlocking agility and cost savings that no single platform alone could provide.
The impact is already tangible. Telemetry pipelines are live and delivering massive value. Next up: interoperable Data Products seamlessly moving from Snowflake to Cloudera and vice versa, driving cross-platform innovation. And that’s just the start—Eutelsat is also positioning Iceberg as a future-proof standard for data sharing and export.
This is a story of scale, speed, and simplification—the kind of transformation only possible when a visionary team meets the right technology.
Ten years ago, I began advocating for **DataOps**, a framework designed to improve collaboration, efficiency, and agility in data management. The industry was still grappling with fragmented workflows, slow delivery cycles, and a disconnect between data teams and business needs. Fast forward to today, and the landscape has transformed, but have we truly embraced the future of leveraging data at scale? This session will reflect on the evolution of DataOps, examining what’s changed, what challenges persist, and where we're headed next.
**Key Takeaways:**
✅ The biggest wins and ongoing struggles in implementing DataOps over the last decade.
✅ Practical strategies for improving automation, governance, and data quality in modern workflows.
✅ How emerging trends like AI-driven automation and real-time analytics are reshaping the way we approach data management.
✅ Actionable insights on how data teams can stay agile and align better with business objectives.
**Why Attend?**
If you're a data professional, architect, or leader striving for operational excellence, this talk will equip you with the knowledge to future-proof your data strategies.
DataOps and, more recently, Data Products have only been around for a relatively short time.
However, the collective experience of those working in this area is now large enough that clear patterns and trends have emerged, along with a regular set of misconceptions!
In this session, Keith Belanger, DataOps.live Field CTO and multi-decade practitioner; Paul Rankin, former Head of Data Platforms and Governance at Roche Diagnostics, multi-decade practitioner, and expert in Data Mesh and Data Products; and Guy Adams, DataOps.live co-founder and author of DataOps for Dummies and Data Products for Dummies, meet to discuss the top myths and misconceptions they see and give the real facts!
Moving AI projects from pilot to production requires substantial effort for most enterprises. AI engineering provides the foundation for enterprise delivery of AI and generative AI solutions at scale by unifying DataOps, MLOps, and DevOps practices. This session will highlight AI engineering best practices across these dimensions, covering people, processes, and technology.
The role of data teams and data engineers is evolving. No longer just pipeline builders or dashboard creators, today’s data teams must drive business strategy, enable automation, and scale with growing demands. Best practices from the software engineering world and the DevOps movement (Agile development, CI/CD, and infrastructure-as-code) are gradually making their way into data engineering. We believe these changes have led to the rise of DataOps and a new wave of best practices that will transform the discipline of data engineering. But how do you transform a reactive team into a proactive force for innovation? We’ll explore the key principles for building a resilient, high-impact data team, from structuring for collaboration, testing, and automation to leveraging modern orchestration tools. Whether you’re leading a team or looking to future-proof your career, you’ll walk away with actionable insights on how to stay ahead in the rapidly changing data landscape.
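One concrete way these software engineering habits show up in day-to-day data work is treating expectations about the data as tests that run in CI. The sketch below is a hypothetical pytest-style check over a pandas dataframe; the table, columns, and rules are invented, and teams often layer dedicated data quality frameworks on the same pattern.

```python
# Minimal sketch of "data as code" testing: assertions a CI job can run
# against a freshly built table. Names and rules are illustrative only.
import pandas as pd

def load_orders() -> pd.DataFrame:
    # Stand-in for reading the real table from the warehouse or lake.
    return pd.DataFrame(
        {
            "order_id": [1, 2, 3],
            "amount": [10.0, 25.5, 7.25],
            "status": ["shipped", "pending", "shipped"],
        }
    )

def test_orders_have_unique_ids():
    df = load_orders()
    assert df["order_id"].is_unique

def test_amounts_are_positive():
    df = load_orders()
    assert (df["amount"] > 0).all()

def test_status_values_are_known():
    df = load_orders()
    assert set(df["status"]).issubset({"pending", "shipped", "cancelled"})
```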
This course provides a comprehensive review of DevOps principles and their application to Databricks projects. It begins with an overview of core DevOps, DataOps, continuous integration (CI), continuous deployment (CD), and testing, and explores how these principles can be applied to data engineering pipelines. The course then focuses on continuous deployment within the CI/CD process, examining tools like the Databricks REST API, SDK, and CLI for project deployment. You will learn about Databricks Asset Bundles (DABs) and how they fit into the CI/CD process. You’ll dive into their key components, folder structure, and how they streamline deployment across various target environments in Databricks. You will also learn how to add variables, modify, validate, deploy, and execute Databricks Asset Bundles for multiple environments with different configurations using the Databricks CLI. Finally, the course introduces Visual Studio Code as an interactive development environment (IDE) for building, testing, and deploying Databricks Asset Bundles locally, optimizing your development process. The course concludes with an introduction to automating deployment pipelines using GitHub Actions to enhance the CI/CD workflow with Databricks Asset Bundles. By the end of this course, you will be equipped to automate Databricks project deployments with Databricks Asset Bundles, improving efficiency through DevOps practices.

Pre-requisites: Strong knowledge of the Databricks platform, including experience with Databricks Workspaces, Apache Spark, Delta Lake, the Medallion Architecture, Unity Catalog, Delta Live Tables, and Workflows. In particular, knowledge of leveraging Expectations with Lakeflow Declarative Pipelines.

Labs: Yes

Certification Path: Databricks Certified Data Engineer Professional
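The CLI-driven bundle workflow (`databricks bundle validate`, `deploy`, and `run`) is the course's main deployment path. Purely as an illustration of the SDK mentioned above, the hedged sketch below uses the official databricks-sdk for Python to locate and trigger a job, for example one that an asset bundle has already deployed. The job name is hypothetical, and authentication is assumed to come from the environment (e.g. DATABRICKS_HOST and DATABRICKS_TOKEN).

```python
# Minimal sketch using the official databricks-sdk for Python.
# The job name is made up; a bundle-based workflow would normally
# drive this via `databricks bundle deploy` and `databricks bundle run`.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads host/token from the environment or config profile

# Find a job, for example one defined and deployed by an asset bundle.
target = None
for job in w.jobs.list():
    if job.settings and job.settings.name == "nightly_etl":  # hypothetical name
        target = job
        break

if target is None:
    raise SystemExit("Job 'nightly_etl' not found in this workspace")

# Trigger a run; in CI you might also poll or wait on the returned run handle.
w.jobs.run_now(job_id=target.job_id)
print(f"Triggered run of job {target.job_id}")
```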