Data engineering is undergoing a fundamental shift. In this episode, I sit down with Nick Schrock, founder and CTO of Dagster, to discuss why he went from being an "AI moderate" to believing 90% of code will be written by AI. Being hands-on also led to a massive pivot in Dagster’s roadmap and a new focus on managing and engineering context. We dive deep into why simply feeding data to LLMs isn't enough. Nick explains why real-time context tools (like MCPs) can become "token hogs" that lack precision and why the future belongs to "context pipelines": offline, batch-computed context that is governed, versioned, and treated like code. We also explore Compass, Dagster’s new collaborative agent that lives in Slack, bridging the gap between business stakeholders and data teams. If you’re wondering how your role as a data engineer will evolve in an agentic world, this conversation maps out the territory.
Dagster: dagster.io
Nick Schrock on X: @schrockn
There's no shortage of technical content for data engineers, but a massive gap exists when it comes to the non-technical skills required to advance beyond a senior role. I sit down with Yordan Ivanov, Head of Data Engineering and writer of "Data Gibberish," to talk about this disconnect. We dive into his personal journey of failing as a manager the first time, learning the crucial "people" skills, and his current mission to help data engineers learn how to speak the language of business.
Key areas we explore:
The Senior-Level Content Gap: Yordan explains why his non-technical content on career strategy and stakeholder communication gets "terrible" engagement compared to technical posts, even though it's what's needed to advance.
The Managerial Trap: Yordan's candid story about his first attempt at management, where he failed because he cared only about code and wasn't equipped for the people-centric aspects and politics of the role.
The Danger of AI Over-reliance: A deep discussion on how leaning too heavily on AI can prevent the development of fundamental thinking and problem-solving skills, both in coding and in life.
The Maturing Data Landscape: We reflect on the end of the "modern data stack euphoria" and what the wave of acquisitions means for innovation and the future of data tooling.
AI Adoption in Europe vs. the US: A look at how AI adoption is perceived as massive and mandatory in Europe, while US census data shows surprisingly low enterprise adoption rates.
In this episode, I sit down with Saket Saurabh (CEO of Nexla) to discuss the fundamental shift happening in the AI landscape. The conversation is moving beyond the race to build the biggest foundational models and towards a new battleground: context. We explore what it means to be a "model company" versus a "context company" and how this changes everything for data strategy and enterprise AI.
Join us as we cover:
Model vs. Context Companies: The emerging divide between companies building models (like OpenAI) and those whose advantage lies in their unique data and integrations.
The Limits of Current Models: Why we might be hitting an asymptote with the current transformer architecture for solving complex, reliable business processes.
"Context Engineering": What this term really means, from RAG to stitching together tools, data, and memory to feed AI systems.
The Resurgence of Knowledge Graphs: Why graph databases are becoming critical for providing deterministic, reliable information to probabilistic AI models, moving beyond simple vector similarity.
AI's Impact on Tooling: How tools like Lovable and Cursor are changing workflows for prototyping and coding, and the risk of creating the "-10x engineer."
The Future of Data Engineering: How the field is expanding as AI becomes the primary consumer of data, requiring a new focus on architecture, semantics, and managing complexity at scale.
In this episode, I sit down with Ole to discuss his new book, "Fundamentals of Metadata Management." We move past the simple definition of "data about data" to a more nuanced view of metadata as something that exists in two places at once, serving as a pointer to find information elsewhere. Ole introduces his core concept of the "MetaGrid": the interconnected, yet siloed, web of metadata repositories that already exists within every large organization across various teams and technologies. He argues that the key to better metadata management is not to build a new monolithic system but to recognize, document, and integrate the MetaGrid that's already there, hiding in plain sight. The conversation also covers the impact of the AI hype cycle, the lessons learned from the Data Mesh movement, the sociological incentives that help or hinder metadata projects, and the cultural clash between the worlds of data engineering and library science.
Matt Housley joins me to chat about whether it matters that AI is PhD level, clanker content (the new term for AI slop), a retrospective on Fundamentals of Data Engineering, and much more.
Peter Hanssens is an Australia-based data engineer, business owner, and community pillar. He runs Cloud Shuttle, a data engineering consultancy, and organizes DataEngBytes, a series of meetups and conferences throughout Australia and New Zealand.
We chat about building data engineering communities, running conferences, and much more.
There have been lots of social media posts declaring things to be dead - SQL, R, data engineering, BI, etc.
I give my thoughts on these proclamations, why this is the wrong way to think about our space, and more.
Matthew Scullion (CEO, Co-Founder of Matillion) joins me to chat about the future of data engineering, namely agentic data engineering teams.
What does this new world look like? Matthew shares some ideas of what he's building at Matillion, and the broader context of what agentic AI means for the data ecosystem, teams, and workflows.
Some people speculate that AI will make software and data engineers obsolete. If the only thing engineers do is write code, sure.
But we do a lot more than that, and I believe we'll actually need more engineers, not fewer.
In this episode, I discuss how I think AI will change the craft of software and data engineering. Spoiler - I think it will make it way more fun and productive.
Thanks to dbt and GoodData for sponsoring this episode. Please support them, as they're awesome.
dbt Launch Showcase: Join dbt Labs on May 28 for the dbt Launch Showcase to hear from executives and product leaders about the latest features landing in dbt. See firsthand how these features will empower data practitioners and organizations in the age of AI.
GoodData Webinar: Analytics and data engineering used to live in separate worlds: different teams, different tools, different goals. But the lines are blurring fast. As modern data products demand speed, scale, and seamless integration, the best teams are embracing engineering principles and best practices. In this no-BS conversation, Ryan Dolley, Matt Housley, and Joe Reis dive into how engineering principles are transforming the way analytics is built, delivered, and scaled.
📆 May 27, 2025
🕘 9:00 AM PDT, 12:00 PM EDT, 6:00 PM CEST
Juhani Vanhatapio lives in Finland at the Arctic Circle and is studying data engineering and machine learning. Juhani shares stories about life in the Arctic, the challenges and fun of guiding tours to see the Northern Lights, his AI Assistant for Northern Lights tourism, and his journey into the tech field during the COVID pandemic. This is definitely a very interesting and left-field conversation you'll enjoy.
People often ask me what I'd change in Fundamentals of Data Engineering. Usually, I reply "not much", as the Data Engineering Lifecycle remains intact. However, I see the role of data engineers shifting both left and right. What does this mean? Have a listen.
Willis Nana and I chat about the challenges of data engineering leadership, foundational skills, and his journey to becoming a content creator on YouTube.
#dataengineering #data #ai #datateam #leadership
Simon Späti and I discuss various aspects of writing, data engineering, and the impact of AI on the writing process. Simon shares his journey from business intelligence to data engineering and his current focus on writing. We also discuss the future of writing in the age of AI. Enjoy!
It's 2025! We made it! ;)
In this podcast, I rant about why data modeling matters more than ever, AI, and why humans will seek out "human" things in 2025 and beyond.
❤️ Your support means a lot. Please like and rate this podcast on your favorite podcast platform.
🤓 My works:
📕Fundamentals of Data Engineering: https://www.oreilly.com/library/view/fundamentals-of-data/9781098108298/
🎥 Deeplearning.ai Data Engineering Certificate: https://www.coursera.org/professional-certificates/data-engineering
🔥Practical Data Modeling: https://practicaldatamodeling.substack.com/
🤓 My SubStack: https://joereis.substack.com/
It's December 31, 2024. Gordon Wong and I wrap up 2024 and chat about what we're excited about in 2025 in data and otherwise.
❤️ If you like my podcasts, please like and rate them on your favorite podcast platform.
Matt Housley and I have a LONG chat about working in consulting, leaving your job, AI, the job market, our thoughts on what's coming in 2025, and much more.
This morning, a great article came across my feed that gave me PTSD. It asks whether Iceberg is the Hadoop of the Modern Data Stack.
In this rant, I bring the discussion back to a central question you should ask with any hot technology - do you need it at all? Do you need a tool built for the top 1% of companies at a sufficient data scale? Or is a spreadsheet good enough?
Link: https://blog.det.life/apache-iceberg-the-hadoop-of-the-modern-data-stack-c83f63a4ebb9
People often ask me for career advice. In a tough job market where people are sending out thousands of resumes and hearing nothing back, I notice a lot of people have weak networks and are unknown to the companies they're applying to. This results in lots of frustration and disappointment for job seekers.
Is there a better way? Yes. People need to know who you are. Obscurity is your enemy.
Also, the name of the Friday show changed because I can't seem to keep things to five minutes ;)
Let's do things the right way, not just the fast way.
I speak at a lot of conferences, and I've lost track of how many questions I've answered. Since conferences are top of mind for me right now, here are some tips for asking good (and bad) questions of speakers.