Data engineering is undergoing a fundamental shift. In this episode, I sit down with Nick Schrock, founder and CTO of Dagster, to discuss why he went from being an "AI moderate" to believing 90% of code will be written by AI. Being hands-on also led to a massive pivot in Dagster’s roadmap and a new focus on managing and engineering context. We dive deep into why simply feeding data to LLMs isn't enough. Nick explains why real-time context tools (like MCPs) can become "token hogs" that lack precision and why the future belongs to "context pipelines": offline, batch-computed context that is governed, versioned, and treated like code. We also explore Compass, Dagster’s new collaborative agent that lives in Slack, bridging the gap between business stakeholders and data teams. If you’re wondering how your role as a data engineer will evolve in an agentic world, this conversation maps out the territory. Dagster: dagster.io. Nick Schrock on X: @schrockn
For years, data engineering was a story of predictable "pipelines": move data from point A to point B. But AI just hit the reset button on our entire field. Now, we're all staring into the void, wondering what's next. The fundamentals haven't changed: data governance, data management, and data modeling remain as challenging as ever. Everything else is up for grabs. This talk will cut through the noise and explore the future of data engineering in an AI-driven world. We'll examine how team structures will evolve, why agentic workflows and real-time systems are becoming non-negotiable, and how our focus must shift from building dashboards and analytics to architecting for automated action. The reset button has been pushed. It's time for us to invent the future of our industry.
There's no shortage of technical content for data engineers, but a massive gap exists when it comes to the non-technical skills required to advance beyond a senior role. I sit down with Yordan Ivanov, Head of Data Engineering and writer of "Data Gibberish," to talk about this disconnect. We dive into his personal journey of failing as a manager the first time, learning the crucial "people" skills, and his current mission to help data engineers learn how to speak the language of business. Key areas we explore:
The Senior-Level Content Gap: Yordan explains why his non-technical content on career strategy and stakeholder communication gets "terrible" engagement compared to technical posts, even though it's what's needed to advance.
The Managerial Trap: Yordan's candid story about his first attempt at management, where he failed because he cared only about code and wasn't equipped for the people-centric aspects and politics of the role.
The Danger of AI Over-reliance: A deep discussion on how leaning too heavily on AI can prevent the development of fundamental thinking and problem-solving skills, both in coding and in life.
The Maturing Data Landscape: We reflect on the end of the "modern data stack euphoria" and what the wave of acquisitions means for innovation and the future of data tooling.
AI Adoption in Europe vs. the US: A look at how AI adoption is perceived as massive and mandatory in Europe, while US census data shows surprisingly low enterprise adoption rates.
In this episode, I sit down with Saket Saurabh (CEO of Nexla) to discuss the fundamental shift happening in the AI landscape. The conversation is moving beyond the race to build the biggest foundational models and towards a new battleground: context. We explore what it means to be a "model company" versus a "context company" and how this changes everything for data strategy and enterprise AI.
Join us as we cover: Model vs. Context Companies: The emerging divide between companies building models (like OpenAI) and those whose advantage lies in their unique data and integrations. The Limits of Current Models: Why we might be hitting an asymptote with the current transformer architecture for solving complex, reliable business processes. "Context Engineering": What this term really means, from RAG to stitching together tools, data, and memory to feed AI systems. The Resurgence of Knowledge Graphs: Why graph databases are becoming critical for providing deterministic, reliable information to probabilistic AI models, moving beyond simple vector similarity. AI's Impact on Tooling: How tools like Lovable and Cursor are changing workflows for prototyping and coding, and the risk of creating the "-10x engineer." The Future of Data Engineering: How the field is expanding as AI becomes the primary consumer of data, requiring a new focus on architecture, semantics, and managing complexity at scale.
In this episode, I sit down with Ole to discuss his new book, "Fundamentals of Metadata Management." We move past the simple definition of "data about data" to a more nuanced view of metadata as something that exists in two places at once, serving as a pointer to find information elsewhere. Ole introduces his core concept of the "MetaGrid": the interconnected, yet siloed, web of metadata repositories that already exists within every large organization across various teams and technologies. He argues that the key to better metadata management is not to build a new monolithic system but to recognize, document, and integrate the MetaGrid that's already there, hiding in plain sight. The conversation also covers the impact of the AI hype cycle, the lessons learned from the Data Mesh movement, the sociological incentives that help or hinder metadata projects, and the cultural clash between the worlds of data engineering and library science.
Matt Housley joins me to chat about whether it matters that AI is PhD level, clanker content (the new term for AI slop), a retrospective on Fundamentals of Data Engineering, and much more.
Tired of spending money on data courses you never finish? Here are 7 essential books that will actually boost your analytical skills, with no subscription required! Plus, make sure to tune in till the end as one lucky listener will get a free book from this list!
Get the books here!
DISCLAIMER: Some of the links in this video are affiliate links, meaning if you click through and make a purchase, I may earn a commission at no extra cost to you.
Storytelling with Data by Cole Nussbaumer Knaflic 👉 https://amzn.to/3ZYHhsG
Ace the Data Science Interview by Nick Singh and Kevin Huo 👉 https://amzn.to/3XZ9IaB
Moneyball by Michael Lewis 👉 https://amzn.to/44fy4OD
The StatQuest Illustrated Guide To Machine Learning by Josh Starmer 👉 https://amzn.to/40hRgu2
Fundamentals of Data Engineering by Joe Reis and Matt Housley 👉 https://amzn.to/3W84K8K
Data Science for Business by Foster Provost and Tom Fawcett 👉 https://amzn.to/4k7jkaD
The Big Book of Dashboards by Steve Wexler, Jeffrey Shaffer, and Andy Cotgreave 👉 https://amzn.to/462GJVj
💌 Join 10k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter
🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training
👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa
👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com/interviewsimulator
⌚ TIMESTAMPS
00:16 Book 1: The Big Book of Dashboards
02:52 Book 2: Data Science for Business
04:38 Book 3: Fundamentals of Data Engineering
06:05 Book 4: The StatQuest Illustrated Guide To Machine Learning
07:52 Book 5: Moneyball
10:09 Book 6: Ace the Data Science Interview
11:24 Book 7: Storytelling With Data
I've interviewed some of these awesome data authors! Check out these episodes!
Stats You Need to Know as a Data Analyst (w/ StatQuest) 👉 https://datacareerpodcast.com/episode/105-do-you-have-to-be-good-at-statistics-to-be-a-data-analyst-w-statquest-josh-starmer-phd
How to Ace The Data Science & Analytics Interview w/ Nick Singh 👉 https://datacareerpodcast.com/episode/74-how-to-ace-the-data-science-analytics-interview-w-nick-singh
Meet The Woman Who Changed Data Storytelling Forever (Cole Knaflic) 👉 https://datacareerpodcast.com/episode/142-meet-the-woman-who-changed-data-storytelling-forever-cole-knafflic
🔗 CONNECT WITH AVERY
🎥 YouTube Channel: https://www.youtube.com/@averysmith
🤝 LinkedIn: https://www.linkedin.com/in/averyjsmith/
📸 Instagram: https://instagram.com/datacareerjumpstart
🎵 TikTok: https://www.tiktok.com/@verydata
💻 Website: https://www.datacareerjumpstart.com/
Mentioned in this episode:
Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!
To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more
If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.
👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa
There have been lots of social media posts declaring things to be dead: SQL, R, data engineering, BI, etc.
I give my thoughts on these proclamations, why this is the wrong way to think about our space, and more.
Matthew Scullion (CEO, Co-Founder of Matillion) joins me to chat about the future of data engineering, namely agentic data engineering teams.
What does this new world look like? Matthew shares some ideas of what he's building at Matillion, and the broader context of what agentic AI means for the data ecosystem, teams, and workflows.
Some people speculate that AI will make software and data engineers obsolete. If the only thing engineers do is write code, sure.
But we do a lot more than that, and I believe we'll actually need more engineers, not fewer.
In this episode, I discuss how I think AI will change the craft of software and data engineering. Spoiler - I think it will make it way more fun and productive.
Thanks to dbt and GoodData for sponsoring this episode. Please support them, as they're awesome.
dbt Launch Showcase: Join dbt Labs on May 28 for the dbt Launch Showcase to hear from executives and product leaders about the latest features landing in dbt. See firsthand how these features will empower data practitioners and organizations in the age of AI.
GoodData Webinar: Analytics and data engineering used to live in separate worlds, with different teams, different tools, and different goals. But the lines are blurring fast. As modern data products demand speed, scale, and seamless integration, the best teams are embracing engineering principles and best practices. In this no-BS conversation, Ryan Dolley, Matt Housley, and Joe Reis dive into how engineering principles are transforming the way analytics is built, delivered, and scaled. 📆 May 27, 2025 🕘 9:00 AM PDT, 12:00 PM EDT, 6:00 PM CEST 🔗 Register here!
Juhani Vanhatapio lives in Finland at the Arctic Circle and is studying data engineering and machine learning. Juhani shares stories about life in the Arctic, the challenges and fun of guiding tours to see the Northern Lights, his AI Assistant for Northern Lights tourism, and his journey into the tech field during the COVID pandemic. This is definitely a very interesting and left-field conversation you'll enjoy.
People often ask me what I'd change in Fundamentals of Data Engineering. Usually, I reply "not much", as the Data Engineering Lifecycle remains intact. However, I see the role of data engineers shifting both left and right. What does this mean? Have a listen.
Willis Nana and I chat about the challenges of data engineering leadership, foundational skills, and his journey to becoming a content creator on YouTube. #dataengineering #data #ai #datateam #leadership
Simon Späti and I discuss various aspects of writing, data engineering, and the impact of AI on the writing process. Simon shares his journey from business intelligence to data engineering and his current focus on writing. We also discuss the future of writing in the age of AI. Enjoy!
It's 2025! We made it! ;)
In this podcast, I rant about why data modeling matters more than ever, AI, and why humans will seek out "human" things in 2025 and beyond.
❤️ Your support means a lot. Please like and rate this podcast on your favorite podcast platform.
🤓 My works:
📕Fundamentals of Data Engineering: https://www.oreilly.com/library/view/fundamentals-of-data/9781098108298/
🎥 Deeplearning.ai Data Engineering Certificate: https://www.coursera.org/professional-certificates/data-engineering
🔥Practical Data Modeling: https://practicaldatamodeling.substack.com/
🤓 My SubStack: https://joereis.substack.com/
It's December 31, 2024. Gordon Wong and I wrap up 2024 and chat about what we're excited about in 2025 in data and otherwise.
❤️ If you like my podcasts, please like and rate them on your favorite podcast platform.
Matt Housley and I have a LONG chat about working in consulting, leaving your job, AI, the job market, our thoughts on what's coming in 2025, and much more.
This morning, a great article came across my feed that gave me PTSD. It asks whether Iceberg is the Hadoop of the Modern Data Stack.
In this rant, I bring the discussion back to a central question you should ask with any hot technology - do you need it at all? Do you need a tool built for the top 1% of companies at a sufficient data scale? Or is a spreadsheet good enough?
Link: https://blog.det.life/apache-iceberg-the-hadoop-of-the-modern-data-stack-c83f63a4ebb9