It's Friday! Matt Housley and I catch up to discuss the aftermath of AWS re:Invent and why the industry’s obsession with AI Agents might be premature. We also dive deep into the hardware wars between Google and NVIDIA, the "brain-damaged" nature of current LLMs, and the growing "enshittification" of the internet and platforms like LinkedIn. Plus, I reveal some details about my upcoming "Mixed Model Arts" project.
talk-data.com
What happens when a best-selling author and "recovering data scientist" gets a microphone? This podcast.
I'm Joe Reis, and each week I broadcast from wherever I am in the world, sharing candid thoughts on the data, tech, and AI industry.
Sometimes it's a solo rant. Other times, I'm chatting with the smartest people I know.
If you're looking for an unfiltered perspective on the state of AI, data, and tech, you've found it.
Sessions & talks
From Data Engineering to Context Engineering w/ Nick Schrock
Data engineering is undergoing a fundamental shift. In this episode, I sit down with Nick Schrock, founder and CTO of Dagster, to discuss why he went from being an "AI moderate" to believing 90% of code will be written by AI. Being hands on also led to a massive pivot in Dagster’s roadmap and a new focus on managing and engineering context. We dive deep into why simply feeding data to LLMs isn't enough. Nick explains why real-time context tools (like MCPs) can become "token hogs" that lack precision and why the future belongs to "context pipelines": offline, batch-computed context that is governed, versioned, and treated like code. We also explore Compass, Dagster’s new collaborative agent that lives in Slack, bridging the gap between business stakeholders and data teams. If you’re wondering how your role as a data engineer will evolve in an agentic world, this conversation maps out the territory.
Dagster: dagster.io
Nick Schrock on X: @schrockn
Jeremiah Lowin, founder of Prefect, returns to the show to discuss the seismic shift in the data and AI landscape since our last conversation a few years ago. He shares the wild origin story of FastMCP, a project he started to create a more "Pythonic" wrapper for Anthropic's Model Context Protocol (MCP).
Jeremiah explains how this side project was incorporated into Anthropic's official SDK and then exploded to over a million downloads a day after MCP gained support from OpenAI and Google. He clarifies why this is a complementary expansion for Prefect, not a pivot, and provides a simple analogy for MCP as the "USB-C for AI agents". Most surprisingly, Jeremiah reveals that the primary adoption of MCP isn't for external products, but internally by data teams who are using it to finally fulfill the promise of the self-serve semantic layer and create a governable, "LLM-free zone" for AI tools.
The Rise of the Context Company: Reshaping Data Engineering with Saket Saurabh
In this episode, I sit down with Saket Saurabh (CEO of Nexla) to discuss the fundamental shift happening in the AI landscape. The conversation is moving beyond the race to build the biggest foundational models and towards a new battleground: context. We explore what it means to be a "model company" versus a "context company" and how this changes everything for data strategy and enterprise AI.
Join us as we cover:
Model vs. Context Companies: The emerging divide between companies building models (like OpenAI) and those whose advantage lies in their unique data and integrations.
The Limits of Current Models: Why we might be hitting an asymptote with the current transformer architecture for solving complex, reliable business processes.
"Context Engineering": What this term really means, from RAG to stitching together tools, data, and memory to feed AI systems.
The Resurgence of Knowledge Graphs: Why graph databases are becoming critical for providing deterministic, reliable information to probabilistic AI models, moving beyond simple vector similarity.
AI's Impact on Tooling: How tools like Lovable and Cursor are changing workflows for prototyping and coding, and the risk of creating the "-10x engineer."
The Future of Data Engineering: How the field is expanding as AI becomes the primary consumer of data, requiring a new focus on architecture, semantics, and managing complexity at scale.
Why AI Won't Fix Your Legacy Code & The Dangers of "Vibe Coding" w/ Marianne Bellotti
Is AI the silver bullet for modernizing our aging software systems, or is it a fast track to creating the next generation of unmaintainable "slopware"?
In this episode, I sit down with Marianne Bellotti, author of the amazing book "Kill It With Fire," to discuss the complex reality of legacy system modernization in the age of AI. We explore why understanding the cultural and human history of a codebase is critical, and how the current AI hype cycle isn't a silver bullet for legacy IT modernization efforts.
Marianne breaks down a recent disastrous "vibe coding" experiment, the risk of replacing simple human errors with catastrophic automated ones, and the massive disconnect between the promises of AI agents and the daily reality of a practitioner just trying to get a service account from IT.
Join us for a pragmatic and no-BS conversation about the real challenges in software, the practical ways to leverage LLMs as an expert partner, and why good old-fashioned systems thinking is more important than ever.
Find Marianne Bellotti:
Socials: @Bellmar
Website: https://belladotte.tech/
Book, "Kill It With Fire": https://nostarch.com/kill-it-fire
I had an interesting conversation yesterday with a young gentleman upgrading my Google Fiber. While he was originally pursuing a career as a software developer, he and his friends decided against it after seeing the progress of ChatGPT over the last couple of years.
As a father of two teenage boys, I often think about the nature of work, including whether writing code will be relevant for future generations. Here, I rant about at least part (not all) of what's on my mind. This is a big topic, and you'll see me ranting more about it.
Vijay Yadav - GenAI-Ready Data
Vijay Yadav (Director of Data Science at Merck) joins me to chat about a very interesting project he launched at Merck involving LLMs in production. A big part of this discussion is how to make data ready for generative AI.
This is a great example of an LLM-native use case in production, which is rare right now. Lots to learn from here. Enjoy!
LinkedIn: https://www.linkedin.com/in/vijay-yadav-ds/
Last week I talked about how good you have to be at your job. Yesterday's OpenAI announcement of its "reasoning" model, o1, got me thinking about how good AI needs to be to do our jobs.
Vinoo Ganesh - Strong Open Source Communities
Vinoo Ganesh is an open source enthusiast and contributor, and a data and ML engineer. We chat about strong open source communities, LLMs and AI, and much more.
Like most of you, I spent last weekend and the earlier part of this week following the OpenAI drama. Plot twists galore! Every minute seemed like a new adventure. In the midst of the plot twists and turns, I noticed quite a few people saying, "This was all predictable", and then offering prognostications, most of which turned out to be very wrong. If even the OpenAI insiders couldn't figure out what was going on, how would the person on the street?
It's a good reminder that you need to approach the world with a sense of humility and try not to be a know-it-all. Don't be afraid to say "I don't know."
Matt Sharp & Chris Brousseau - Writing "LLMs in Production" (the midway edition)
Matt Sharp and Chris Brousseau join me to chat about writing their new book "LLMs in Production" (Manning). What's it like to write a book in a field that's changing at light speed? How do two people write a book together? We dive into this and much more.
Note - we recorded this outside at the Utah State Capitol. There's a bit of background noise, but it hopefully doesn't distract from the conversation. It was too nice of a day to be stuck inside :)
Michel Tricot - The Impact of AI on the Modern Data Stack
Michel Tricot (CEO of Airbyte) joins me to chat about the impact of AI on the modern data stack, ETL for AI, the challenges of moving from open source to a paid product, and much more.
Airbyte & Pinecone - https://airbyte.com/tutorials/chat-with-your-data-using-openai-pinecone-airbyte-and-langchain
Note from Joe - I had audio issues cuz I got a new computer and didn't use the correct mic :(
Juan Sequeda - The Power of Knowledge Graphs and LLMs on Structured Data in the Enterprise
Juan Sequeda and I chat about knowledge graphs (he's an OG in this area), the potential of LLMs on structured datasets, and much more. This is an honest, no-BS chat about the transition from a data-first world to a knowledge-first world. Enjoy!
LinkedIn: https://www.linkedin.com/in/juansequeda/
data.world: https://data.world/product/
website: https://www.juansequeda.com/
Whenever Kevin and I get together, we "nerd snipe" each other. This conversation is no different, and it's a wide-ranging conversation about how the data landscape evolves alongside LLMs, education, startup mentorship, and the possible (looming?) startup mass extinction.
Kevin's LinkedIn: https://www.linkedin.com/in/kevinzenghu/
Metaplane: https://metaplane.dev/
Paul Blankley & Ryan Janssen (Zenlytic) - How LLMs will Change Data and Analytics
Paul Blankley and Ryan Janssen are the co-founders of Zenlytic. They started a BI company with an LLM-first approach (back before LLMs were insanely cool). We talk about the future of BI, and how LLMs will change the face of data and analytics.
Zenlytic: https://www.zenlytic.com/
Paul's LinkedIn: https://www.linkedin.com/in/paulblankley/
Ryan's LinkedIn: https://www.linkedin.com/in/janssenryan/
If you like this show, give it a 5-star rating on your favorite podcast platform.
Purchase Fundamentals of Data Engineering at your favorite bookseller.
Subscribe to my Substack: https://joereis.substack.com/
ChatGPT was the iPhone moment for AI, and things are moving insanely quickly. What do generative AI models mean for us, especially children, who are arguably the last of the Pre-AI generation? I dive into some thoughts this week about how we need to work alongside the machines, the impact of generative AI on kids, and so on. Buckle up. We are in for a very interesting next few years as we sort out where AI fits into our day-to-day lives.
#data #datascience #dataengineering #chatgpt #ai