talk-data.com
People (153 results)
See all 153 →Activities & events
| Title & Speakers | Event |
|---|---|
|
Foundations of Analytics Engineer Role: Skills, Scope, and Modern Practices
2025-12-15 · 11:30
During this session, we’ll take a practical look at the Analytics Engineer role: what it actually covers, how it fits into modern data teams, and which skills matter most. Rather than a step-by-step tutorial, this talk focuses on core concepts, real examples, and recurring patterns that define the work of Analytics Engineers today. We’ll cover:
The talk draws on key insights from Fundamentals of Analytics Engineering, with Juan sharing lessons learned while writing the book and working with data teams. By the end, you’ll have a grounded view of why the Analytics Engineer role exists, how it has evolved, and which capabilities are worth prioritizing if you want to advance in this career. About the speaker: Juan Manuel Perafan is an analytics engineer, educator, and community builder based in Utrecht. He’s the co-author of Fundamentals of Analytics Engineering, host of the SQL Lingua Franca podcast, and a dbt Community Award winner. Juan founded the Analytics Engineering Meetup Netherlands and the Dutch dbt Meetup, and has spoken at events like dbt Coalesce, Linux Foundation OS Summit, and Big Data Expo NL. **Join our slack: https://datatalks.club/slack.html** |
Foundations of Analytics Engineer Role: Skills, Scope, and Modern Practices
|
|
Being a founding engineer at an AI startup
2025-12-03 · 16:28
Gergely Orosz
– host
,
Michelle Lim
– Engineer
@ Warp
Brought to You By: • Statsig — The unified platform for flags, analytics, experiments, and more. • Linear — The system for modern product development. — Michelle Lim joined Warp as engineer number one and is now building her own startup, Flint. She brings a strong product-first mindset shaped by her time at Facebook, Slack, Robinhood, and Warp. Michelle shares why she chose Warp over safer offers, how she evaluates early-stage opportunities, and what she believes distinguishes great founding engineers. Together, we cover how product-first engineers create value, why negotiating equity at early-stage startups requires a different approach, and why asking founders for references is a smart move. Michelle also shares lessons from building consumer and infrastructure products, how she thinks about tech stack choices, and how engineers can increase their impact by taking on work outside their job descriptions. If you want to understand what founders look for in early engineers or how to grow into a founding-engineer role, this episode is full of practical advice backed by real examples — Timestamps (00:00) Intro (01:32) How Michelle got into software engineering (03:30) Michelle’s internships (06:19) Learnings from Slack (08:48) Product learnings at Robinhood (12:47) Joining Warp as engineer #1 (22:01) Negotiating equity (26:04) Asking founders for references (27:36) The top reference questions to ask (32:53) The evolution of Warp’s tech stack (35:38) Product-first engineering vs. code-first (38:27) Hiring product-first engineers (41:49) Different types of founding engineers (44:42) How Flint uses AI tools (45:31) Avoiding getting burned in founder exits (49:26) Hiring top talent (50:15) An overview of Flint (56:08) Advice for aspiring founding engineers (1:01:05) Rapid fire round — The Pragmatic Engineer deepdives relevant for this episode: • Thriving as a founding engineer: lessons from the trenches • From software engineer to AI engineer • AI Engineering in the real world • The AI Engineering stack — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected]. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe |
The Pragmatic Engineer |
|
The Great Data Engineering Reset: From Pipelines to Agents and Beyond
2025-11-28 · 15:00
❄️ DSF WinterFest 2025: Global Online Summit ❄️ Join the global data celebration! Monday 24th to Friday 28th November 2025 Online \| 2-3 sessions per day \| Theme: Innovating with Data DSF WinterFest is back, and this year, it’s going global! Join our 50,000-strong community for a week of world-class talks, tutorials, and panels exploring how data, AI, and analytics are reshaping the world. Expect inspiring content, expert insights, and the cosy, welcoming DSF atmosphere we are known for, all from the comfort of your own space! Why join? 🌍 A global stage with speakers and attendees from every corner of the world 🎟️ One ticket for the full week. Register once and access every session 💻 Easy access from anywhere. Join live or catch replays in your own time ☕ Cosy community vibe. No travel, no stress, just data and connection 🎟️ Tickets: Choose your experience and secure your spot today: Free Pass - Watch live and enjoy replays until 30 November 2025 Upgrade at Checkout - Get extended replay access until May 2026 Register on our website to receive your joining links, add sessions to your calendar, and tune in live from anywhere in the world. Please note: Clicking “Attend” on Meetup does not register you for this summit. You must register via our website to receive your links. 🎁 Competition: We’re spreading festive cheer! One lucky attendee will win a £300 Amazon gift voucher (or equivalent in your currency). Find out more here. ❄️❄️❄️ Session details: 💡 The Great Data Engineering Reset: From Pipelines to Agents and Beyond 🗓️ Friday 28th November ⏰ 15:00 PM GMT 🗣️ Joe Reis, Data Engineer and Architect For years, data engineering was a story of predictable pipelines: move data from point A to point B. But AI just hit the reset button on our entire field. Now, we're all staring into the void, wondering what's next. While the fundamentals remain unchanged, data continues to pose challenges in traditional areas such as security, data governance, management, and modeling. Everything else is up for grabs. This talk will cut through the noise and explore the future of data engineering in an AI-driven world. We'll discuss how our focus must shift from building dashboards and analytics to architecting for automated action. The reset button has been pushed. It's time for us to invent the future of our industry. Joe Reis, a "recovering data scientist" with 20 years in the data industry, is the co-author of the best-selling O'Reilly book, "Fundamentals of Data Engineering." He’s also the instructor for the wildly popular Data Engineering Professional Certificate on Coursera, in partnership with DeepLearning.ai and AWS. Joe’s extensive experience encompasses data engineering\, data architecture\, machine learning\, and more. He regularly keynotes major data conferences globally\, advises and invests in innovative data product companies\, writes at [Practical Data Modeling](https://practicaldatamodeling.substack.com/) and his [personal blog](https://joereis.substack.com/)\, and hosts the popular data podcast "[The Joe Reis Show.](https://open.spotify.com/show/3mcKitYGS4VMG2eHd2PfDN?si=3d3fde23fc1c4a33)" In his free time\, Joe is dedicated to writing new books and articles and thinking of ways to advance the data industry. ❄️❄️❄️ 🔗 How to join: Once registered, you’ll receive your unique joining link by email, plus handy reminders one week, one day, and one hour before each session. Don't forget to add the sessions you are attending to your calendar. If you can’t make it live, don’t worry, your ticket includes replay access until 30 November 2025 (or May 2026 with the upgrade). 📘 Reminders: Time zones: All sessions are listed in GMT - please check your local time when registering. Recordings: Access replays until 30 November 2025 with a free pass, or until May 2026 with an upgraded ticket Please note: Clicking “Attend” on Meetup does not register you for this summit. You must register via our website to receive your links. Join the Celebration ❄️ Five days. Global speakers. Cutting-edge insights. Free to join live - replays included. Upgrade for extended access. Register now and be part of the global data community shaping the future. #DSFWinterFest |
The Great Data Engineering Reset: From Pipelines to Agents and Beyond
|
|
Code security for software engineers
2025-11-26 · 16:45
Johannes Dahse
– VP of Code Security
@ Sonar
,
Gergely Orosz
– host
Brought to You By: • Statsig — The unified platform for flags, analytics, experiments, and more. Statsig are helping make the first-ever Pragmatic Summit a reality. Join me and 400 other top engineers and leaders on 11 February, in San Francisco for a special one-day event. Reserve your spot here. • Linear — The system for modern product development. Engineering teams today move much faster, thanks to AI. Because of this, coordination increasingly becomes a problem. This is where Linear helps fast-moving teams stay focused. Check out Linear. — As software engineers, what should we know about writing secure code? Johannes Dahse is the VP of Code Security at Sonar and a security expert with 20 years of industry experience. In today’s episode of The Pragmatic Engineer, he joins me to talk about what security teams actually do, what developers should own, and where real-world risk enters modern codebases. We cover dependency risk, software composition analysis, CVEs, dynamic testing, and how everyday development practices affect security outcomes. Johannes also explains where AI meaningfully helps, where it introduces new failure modes, and why understanding the code you write and ship remains the most reliable defense. If you build and ship software, this episode is a practical guide to thinking about code security under real-world engineering constraints. — Timestamps (00:00) Intro (02:31) What is penetration testing? (06:23) Who owns code security: devs or security teams? (14:42) What is code security? (17:10) Code security basics for devs (21:35) Advanced security challenges (24:36) SCA testing (25:26) The CVE Program (29:39) The State of Code Security report (32:02) Code quality vs security (35:20) Dev machines as a security vulnerability (37:29) Common security tools (42:50) Dynamic security tools (45:01) AI security reviews: what are the limits? (47:51) AI-generated code risks (49:21) More code: more vulnerabilities (51:44) AI’s impact on code security (58:32) Common misconceptions of the security industry (1:03:05) When is security “good enough?” (1:05:40) Johannes’s favorite programming language — The Pragmatic Engineer deepdives relevant for this episode: • What is Security Engineering? • Mishandled security vulnerability in Next.js • Okta Schooled on Its Security Practices — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected]. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe |
The Pragmatic Engineer |
|
Building a multimodal lakehouse for AI (w/ Chang She)
2025-11-23 · 14:56
Chang She
– CEO
@ LanceDB
,
Tristan Handy
– CEO
@ dbt Labs
In this episode, Tristan Handy sits down with Chang She — a co-creator of Pandas and now CEO of LanceDB — to explore the convergence of analytics and AI engineering. The team at LanceDB is rebuilding the data lake from the ground up with AI as a first principle, starting with a new AI-native file format called Lance. Tristan traces Chang's journey as one of the original contributors to the pandas library to building a new infrastructure layer for AI-native data. Learn why vector databases alone aren't enough, why agents require new architecture, and how LanceDB is building a AI lakehouse for the future. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs. |
The Analytics Engineering Podcast |
|
How AI will change software engineering – with Martin Fowler
2025-11-19 · 17:09
Gergely Orosz
– host
,
Martin Fowler
– Chief Scientist
@ Thoughtworks
Brought to You By: • Statsig — The unified platform for flags, analytics, experiments, and more. AI-accelerated development isn’t just about shipping faster: it’s about measuring whether, what you ship, actually delivers value. This is where modern experimentation with Statsig comes in. Check it out. • Linear — The system for modern product development. I had a jaw-dropping experience when I dropped in for the weekly “Quality Wednesdays” meeting at Linear. Every week, every dev fixes at least one quality isse, large or small. Even if it’s one pixel misalignment, like this one. I’ve yet to see a team obsess this much about quality. Read more about how Linear does Quality Wednesdays – it’s fascinating! — Martin Fowler is one of the most influential people within software architecture, and the broader tech industry. He is the Chief Scientist at Thoughtworks and the author of Refactoring and Patterns of Enterprise Application Architecture, and several other books. He has spent decades shaping how engineers think about design, architecture, and process, and regularly publishes on his blog, MartinFowler.com. In this episode, we discuss how AI is changing software development: the shift from deterministic to non-deterministic coding; where generative models help with legacy code; and the narrow but useful cases for vibe coding. Martin explains why LLM output must be tested rigorously, why refactoring is more important than ever, and how combining AI tools with deterministic techniques may be what engineering teams need. We also revisit the origins of the Agile Manifesto and talk about why, despite rapid changes in tooling and workflows, the skills that make a great engineer remain largely unchanged. — Timestamps (00:00) Intro (01:50) How Martin got into software engineering (07:48) Joining Thoughtworks (10:07) The Thoughtworks Technology Radar (16:45) From Assembly to high-level languages (25:08) Non-determinism (33:38) Vibe coding (39:22) StackOverflow vs. coding with AI (43:25) Importance of testing with LLMs (50:45) LLMs for enterprise software (56:38) Why Martin wrote Refactoring (1:02:15) Why refactoring is so relevant today (1:06:10) Using LLMs with deterministic tools (1:07:36) Patterns of Enterprise Application Architecture (1:18:26) The Agile Manifesto (1:28:35) How Martin learns about AI (1:34:58) Advice for junior engineers (1:37:44) The state of the tech industry today (1:42:40) Rapid fire round — The Pragmatic Engineer deepdives relevant for this episode: • Vibe coding as a software engineer • The AI Engineering stack • AI Engineering in the real world • What changed in 50 years of computing — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected]. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe |
|
|
Netflix’s Engineering Culture
2025-11-12 · 17:40
Elizabeth Stone
– Chief Technology Officer
@ Netflix
,
Gergely Orosz
– host
Brought to You By: • Statsig — The unified platform for flags, analytics, experiments, and more. Statsig enables two cultures at once: continuous shipping and experimentation. Companies like Notion went from single-digit experiments per quarter to over 300 experiments with Statsig. Start using Statsig with a generous free tier, and a $50K startup program. • Linear — The system for modern product development. When most companies hit real scale, they start to slow down, and are faced with “process debt.” This often hits software engineers the most. Companies switch to Linear to hit a hard reset on this process debt – ones like Scale cut their bug resolution in half after the switch. Check out Linear’s migration guide for details. — What’s it like to work as a software engineer inside one of the world’s biggest streaming companies? In this special episode recorded at Netflix’s headquarters in Los Gatos, I sit down with Elizabeth Stone, Netflix’s Chief Technology Officer. Before becoming CTO, Elizabeth led data and insights at Netflix and was VP of Science at Lyft. She brings a rare mix of technical depth, product thinking, and people leadership. We discuss what it means to be “unusually responsible” at Netflix, how engineers make decisions without layers of approval, and how the company balances autonomy with guardrails for high-stakes projects like Netflix Live. Elizabeth shares how teams self-reflect and learn from outages and failures, why Netflix doesn’t do formal performance reviews, and what new grads bring to a company known for hiring experienced engineers. This episode offers a rare inside look at how Netflix engineers build, learn, and lead at a global scale. — Timestamps (00:00) Intro (01:44) The scale of Netflix (03:31) Production software stack (05:20) Engineering challenges in production (06:38) How the Open Connect delivery network works (08:30) From pitch to play (11:31) How Netflix enables engineers to make decisions (13:26) Building Netflix Live for global sports (16:25) Learnings from Paul vs. Tyson for NFL Live (17:47) Inside the control room (20:35) What being unusually responsible looks like (24:15) Balancing team autonomy with guardrails for Live (30:55) The high talent bar and introduction of levels at Netflix (36:01) The Keeper Test (41:27) Why engineers leave or stay (44:27) How AI tools are used at Netflix (47:54) AI’s highest-impact use cases (50:20) What new grads add and why senior talent still matters (53:25) Open source at Netflix (57:07) Elizabeth’s parting advice for new engineers to succeed at Netflix — The Pragmatic Engineer deepdives relevant for this episode: • The end of the senior-only level at Netflix • Netflix revamps its compensation philosophy • Live streaming at world-record scale with Ashutosh Agrawal • Shipping to production • What is good software architecture? — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected]. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe |
|
|
The AI Data Paradox: High Trust in Models, Low Trust in Data
2025-11-09 · 23:53
Ariel Pohoryles
– guest
@ Rivery
,
Tobias Macey
– host
Summary In this episode of the Data Engineering Podcast Ariel Pohoryles, head of product marketing for Boomi's data management offerings, talks about a recent survey of 300 data leaders on how organizations are investing in data to scale AI. He shares a paradox uncovered in the research: while 77% of leaders trust the data feeding their AI systems, only 50% trust their organization's data overall. Ariel explains why truly productionizing AI demands broader, continuously refreshed data with stronger automation and governance, and highlights the challenges posed by unstructured data and vector stores. The conversation covers the need to shift from manual reviews to automated pipelines, the resurgence of metadata and master data management, and the importance of guardrails, traceability, and agent governance. Ariel also predicts a growing convergence between data teams and application integration teams and advises leaders to focus on high-value use cases, aggressive pipeline automation, and cataloging and governing the coming sprawl of AI agents, all while using AI to accelerate data engineering itself. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data managementData teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.Your host is Tobias Macey and today I'm interviewing Ariel Pohoryles about data management investments that organizations are making to enable them to scale AI implementationsInterview IntroductionHow did you get involved in the area of data management?Can you start by describing the motivation and scope of your recent survey on data management investments for AI across your respondents?What are the key takeaways that were most significant to you?The survey reveals a fascinating paradox: 77% of leaders trust the data used by their AI systems, yet only half trust their organization's overall data quality. For our data engineering audience, what does this suggest about how companies are currently sourcing data for AI? Does it imply they are using narrow, manually-curated "golden datasets," and what are the technical challenges and risks of that approach as they try to scale?The report highlights a heavy reliance on manual data quality processes, with one expert noting companies feel it's "not reliable to fully automate validation" for external or customer data. At the same time, maturity in "Automated tools for data integration and cleansing" is low, at only 42%. What specific technical hurdles or organizational inertia are preventing teams from adopting more automation in their data quality and integration pipelines?There was a significant point made that with generative AI, "biases can scale much faster," making automated governance essential. From a data engineering perspective, how does the data management strategy need to evolve to support generative AI versus traditional ML models? What new types of data quality checks, lineage tracking, or monitoring for feedback loops are required when the model itself is generating new content based on its own outputs?The report champions a "centralized data management platform" as the "connective tissue" for reliable AI. How do you see the scale and data maturity impacting the realities of that effort?How do architectural patterns in the shape of cloud warehouses, lakehouses, data mesh, data products, etc. factor into that need for centralized/unified platforms?A surprising finding was that a third of respondents have not fully grasped the risk of significant inaccuracies in their AI models if they fail to prioritize data management. In your experience, what are the biggest blind spots for data and analytics leaders?Looking at the maturity charts, companies rate themselves highly on "Developing a data management strategy" (65%) but lag significantly in areas like "Automated tools for data integration and cleansing" (42%) and "Conducting bias-detection audits" (24%). If you were advising a data engineering team lead based on these findings, what would you tell them to prioritize in the next 6-12 months to bridge the gap between strategy and a truly scalable, trustworthy data foundation for AI?The report states that 83% of companies expect to integrate more data sources for their AI in the next year. For a data engineer on the ground, what is the most important capability they need to build into their platform to handle this influx?What are the most interesting, innovative, or unexpected ways that you have seen teams addressing the new and accelerated data needs for AI applications?What are some of the noteworthy trends or predictions that you have for the near-term future of the impact that AI is having or will have on data teams and systems?Contact Info LinkedInParting Question From your perspective, what is the biggest gap in the tooling or technology for data management today?Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.Links BoomiData ManagementIntegration & Automation DemoAgentstudioData Connector Agent WebinarSurvey ResultsData GovernanceShadow ITPodcast EpisodeThe intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
Data Engineering Podcast |
|
From Swift to Mojo and high-performance AI Engineering with Chris Lattner
2025-11-05 · 16:00
Chris Lattner
– guest
,
Gergely Orosz
– host
Brought to You By: • Statsig — The unified platform for flags, analytics, experiments, and more. Companies like Graphite, Notion, and Brex rely on Statsig to measure the impact of the pace they ship. Get a 30-day enterprise trial here. • Linear – The system for modern product development. Linear is a heavy user of Swift: they just redesigned their native iOS app using their own take on Apple’s Liquid Glass design language. The new app is about speed and performance – just like Linear is. Check it out. — Chris Lattner is one of the most influential engineers of the past two decades. He created the LLVM compiler infrastructure and the Swift programming language – and Swift opened iOS development to a broader group of engineers. With Mojo, he’s now aiming to do the same for AI, by lowering the barrier to programming AI applications. I sat down with Chris in San Francisco, to talk language design, lessons on designing Swift and Mojo, and – of course! – compilers. It’s hard to find someone who is as enthusiastic and knowledgeable about compilers as Chris is! We also discussed why experts often resist change even when current tools slow them down, what he learned about AI and hardware from his time across both large and small engineering teams, and why compiler engineering remains one of the best ways to understand how software really works. — Timestamps (00:00) Intro (02:35) Compilers in the early 2000s (04:48) Why Chris built LLVM (08:24) GCC vs. LLVM (09:47) LLVM at Apple (19:25) How Chris got support to go open source at Apple (20:28) The story of Swift (24:32) The process for designing a language (31:00) Learnings from launching Swift (35:48) Swift Playgrounds: making coding accessible (40:23) What Swift solved and the technical debt it created (47:28) AI learnings from Google and Tesla (51:23) SiFive: learning about hardware engineering (52:24) Mojo’s origin story (57:15) Modular’s bet on a two-level stack (1:01:49) Compiler shortcomings (1:09:11) Getting started with Mojo (1:15:44) How big is Modular, as a company? (1:19:00) AI coding tools the Modular team uses (1:22:59) What kind of software engineers Modular hires (1:25:22) A programming language for LLMs? No thanks (1:29:06) Why you should study and understand compilers — The Pragmatic Engineer deepdives relevant for this episode: • AI Engineering in the real world • The AI Engineering stack • Uber's crazy YOLO app rewrite, from the front seat • Python, Go, Rust, TypeScript and AI with Armin Ronacher • Microsoft’s developer tools roots — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected]. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe |
|
|
Beyond Vibe Coding with Addy Osmani
2025-10-29 · 16:36
Gergely Orosz
– host
,
Addy Osmani
– Head of Chrome Developer Experience
@ Google
Brought to You By: • Statsig — The unified platform for flags, analytics, experiments, and more. • Linear – The system for modern product development. — Addy Osmani is Head of Chrome Developer Experience at Google, where he leads teams focused on improving performance, tooling, and the overall developer experience for building on the web. If you’ve ever opened Chrome’s Developer Tools bar, you’ve definitely used features Addy has built. He’s also the author of several books, including his latest, Beyond Vibe Coding, which explores how AI is changing software development. In this episode of The Pragmatic Engineer, I sit down with Addy to discuss how AI is reshaping software engineering workflows, the tradeoffs between speed and quality, and why understanding generated code remains critical. We dive into his article The 70% Problem, which explains why AI tools accelerate development but struggle with the final 30% of software quality—and why this last 30% is tackled easily by software engineers who understand how the system actually works. — Timestamps (00:00) Intro (02:17) Vibe coding vs. AI-assisted engineering (06:07) How Addy uses AI tools (13:10) Addy’s learnings about applying AI for development (18:47) Addy’s favorite tools (22:15) The 70% Problem (28:15) Tactics for efficient LLM usage (32:58) How AI tools evolved (34:29) The case for keeping expectations low and control high (38:05) Autonomous agents and working with them (42:49) How the EM and PM role changes with AI (47:14) The rise of new roles and shifts in developer education (48:11) The importance of critical thinking when working with AI (54:08) LLMs as a tool for learning (1:03:50) Rapid questions — The Pragmatic Engineer deepdives relevant for this episode: • Vibe Coding as a software engineer • How AI-assisted coding will change software engineering: hard truths • AI Engineering in the real world • The AI Engineering stack • How Claude Code is built — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected]. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe |
|
|
The True Costs of Legacy Systems: Technical Debt, Risk, and Exit Strategies
2025-10-18 · 22:35
Kate Shaw
– Senior Product Manager for Data and SLIM
@ SnapLogic
,
Tobias Macey
– host
Summary In this episode Kate Shaw, Senior Product Manager for Data and SLIM at SnapLogic, talks about the hidden and compounding costs of maintaining legacy systems—and practical strategies for modernization. She unpacks how “legacy” is less about age and more about when a system becomes a risk: blocking innovation, consuming excess IT time, and creating opportunity costs. Kate explores technical debt, vendor lock-in, lost context from employee turnover, and the slippery notion of “if it ain’t broke,” especially when data correctness and lineage are unclear. Shee digs into governance, observability, and data quality as foundations for trustworthy analytics and AI, and why exit strategies for system retirement should be planned from day one. The discussion covers composable architectures to avoid monoliths and big-bang migrations, how to bridge valuable systems into AI initiatives without lock-in, and why clear success criteria matter for AI projects. Kate shares lessons from the field on discovery, documentation gaps, parallel run strategies, and using integration as the connective tissue to unlock data for modern, cloud-native and AI-enabled use cases. She closes with guidance on planning migrations, defining measurable outcomes, ensuring lineage and compliance, and building for swap-ability so teams can evolve systems incrementally instead of living with a “bowl of spaghetti.” Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data managementData teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.Your host is Tobias Macey and today I'm interviewing Kate Shaw about the true costs of maintaining legacy systemsInterview IntroductionHow did you get involved in the area of data management?What are your crtieria for when a given system or service transitions to being "legacy"?In order for any service to survive long enough to become "legacy" it must be serving its purpose and providing value. What are the common factors that prompt teams to deprecate or migrate systems?What are the sources of monetary cost related to maintaining legacy systems while they remain operational?Beyond monetary cost, economics also have a concept of "opportunity cost". What are some of the ways that manifests in data teams who are maintaining or migrating from legacy systems?How does that loss of productivity impact the broader organization?How does the process of migration contribute to issues around data accuracy, reliability, etc. as well as contributing to potential compromises of security and compliance?Once a system has been replaced, it needs to be retired. What are some of the costs associated with removing a system from service?What are the most interesting, innovative, or unexpected ways that you have seen teams address the costs of legacy systems and their retirement?What are the most interesting, unexpected, or challenging lessons that you have learned while working on legacy systems migration?When is deprecation/migration the wrong choice?How have evolutionary architecture patterns helped to mitigate the costs of system retirement?Contact Info LinkedInParting Question From your perspective, what is the biggest gap in the tooling or technology for data management today?Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.Links SnapLogicSLIM == SnapLogic Intelligent ModernizerOpportunity CostSunk Cost FallacyData GovernanceEvolutionary ArchitectureThe intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
Data Engineering Podcast |
|
Google’s engineering culture
2025-10-15 · 20:32
Gergely Orosz
– host
,
Elin Nilsson
– tech industry researcher
@ The Pragmatic Engineer
Brought to You By: • Statsig — The unified platform for flags, analytics, experiments, and more. Something interesting is happening with the latest generation of tech giants. Rather than building advanced experimentation tools themselves, companies like Anthropic, Figma, Notion and a bunch of others… are just using Statsig. Statsig has rebuilt this entire suite of data tools that was available at maybe 10 or 15 giants until now. Check out Statsig. • Linear – The system for modern product development. Linear is just so fast to use – and it enables velocity in product workflows. Companies like Perplexity and OpenAI have already switched over, because simplicity scales. Go ahead and check out Linear and see why it feels like a breeze to use. — What is it really like to be an engineer at Google? In this special deep dive episode, we unpack how engineering at Google actually works. We spent months researching the engineering culture of the search giant, and talked with 20+ current and former Googlers to bring you this deepdive with Elin Nilsson, tech industry researcher for The Pragmatic Engineer and a former Google intern. Google has always been an engineering-driven organization. We talk about its custom stack and tools, the design-doc culture, and the performance and promotion systems that define career growth. We also explore the culture that feels built for engineers: generous perks, a surprisingly light on-call setup often considered the best in the industry, and a deep focus on solving technical problems at scale. If you are thinking about applying to Google or are curious about how the company’s engineering culture has evolved, this episode takes a clear look at what it was like to work at Google in the past versus today, and who is a good fit for today’s Google. Jump to interesting parts: (13:50) Tech stack (1:05:08) Performance reviews (GRAD) (2:07:03) The culture of continuously rewriting things — Timestamps (00:00) Intro (01:44) Stats about Google (11:41) The shared culture across Google (13:50) Tech stack (34:33) Internal developer tools and monorepo (43:17) The downsides of having so many internal tools at Google (45:29) Perks (55:37) Engineering roles (1:02:32) Levels at Google (1:05:08) Performance reviews (GRAD) (1:13:05) Readability (1:16:18) Promotions (1:25:46) Design docs (1:32:30) OKRs (1:44:43) Googlers, Nooglers, ReGooglers (1:57:27) Google Cloud (2:03:49) Internal transfers (2:07:03) Rewrites (2:10:19) Open source (2:14:57) Culture shift (2:31:10) Making the most of Google, as an engineer (2:39:25) Landing a job at Google — The Pragmatic Engineer deepdives relevant for this episode: • Inside Google’s engineering culture • Oncall at Google • Performance calibrations at tech companies • Promotions and tooling at Google • How Kubernetes is built • The man behind the Big Tech comics: Google cartoonist Manu Cornet — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected]. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe |
The Pragmatic Engineer |
|
Context Engineering as a Discipline: Building Governed AI Analytics
2025-10-11 · 21:36
Nick Schrock
– guest
,
Tobias Macey
– host
Summary In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Nick Schrock, CTO and founder of Dagster Labs, to discuss Compass - a Slack-native, agentic analytics system designed to keep data teams connected with business stakeholders. Nick shares his journey from initial skepticism to embracing agentic AI as model and application advancements made it practical for governed workflows, and explores how Compass redefines the relationship between data teams and stakeholders by shifting analysts into steward roles, capturing and governing context, and integrating with Slack where collaboration already happens. The conversation covers organizational observability through Compass's conversational system of record, cost control strategies, and the implications of agentic collaboration on Conway's Law, as well as what's next for Compass and Nick's optimistic views on AI-accelerated software engineering. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data managementData teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details. Your host is Tobias Macey and today I'm interviewing Nick Schrock about building an AI analyst that keeps data teams in the loopInterview IntroductionHow did you get involved in the area of data management?Can you describe what Compass is and the story behind it?context repository structurehow to keep it relevant/avoid sprawl/duplicationproviding guardrailshow does a tool like Compass help provide feedback/insights back to the data teams?preparing the data warehouse for effective introspection by the AILLM selectioncost managementcaching/materializing ad-hoc queriesWhy Slack and enterprise chat are important to b2b softwareHow AI is changing stakeholder relationshipsHow not to overpromise AI capabilities How does Compass relate to BI?How does Compass relate to Dagster and Data Infrastructure?What are the most interesting, innovative, or unexpected ways that you have seen Compass used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on Compass?When is Compass the wrong choice?What do you have planned for the future of Compass?Contact Info LinkedInParting Question From your perspective, what is the biggest gap in the tooling or technology for data management today?Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.Links DagsterDagster LabsDagster PlusDagster CompassChris Bergh DataOps EpisodeRise of Medium Code blog postContext EngineeringData StewardInformation ArchitectureConway's LawTemporal durable execution frameworkThe intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
|
|
The Data Model That Captures Your Business: Metric Trees Explained
2025-10-05 · 23:59
Vijay Subramanian
– Founder and CEO
@ Trace
,
Tobias Macey
– host
Summary In this episode of the Data Engineering Podcast Vijay Subramanian, founder and CEO of Trace, talks about metric trees - a new approach to data modeling that directly captures a company's business model. Vijay shares insights from his decade-long experience building data practices at Rent the Runway and explains how the modern data stack has led to a proliferation of dashboards without a coherent way for business consumers to reason about cause, effect, and action. He explores how metric trees differ from and interoperate with other data modeling approaches, serve as a backend for analytical workflows, and provide concrete examples like modeling Uber's revenue drivers and customer journeys. Vijay also discusses the potential of AI agents operating on metric trees to execute workflows, organizational patterns for defining inputs and outputs with business teams, and a vision for analytics that becomes invisible infrastructure embedded in everyday decisions. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data managementData teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.Your host is Tobias Macey and today I'm interviewing Vijay Subramanian about metric trees and how they empower more effective and adaptive analyticsInterview IntroductionHow did you get involved in the area of data management?Can you describe what metric trees are and their purpose?How do metric trees relate to metric/semantic layers?What are the shortcomings of existing data modeling frameworks that prevent effective use of those assets?How do metric trees build on top of existing investments in dimensional data models?What are some strategies for engaging with the business to identify metrics and their relationships?What are your recommendations for storage, representation, and retrieval of metric trees?How do metric trees fit into the overall lifecycle of organizational data workflows?When creating any new data asset it introduces overhead of maintenance, monitoring, and evolution. How do metric trees fit into existing testing and validation frameworks that teams rely on for dimensional modeling?What are some of the key differences in useful evaluation/testing that teams need to develop for metric trees?How do metric trees assist in context engineering for AI-powered self-serve access to organizational data?What are the most interesting, innovative, or unexpected ways that you have seen metric trees used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on metric trees and operationalizing them at Trace?When is a metric tree the wrong abstraction?What do you have planned for the future of Trace and applications of metric trees?Contact Info LinkedInParting Question From your perspective, what is the biggest gap in the tooling or technology for data management today?Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.Links Metric TreeTraceModern Data StackHadoopVerticaLuigidbtRalph KimballBill InmonMetric LayerDimensional Data WarehouseMaster Data ManagementData GovernanceFinancial P&L (Profit and Loss)EBITDA ==Earnings before interest, taxes, depreciation and amortizationThe intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
|
|
Hypergrowth startups: Uber and CloudKitchens with Charles-Axel Dein
2025-09-24 · 16:30
Gergely Orosz
– host
,
Charles-Axel Dein
– engineer
@ Uber
Brought to You By: • Statsig — The unified platform for flags, analytics, experiments, and more. Statsig built a complete set of data tools that allow engineering teams to measure the impact of their work. This toolkit is SO valuable to so many teams, that OpenAI - who was a huge user of Statsig - decided to acquire the company, the news announced last week. Talk about validation! Check out Statsig. • Linear – The system for modern product development. Here’s an interesting story: OpenAI switched to Linear as a way to establish a shared vocabulary between teams. Every project now follows the same lifecycle, uses the same labels, and moves through the same states. Try Linear for yourself. — What does it take to do well at a hyper-growth company? In this episode of The Pragmatic Engineer, I sit down with Charles-Axel Dein, one of the first engineers at Uber, who later hired me there. Since then, he’s gone on to work at CloudKitchens. He’s also been maintaining the popular Professional programming reading list GitHub repo for 15 years, where he collects articles that made him a better programmer. In our conversation, we dig into what it’s really like to work inside companies that grow rapidly in scale and headcount. Charles shares what he’s learned about personal productivity, project management, incidents, interviewing, plus how to build flexible skills that hold up in fast-moving environments. Jump to interesting parts: • 10:41 – the reality of working inside a hyperscale company • 41:10 – the traits of high-performing engineers • 1:03:31 – Charles’ advice for getting hired in today’s job market We also discuss: • How to spot the signs of hypergrowth (and when it’s slowing down) • What sets high-performing engineers apart beyond shipping • Charles’s personal productivity tips, favorite reads, and how he uses reading to uplevel his skills • Strategic tips for building your resume and interviewing • How imposter syndrome is normal, and how leaning into it helps you grow • And much more! If you’re at a fast-growing company, considering joining one, or looking to land your next role, you won’t want to miss this practical advice on hiring, interviewing, productivity, leadership, and career growth. — Timestamps (00:00) Intro (04:04) Early days at Uber as engineer #20 (08:12) CloudKitchens’ similarities with Uber (10:41) The reality of working at a hyperscale company (19:05) Tenancies and how Uber deployed new features (22:14) How CloudKitchens handles incidents (26:57) Hiring during fast-growth (34:09) Avoiding burnout (38:55) The popular Professional programming reading list repo (41:10) The traits of high-performing engineers (53:22) Project management tactics (1:03:31) How to get hired as a software engineer (1:12:26) How AI is changing hiring (1:19:26) Unexpected ways to thrive in fast-paced environments (1:20:45) Dealing with imposter syndrome (1:22:48) Book recommendations (1:27:26) The problem with survival bias (1:32:44) AI’s impact on software development (1:42:28) Rapid fire round — The Pragmatic Engineer deepdives relevant for this episode: • Software engineers leading projects • The Platform and Program split at Uber • Inside Uber’s move to the Cloud • How Uber built its observability platform • From Software Engineer to AI Engineer – with Janvi Kalra — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected]. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe |
The Pragmatic Engineer |
|
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
2025-09-12 · 19:00
Join us for day three in a series of virtual events to hear talks from experts on the latest developments at the intersection of Visual AI, Manufacturing and Robotics. Date and Time Sept 12 at 9 AM Pacific Location Virtual. Register for the Zoom! Towards Robotics Foundation Models that Can Reason In recent years, we have witnessed remarkable progress in generative AI, particularly in language and visual understanding and generation. This leap has been fueled by unprecedentedly large image–text datasets and the scaling of large language and vision models trained on them. Increasingly, these advances are being leveraged to equip and empower robots with open-world visual understanding and reasoning capabilities. Yet, despite these advances, scaling such models for robotics remains challenging due to the scarcity of large-scale, high-quality robot interaction data, limiting their ability to generalize and truly reason about actions in the real world. Nonetheless, promising results are emerging from using multimodal large language models (MLLMs) as the backbone of robotic systems, especially in enabling the acquisition of low-level skills required for robust deployment in everyday household settings. In this talk, I will present three recent works that aim to bridge the gap between rich semantic world knowledge in MLLMs and actionable robot control. I will begin with AHA, a vision-language model that reasons about failures in robotic manipulation and improves the robustness of existing systems. Building on this, I will introduce SAM2Act, a 3D generalist robotic model with a memory-centric architecture capable of performing high-precision manipulation tasks while retaining and reasoning over past observations. Finally, I will present MolmoAct, AI2’s flagship robotic foundation model for action reasoning, designed as a generalist system that can be post-trained for a wide range of downstream manipulation tasks. About the Speaker Jiafei Duan is a Ph.D. candidate in Computer Science & Engineering at the University of Washington, advised by Professors Dieter Fox and Ranjay Krishna. His research focuses on foundation models for robotics, with an emphasis on developing scalable data collection and generation methods, grounding vision-language models in robotic reasoning, and advancing robust generalization in robot learning. His work has been featured in MIT Technology Review, GreekWire, VentureBeat, and Business Wire. Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection In this talk, I will share our recent research efforts in visual industrial anomaly detection. It will present a comprehensive empirical analysis with a focus on real-world applications, demonstrating that recent SOTA methods perform worse than methods from 2021 when evaluated on a variety of datasets. We will also investigate how different practical aspects, such as input size, distribution shift, data contamination, and having a validation set, affect the results. About the Speaker Aimira Baitieva is a Research Engineer at Valeo, where she works primarily on computer vision problems. Her recent work has been focused on deep learning anomaly detection for automating visual inspection, incorporating both research and practical applications in the manufacturing sector. The Digital Reasoning Thread in Manufacturing: Orchestrating Vision, Simulation, and Robotics Manufacturing is entering a new phase where AI is no longer confined to isolated tasks like defect detection or predictive maintenance. Advances in reasoning AI, simulation, and robotics are converging to create end-to-end systems that can perceive, decide, and act – in both digital and physical environments. This talk introduces the Digital Reasoning Thread – a consistent layer of AI reasoning that runs through every stage of manufacturing, connecting visual intelligence, digital twins, simulation environments, and robotic execution. By linking perception with advanced reasoning and action, this approach enables faster, higher-quality decisions across the entire value chain. We will explore real-world examples of applying reasoning AI in industrial settings, combining simulation-driven analysis, orchestration frameworks, and the foundations needed for robotic execution in the physical world. Along the way, we will examine the key technical building blocks – from data pipelines and interoperability standards to agentic AI architectures – that make this level of integration possible. Attendees will gain a clear understanding of how to bridge AI-driven perception with simulation and robotics, and what it takes to move from isolated pilots to orchestrated, autonomous manufacturing systems. About the Speaker Vlad Larichev is an Industrial AI Lead at Accenture Industry X, specializing in applying AI, generative AI, and agentic AI to engineering, manufacturing, and large-scale industrial operations. With a background as an engineer, solution architect, and software developer, he has led AI initiatives across sectors including automotive, energy, and consumer goods, integrating advanced analytics, computer vision, and simulation into complex industrial environments. Vlad is the creator of the Digital Reasoning Thread – a framework for connecting AI reasoning across visual intelligence, simulation, and physical execution. He is an active public speaker, podcast host, and community builder, sharing practical insights on scaling AI from pilot projects to enterprise-wide adoption. The Road to Useful Robots This talk explores the current state of AI-enabled robots and the issues with deploying more advanced models on constrained hardware, including limited compute and power budgets. It then moves on to what's next for developing useful, intelligent robots. About the Speaker Michael Hart, also known as Mike Likes Robots. is a robotics software engineer and content creator. His mission is to share knowledge to accelerate robotics. @mikelikesrobots |
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
|
|
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
2025-09-12 · 19:00
Join us for day three in a series of virtual events to hear talks from experts on the latest developments at the intersection of Visual AI, Manufacturing and Robotics. Date and Time Sept 12 at 9 AM Pacific Location Virtual. Register for the Zoom! Towards Robotics Foundation Models that Can Reason In recent years, we have witnessed remarkable progress in generative AI, particularly in language and visual understanding and generation. This leap has been fueled by unprecedentedly large image–text datasets and the scaling of large language and vision models trained on them. Increasingly, these advances are being leveraged to equip and empower robots with open-world visual understanding and reasoning capabilities. Yet, despite these advances, scaling such models for robotics remains challenging due to the scarcity of large-scale, high-quality robot interaction data, limiting their ability to generalize and truly reason about actions in the real world. Nonetheless, promising results are emerging from using multimodal large language models (MLLMs) as the backbone of robotic systems, especially in enabling the acquisition of low-level skills required for robust deployment in everyday household settings. In this talk, I will present three recent works that aim to bridge the gap between rich semantic world knowledge in MLLMs and actionable robot control. I will begin with AHA, a vision-language model that reasons about failures in robotic manipulation and improves the robustness of existing systems. Building on this, I will introduce SAM2Act, a 3D generalist robotic model with a memory-centric architecture capable of performing high-precision manipulation tasks while retaining and reasoning over past observations. Finally, I will present MolmoAct, AI2’s flagship robotic foundation model for action reasoning, designed as a generalist system that can be post-trained for a wide range of downstream manipulation tasks. About the Speaker Jiafei Duan is a Ph.D. candidate in Computer Science & Engineering at the University of Washington, advised by Professors Dieter Fox and Ranjay Krishna. His research focuses on foundation models for robotics, with an emphasis on developing scalable data collection and generation methods, grounding vision-language models in robotic reasoning, and advancing robust generalization in robot learning. His work has been featured in MIT Technology Review, GreekWire, VentureBeat, and Business Wire. Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection In this talk, I will share our recent research efforts in visual industrial anomaly detection. It will present a comprehensive empirical analysis with a focus on real-world applications, demonstrating that recent SOTA methods perform worse than methods from 2021 when evaluated on a variety of datasets. We will also investigate how different practical aspects, such as input size, distribution shift, data contamination, and having a validation set, affect the results. About the Speaker Aimira Baitieva is a Research Engineer at Valeo, where she works primarily on computer vision problems. Her recent work has been focused on deep learning anomaly detection for automating visual inspection, incorporating both research and practical applications in the manufacturing sector. The Digital Reasoning Thread in Manufacturing: Orchestrating Vision, Simulation, and Robotics Manufacturing is entering a new phase where AI is no longer confined to isolated tasks like defect detection or predictive maintenance. Advances in reasoning AI, simulation, and robotics are converging to create end-to-end systems that can perceive, decide, and act – in both digital and physical environments. This talk introduces the Digital Reasoning Thread – a consistent layer of AI reasoning that runs through every stage of manufacturing, connecting visual intelligence, digital twins, simulation environments, and robotic execution. By linking perception with advanced reasoning and action, this approach enables faster, higher-quality decisions across the entire value chain. We will explore real-world examples of applying reasoning AI in industrial settings, combining simulation-driven analysis, orchestration frameworks, and the foundations needed for robotic execution in the physical world. Along the way, we will examine the key technical building blocks – from data pipelines and interoperability standards to agentic AI architectures – that make this level of integration possible. Attendees will gain a clear understanding of how to bridge AI-driven perception with simulation and robotics, and what it takes to move from isolated pilots to orchestrated, autonomous manufacturing systems. About the Speaker Vlad Larichev is an Industrial AI Lead at Accenture Industry X, specializing in applying AI, generative AI, and agentic AI to engineering, manufacturing, and large-scale industrial operations. With a background as an engineer, solution architect, and software developer, he has led AI initiatives across sectors including automotive, energy, and consumer goods, integrating advanced analytics, computer vision, and simulation into complex industrial environments. Vlad is the creator of the Digital Reasoning Thread – a framework for connecting AI reasoning across visual intelligence, simulation, and physical execution. He is an active public speaker, podcast host, and community builder, sharing practical insights on scaling AI from pilot projects to enterprise-wide adoption. The Road to Useful Robots This talk explores the current state of AI-enabled robots and the issues with deploying more advanced models on constrained hardware, including limited compute and power budgets. It then moves on to what's next for developing useful, intelligent robots. About the Speaker Michael Hart, also known as Mike Likes Robots. is a robotics software engineer and content creator. His mission is to share knowledge to accelerate robotics. @mikelikesrobots |
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
|
|
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
2025-09-12 · 19:00
Join us for day three in a series of virtual events to hear talks from experts on the latest developments at the intersection of Visual AI, Manufacturing and Robotics. Date and Time Sept 12 at 9 AM Pacific Location Virtual. Register for the Zoom! Towards Robotics Foundation Models that Can Reason In recent years, we have witnessed remarkable progress in generative AI, particularly in language and visual understanding and generation. This leap has been fueled by unprecedentedly large image–text datasets and the scaling of large language and vision models trained on them. Increasingly, these advances are being leveraged to equip and empower robots with open-world visual understanding and reasoning capabilities. Yet, despite these advances, scaling such models for robotics remains challenging due to the scarcity of large-scale, high-quality robot interaction data, limiting their ability to generalize and truly reason about actions in the real world. Nonetheless, promising results are emerging from using multimodal large language models (MLLMs) as the backbone of robotic systems, especially in enabling the acquisition of low-level skills required for robust deployment in everyday household settings. In this talk, I will present three recent works that aim to bridge the gap between rich semantic world knowledge in MLLMs and actionable robot control. I will begin with AHA, a vision-language model that reasons about failures in robotic manipulation and improves the robustness of existing systems. Building on this, I will introduce SAM2Act, a 3D generalist robotic model with a memory-centric architecture capable of performing high-precision manipulation tasks while retaining and reasoning over past observations. Finally, I will present MolmoAct, AI2’s flagship robotic foundation model for action reasoning, designed as a generalist system that can be post-trained for a wide range of downstream manipulation tasks. About the Speaker Jiafei Duan is a Ph.D. candidate in Computer Science & Engineering at the University of Washington, advised by Professors Dieter Fox and Ranjay Krishna. His research focuses on foundation models for robotics, with an emphasis on developing scalable data collection and generation methods, grounding vision-language models in robotic reasoning, and advancing robust generalization in robot learning. His work has been featured in MIT Technology Review, GreekWire, VentureBeat, and Business Wire. Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection In this talk, I will share our recent research efforts in visual industrial anomaly detection. It will present a comprehensive empirical analysis with a focus on real-world applications, demonstrating that recent SOTA methods perform worse than methods from 2021 when evaluated on a variety of datasets. We will also investigate how different practical aspects, such as input size, distribution shift, data contamination, and having a validation set, affect the results. About the Speaker Aimira Baitieva is a Research Engineer at Valeo, where she works primarily on computer vision problems. Her recent work has been focused on deep learning anomaly detection for automating visual inspection, incorporating both research and practical applications in the manufacturing sector. The Digital Reasoning Thread in Manufacturing: Orchestrating Vision, Simulation, and Robotics Manufacturing is entering a new phase where AI is no longer confined to isolated tasks like defect detection or predictive maintenance. Advances in reasoning AI, simulation, and robotics are converging to create end-to-end systems that can perceive, decide, and act – in both digital and physical environments. This talk introduces the Digital Reasoning Thread – a consistent layer of AI reasoning that runs through every stage of manufacturing, connecting visual intelligence, digital twins, simulation environments, and robotic execution. By linking perception with advanced reasoning and action, this approach enables faster, higher-quality decisions across the entire value chain. We will explore real-world examples of applying reasoning AI in industrial settings, combining simulation-driven analysis, orchestration frameworks, and the foundations needed for robotic execution in the physical world. Along the way, we will examine the key technical building blocks – from data pipelines and interoperability standards to agentic AI architectures – that make this level of integration possible. Attendees will gain a clear understanding of how to bridge AI-driven perception with simulation and robotics, and what it takes to move from isolated pilots to orchestrated, autonomous manufacturing systems. About the Speaker Vlad Larichev is an Industrial AI Lead at Accenture Industry X, specializing in applying AI, generative AI, and agentic AI to engineering, manufacturing, and large-scale industrial operations. With a background as an engineer, solution architect, and software developer, he has led AI initiatives across sectors including automotive, energy, and consumer goods, integrating advanced analytics, computer vision, and simulation into complex industrial environments. Vlad is the creator of the Digital Reasoning Thread – a framework for connecting AI reasoning across visual intelligence, simulation, and physical execution. He is an active public speaker, podcast host, and community builder, sharing practical insights on scaling AI from pilot projects to enterprise-wide adoption. The Road to Useful Robots This talk explores the current state of AI-enabled robots and the issues with deploying more advanced models on constrained hardware, including limited compute and power budgets. It then moves on to what's next for developing useful, intelligent robots. About the Speaker Michael Hart, also known as Mike Likes Robots. is a robotics software engineer and content creator. His mission is to share knowledge to accelerate robotics. @mikelikesrobots |
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
|
|
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
2025-09-12 · 19:00
Join us for day three in a series of virtual events to hear talks from experts on the latest developments at the intersection of Visual AI, Manufacturing and Robotics. Date and Time Sept 12 at 9 AM Pacific Location Virtual. Register for the Zoom! Towards Robotics Foundation Models that Can Reason In recent years, we have witnessed remarkable progress in generative AI, particularly in language and visual understanding and generation. This leap has been fueled by unprecedentedly large image–text datasets and the scaling of large language and vision models trained on them. Increasingly, these advances are being leveraged to equip and empower robots with open-world visual understanding and reasoning capabilities. Yet, despite these advances, scaling such models for robotics remains challenging due to the scarcity of large-scale, high-quality robot interaction data, limiting their ability to generalize and truly reason about actions in the real world. Nonetheless, promising results are emerging from using multimodal large language models (MLLMs) as the backbone of robotic systems, especially in enabling the acquisition of low-level skills required for robust deployment in everyday household settings. In this talk, I will present three recent works that aim to bridge the gap between rich semantic world knowledge in MLLMs and actionable robot control. I will begin with AHA, a vision-language model that reasons about failures in robotic manipulation and improves the robustness of existing systems. Building on this, I will introduce SAM2Act, a 3D generalist robotic model with a memory-centric architecture capable of performing high-precision manipulation tasks while retaining and reasoning over past observations. Finally, I will present MolmoAct, AI2’s flagship robotic foundation model for action reasoning, designed as a generalist system that can be post-trained for a wide range of downstream manipulation tasks. About the Speaker Jiafei Duan is a Ph.D. candidate in Computer Science & Engineering at the University of Washington, advised by Professors Dieter Fox and Ranjay Krishna. His research focuses on foundation models for robotics, with an emphasis on developing scalable data collection and generation methods, grounding vision-language models in robotic reasoning, and advancing robust generalization in robot learning. His work has been featured in MIT Technology Review, GreekWire, VentureBeat, and Business Wire. Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection In this talk, I will share our recent research efforts in visual industrial anomaly detection. It will present a comprehensive empirical analysis with a focus on real-world applications, demonstrating that recent SOTA methods perform worse than methods from 2021 when evaluated on a variety of datasets. We will also investigate how different practical aspects, such as input size, distribution shift, data contamination, and having a validation set, affect the results. About the Speaker Aimira Baitieva is a Research Engineer at Valeo, where she works primarily on computer vision problems. Her recent work has been focused on deep learning anomaly detection for automating visual inspection, incorporating both research and practical applications in the manufacturing sector. The Digital Reasoning Thread in Manufacturing: Orchestrating Vision, Simulation, and Robotics Manufacturing is entering a new phase where AI is no longer confined to isolated tasks like defect detection or predictive maintenance. Advances in reasoning AI, simulation, and robotics are converging to create end-to-end systems that can perceive, decide, and act – in both digital and physical environments. This talk introduces the Digital Reasoning Thread – a consistent layer of AI reasoning that runs through every stage of manufacturing, connecting visual intelligence, digital twins, simulation environments, and robotic execution. By linking perception with advanced reasoning and action, this approach enables faster, higher-quality decisions across the entire value chain. We will explore real-world examples of applying reasoning AI in industrial settings, combining simulation-driven analysis, orchestration frameworks, and the foundations needed for robotic execution in the physical world. Along the way, we will examine the key technical building blocks – from data pipelines and interoperability standards to agentic AI architectures – that make this level of integration possible. Attendees will gain a clear understanding of how to bridge AI-driven perception with simulation and robotics, and what it takes to move from isolated pilots to orchestrated, autonomous manufacturing systems. About the Speaker Vlad Larichev is an Industrial AI Lead at Accenture Industry X, specializing in applying AI, generative AI, and agentic AI to engineering, manufacturing, and large-scale industrial operations. With a background as an engineer, solution architect, and software developer, he has led AI initiatives across sectors including automotive, energy, and consumer goods, integrating advanced analytics, computer vision, and simulation into complex industrial environments. Vlad is the creator of the Digital Reasoning Thread – a framework for connecting AI reasoning across visual intelligence, simulation, and physical execution. He is an active public speaker, podcast host, and community builder, sharing practical insights on scaling AI from pilot projects to enterprise-wide adoption. The Road to Useful Robots This talk explores the current state of AI-enabled robots and the issues with deploying more advanced models on constrained hardware, including limited compute and power budgets. It then moves on to what's next for developing useful, intelligent robots. About the Speaker Michael Hart, also known as Mike Likes Robots. is a robotics software engineer and content creator. His mission is to share knowledge to accelerate robotics. @mikelikesrobots |
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
|
|
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
2025-09-12 · 19:00
Join us for day three in a series of virtual events to hear talks from experts on the latest developments at the intersection of Visual AI, Manufacturing and Robotics. Date and Time Sept 12 at 9 AM Pacific Location Virtual. Register for the Zoom! Towards Robotics Foundation Models that Can Reason In recent years, we have witnessed remarkable progress in generative AI, particularly in language and visual understanding and generation. This leap has been fueled by unprecedentedly large image–text datasets and the scaling of large language and vision models trained on them. Increasingly, these advances are being leveraged to equip and empower robots with open-world visual understanding and reasoning capabilities. Yet, despite these advances, scaling such models for robotics remains challenging due to the scarcity of large-scale, high-quality robot interaction data, limiting their ability to generalize and truly reason about actions in the real world. Nonetheless, promising results are emerging from using multimodal large language models (MLLMs) as the backbone of robotic systems, especially in enabling the acquisition of low-level skills required for robust deployment in everyday household settings. In this talk, I will present three recent works that aim to bridge the gap between rich semantic world knowledge in MLLMs and actionable robot control. I will begin with AHA, a vision-language model that reasons about failures in robotic manipulation and improves the robustness of existing systems. Building on this, I will introduce SAM2Act, a 3D generalist robotic model with a memory-centric architecture capable of performing high-precision manipulation tasks while retaining and reasoning over past observations. Finally, I will present MolmoAct, AI2’s flagship robotic foundation model for action reasoning, designed as a generalist system that can be post-trained for a wide range of downstream manipulation tasks. About the Speaker Jiafei Duan is a Ph.D. candidate in Computer Science & Engineering at the University of Washington, advised by Professors Dieter Fox and Ranjay Krishna. His research focuses on foundation models for robotics, with an emphasis on developing scalable data collection and generation methods, grounding vision-language models in robotic reasoning, and advancing robust generalization in robot learning. His work has been featured in MIT Technology Review, GreekWire, VentureBeat, and Business Wire. Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection In this talk, I will share our recent research efforts in visual industrial anomaly detection. It will present a comprehensive empirical analysis with a focus on real-world applications, demonstrating that recent SOTA methods perform worse than methods from 2021 when evaluated on a variety of datasets. We will also investigate how different practical aspects, such as input size, distribution shift, data contamination, and having a validation set, affect the results. About the Speaker Aimira Baitieva is a Research Engineer at Valeo, where she works primarily on computer vision problems. Her recent work has been focused on deep learning anomaly detection for automating visual inspection, incorporating both research and practical applications in the manufacturing sector. The Digital Reasoning Thread in Manufacturing: Orchestrating Vision, Simulation, and Robotics Manufacturing is entering a new phase where AI is no longer confined to isolated tasks like defect detection or predictive maintenance. Advances in reasoning AI, simulation, and robotics are converging to create end-to-end systems that can perceive, decide, and act – in both digital and physical environments. This talk introduces the Digital Reasoning Thread – a consistent layer of AI reasoning that runs through every stage of manufacturing, connecting visual intelligence, digital twins, simulation environments, and robotic execution. By linking perception with advanced reasoning and action, this approach enables faster, higher-quality decisions across the entire value chain. We will explore real-world examples of applying reasoning AI in industrial settings, combining simulation-driven analysis, orchestration frameworks, and the foundations needed for robotic execution in the physical world. Along the way, we will examine the key technical building blocks – from data pipelines and interoperability standards to agentic AI architectures – that make this level of integration possible. Attendees will gain a clear understanding of how to bridge AI-driven perception with simulation and robotics, and what it takes to move from isolated pilots to orchestrated, autonomous manufacturing systems. About the Speaker Vlad Larichev is an Industrial AI Lead at Accenture Industry X, specializing in applying AI, generative AI, and agentic AI to engineering, manufacturing, and large-scale industrial operations. With a background as an engineer, solution architect, and software developer, he has led AI initiatives across sectors including automotive, energy, and consumer goods, integrating advanced analytics, computer vision, and simulation into complex industrial environments. Vlad is the creator of the Digital Reasoning Thread – a framework for connecting AI reasoning across visual intelligence, simulation, and physical execution. He is an active public speaker, podcast host, and community builder, sharing practical insights on scaling AI from pilot projects to enterprise-wide adoption. The Road to Useful Robots This talk explores the current state of AI-enabled robots and the issues with deploying more advanced models on constrained hardware, including limited compute and power budgets. It then moves on to what's next for developing useful, intelligent robots. About the Speaker Michael Hart, also known as Mike Likes Robots. is a robotics software engineer and content creator. His mission is to share knowledge to accelerate robotics. @mikelikesrobots |
Sept 12 - Visual AI in Manufacturing and Robotics (Day 3)
|