talk-data.com

Scott Haines

Speaker

Databricks Beacon

Scott Haines is a software engineer at Buf specializing in massive distributed data systems and streaming technologies. He has over 20 years of experience across the software stack, including roles at Hitachi Data Systems and Convo Communications. Over the past decade he has built and scaled data infrastructure at Yahoo!, Twilio, Nike, and now Buf, with deep expertise in distributed data architectures, streaming pipelines, Apache Spark, and Delta Lake. He is the author of books on Apache Spark and Delta Lake, and helps organizations adopt open-source technologies through teaching and consulting.

Bio from: Databricks DATA + AI Summit 2023

Filtering by: Data + AI Summit 2025

Talks & appearances

The Hitchhiker's Guide to Delta Lake Streaming in an Agentic Universe

As data engineering continues to evolve, the shift from batch-oriented to streaming-first has become standard across the enterprise. The reality is these changes have been taking shape for the past decade; we just now also happen to be standing on the precipice of true disruption through automation, the likes of which we could only dream about before. Yes, AI Agents and LLMs are already a large part of our daily lives, but we (as data engineers) are ultimately on the frontlines ensuring that the future of AI is powered by consistent, just-in-time data, and Delta Lake is critical to help us get there. This session will provide you with best practices learned the hard way by one of the authors of The Delta Lake Definitive Guide, including:

- Guide to writing generic applications as components
- Workflow automation tips and tricks
- Tips and tricks for Delta clustering (liquid, z-order, and classic)
- Future facing: Leveraging metadata for agentic pipelines and workflow automation