Search – talk-data.com


Streaming Anxiety to Streaming Confidence: Hands-On Spark Structured Streaming 2026-03-25 · 19:00

Part 3 of a 4-part series

From Streaming Anxiety to Streaming Confidence: Hands-On with Spark Structured Streaming

Description: Working with real-time data is more accessible today than ever before, yet building your first streaming application can still feel daunting. In this session, Scott Haines will demystify streaming data and equip you with practical strategies to develop a streaming-first mindset. You'll learn how to:

* Architect robust streaming data applications from the ground up
* Build and test applications using PySpark with confidence
* Simplify debugging within the Databricks ecosystem
* Deploy, manage, and maintain streaming applications effectively
* Apply proven tips and tricks to accelerate your streaming journey

Walk away with the knowledge and tools to confidently tackle real-time data challenges and transform how you approach streaming applications.

Streaming Anxiety to Streaming Confidence: Hands-On Spark Structured Streaming

Delta Lake: The Definitive Guide 2024-10-31

Denny Lee – author , Scott Haines – author , Tristen Wentling – author , Prashanth Babu – author

Ready to simplify the process of building data lakehouses and data pipelines at scale? In this practical guide, learn how Delta Lake is helping data engineers, data scientists, and data analysts overcome key data reliability challenges with modern data engineering and management techniques. Authors Denny Lee, Tristen Wentling, Scott Haines, and Prashanth Babu (with contributions from Delta Lake maintainer R. Tyler Croy) share expert insights on all things Delta Lake, including how to run batch and streaming jobs concurrently and accelerate the usability of your data. You'll also uncover how ACID transactions bring reliability to data lakehouses at scale. This book helps you:

* Understand key data reliability challenges and how Delta Lake solves them
* Explain the critical role of Delta transaction logs as a single source of truth
* Learn the Delta Lake ecosystem with technologies like Apache Flink, Kafka, and Trino
* Architect data lakehouses with the medallion architecture
* Optimize Delta Lake performance with features like deletion vectors and liquid clustering
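To make the "transaction log as a single source of truth" idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Delta Lake's implementation: it assumes each commit is an ordered list of JSON action lines and handles only `add` and `remove` file actions. The file names and the `live_files` helper are hypothetical.

```python
import json

# Toy commits, ordered by version. Each commit is a list of JSON action lines.
# Real Delta commits carry more action types and metadata; this keeps only
# "add" and "remove" to show the replay idea.
commits = [
    [
        '{"add": {"path": "part-0000.parquet"}}',
        '{"add": {"path": "part-0001.parquet"}}',
    ],
    [
        '{"remove": {"path": "part-0000.parquet"}}',
        '{"add": {"path": "part-0002.parquet"}}',
    ],
]


def live_files(commits):
    """Replay the commits in order and return the set of files in the table."""
    files = set()
    for commit in commits:
        for line in commit:
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


# Table state after all commits, and as of the first commit only.
current = sorted(live_files(commits))
previous = sorted(live_files(commits[:1]))
```

Because the table state is derived only from the ordered log, replaying a prefix of the commits reconstructs an earlier version of the table.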

data data-engineering storage-repositories delta-lake Flink Data Engineering Delta Kafka Data Streaming Trino

O'Reilly Data Engineering Books

Spark Inception: Exploiting the Apache Spark REPL to Build Streaming Notebooks 2022-07-19 · 16:42

Scott Haines – Databricks Beacon @ Databricks

Join Scott Haines (Databricks Beacon) as he teaches you to write your own Notebook-style service (like Jupyter / Zeppelin / Databricks) for both fun (and profit?). 'Cause haven't we all been a little curious how Notebook environments work? From the outside things probably seem magical; however, just below the surface there is a literal world of possibilities waiting to be exploited (both figuratively and literally) to assist in the building of unimaginable new creations. Curiosity is of course the foundation for creativity and novel ideation, and armed with the knowledge you'll pick up in this session, you'll gain an additional perspective and way of thinking (mental model) for solving complex problems using dynamic procedural (on-the-fly) code compilation.

Did I mention you'll use Spark Structured Streaming in order to generate a "live" communication channel between your Notebook service and the "outside world"?

Overview: During this session you'll learn to build your own Notebook-style service on top of Apache Spark & the Scala ILoop. Along the way, you'll uncover how to harness the SparkContext to manage, drive, and scale your own procedurally defined Apache Spark applications by mixing core configuration and other "magic". As we move through the steps necessary to achieve this end result, you'll learn to run individual paragraphs, or the entire synchronous waterfall of paragraphs, leading to the dynamic generation of applications.
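The paragraph model described above can be sketched in a few lines. This is a Python analog only, not the talk's Scala ILoop or Spark code: it assumes a paragraph is a source string, compiles it on the fly with the built-in `compile` and `exec`, and shares one namespace across paragraphs. The `MiniNotebook` class and its method names are hypothetical.

```python
from typing import Any


class MiniNotebook:
    """Toy notebook: paragraphs are source strings sharing one namespace."""

    def __init__(self) -> None:
        self.namespace: dict[str, Any] = {}  # state shared across paragraphs
        self.paragraphs: list[str] = []

    def add(self, source: str) -> int:
        """Register a paragraph and return its index."""
        self.paragraphs.append(source)
        return len(self.paragraphs) - 1

    def run(self, index: int) -> None:
        """Compile and execute a single paragraph on the fly."""
        code = compile(self.paragraphs[index], f"<paragraph {index}>", "exec")
        # Executing arbitrary source is unsafe for untrusted input.
        exec(code, self.namespace)

    def run_all(self) -> None:
        """Run every paragraph in order (the synchronous waterfall)."""
        for index in range(len(self.paragraphs)):
            self.run(index)


notebook = MiniNotebook()
notebook.add("events = [3, 1, 4]")
notebook.add("total = sum(events)")
notebook.run_all()
result = notebook.namespace["total"]
```

Later paragraphs can see names defined by earlier ones because every paragraph executes against the same namespace dictionary, which is roughly what a REPL-backed notebook provides.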

Take a deep dive into the world of possibilities that fork from a solid understanding of procedurally generated, on-the-fly code compilation (live injection) and its security ramifications ('cause of course this is unsafe!), and come away with a new mental model focused on architecting composite applications, or auto-generated


Databricks Scala Cyber Security Spark Data Streaming

Databricks DATA + AI Summit 2023

YouTube

The Rise of Operational Analytics 2019-12-25

Scott Haines – author

Fast access to data has become a critical game changer. Today, a new breed of company understands that the faster they can build, access, and share well-defined datasets, the more competitive they’ll be in our data-driven world. In this practical report, Scott Haines from Twilio introduces you to operational analytics, a new approach for making sense of all the data flooding into business systems. Data architects and data scientists will see how Apache Kafka and other tools and processes laid the groundwork for fast analytics on a mix of historical and near-real-time data. You’ll learn how operational analytics feeds minute-by-minute customer interactions, and how NewSQL databases have entered the scene to drive machine learning algorithms, AI programs, and ongoing decision-making within an organization.

* Understand the key advantages that data-driven companies have over traditional businesses
* Explore the rise of operational analytics and how this method relates to current tech trends
* Examine the impact of can’t-wait business decisions and won’t-wait customer experiences
* Discover how NewSQL databases support cloud native architecture and set the stage for operational databases
* Learn how to choose the right database to support operational analytics in your organization

data data-engineering AI/ML Analytics Cloud Computing Kafka Twilio Segment

O'Reilly Data Engineering Books
