talk-data.com

Topic

Data Streaming

realtime event_processing data_flow

Top Events

Data Engineering Podcast: 227
O'Reilly Data Engineering Books: 114
Databricks DATA + AI Summit 2023: 70
Data + AI Summit 2025: 64
How Music Charts: 61
DATA MINER Big Data Europe Conference 2020: 19
O'Reilly Data Science Books: 18
Secrets of Data Analytics Leaders: 12
AWS re:Invent 2024: 12
Google Cloud Next '25: 12
Big Data LDN 2025: 10
Big Data LDN 2024: 8

Top Speakers

Tobias Macey: 227
Jason Joven (Chartmetric): 31
Rutger (Chartmetric): 23
Joe Reis (DeepLearning.AI): 6
Yingjun Wu (RisingWave Labs): 5
Mark Zandi (Moody's Analytics): 5
Cris deRitis: 5
Marisa DiNatale (Moody's Analytics): 4
Nick Schrock (Elementl): 4
Denny Lee (Databricks): 4
Frank Munz (Databricks): 4
Someleze Diko (Microsoft): 4

Activities

Filtering by: Data + AI Summit 2025

Databricks Data Privacy

2025-06-09 · Data + AI Summit 2025

talk

Data Governance Databricks Delta Git PySpark SQL

In this course, you’ll learn how to apply patterns to securely store and delete personal information for data governance and compliance on the Data Intelligence Platform. We’ll cover topics like storing sensitive data appropriately to simplify granting access and processing deletes, processing deletes to ensure compliance with the right to be forgotten, performing data masking, and configuring fine-grained access control to grant appropriate privileges on sensitive data.

Pre-requisites: Ability to perform basic code development tasks using the Databricks workspace (create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.), intermediate programming experience with SQL and PySpark (extract data from a variety of file formats and data sources, apply a number of common transformations to clean data, reshape and manipulate complex data using advanced built-in functions), intermediate programming experience with Delta Lake (create tables, perform complete and incremental updates, compact files, restore previous versions, etc.), and beginner experience with Lakeflow Declarative Pipelines and streaming workloads.

Labs: Yes

Certification Path: Databricks Certified Data Engineer Professional

Databricks Performance Optimization

2025-06-09 · Data + AI Summit 2025

talk

Databricks Delta Git PySpark Spark SQL

In this course, you’ll learn how to optimize workloads and physical layout with Spark and Delta Lake, and how to analyze the Spark UI to assess performance and debug applications. We’ll cover topics like streaming, liquid clustering, data skipping, caching, Photon, and more.

Pre-requisites: Ability to perform basic code development tasks using the Databricks workspace (create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.), intermediate programming experience with SQL and PySpark (extract data from a variety of file formats and data sources, apply a number of common transformations to clean data, reshape and manipulate complex data using advanced built-in functions), and intermediate programming experience with Delta Lake (create tables, perform complete and incremental updates, compact files, restore previous versions, etc.).

Labs: Yes

Certification Path: Databricks Certified Data Engineer Professional

Databricks Streaming and Lakeflow Declarative Pipelines

2025-06-09 · Data + AI Summit 2025

talk

Data Quality Databricks Delta ETL/ELT Git PySpark SQL

In this course, you’ll learn how to incrementally process data to power analytic insights with Structured Streaming and Auto Loader, and how to apply design patterns for ETL workloads on the Data Intelligence Platform with Lakeflow Declarative Pipelines. First, we’ll cover topics including ingesting raw streaming data, enforcing data quality, implementing CDC, and exploring and tuning state information. Then, we’ll cover options to perform a streaming read on a source, requirements for end-to-end fault tolerance, options to perform a streaming write to a sink, and creating an aggregation and watermark on a streaming dataset.

Pre-requisites: Ability to perform basic code development tasks using the Databricks workspace (create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.), intermediate programming experience with SQL and PySpark (extract data from a variety of file formats and data sources, apply a number of common transformations to clean data, reshape and manipulate complex data using advanced built-in functions), intermediate programming experience with Delta Lake (create tables, perform complete and incremental updates, compact files, restore previous versions, etc.), and beginner experience with streaming workloads and familiarity with Lakeflow Declarative Pipelines.

Labs: No

Certification Path: Databricks Certified Data Engineer Professional

Data Ingestion with Lakeflow Connect

2025-06-09 · Data + AI Summit 2025

talk

AI/ML API Cloud Computing Databricks Delta ETL/ELT Python SaaS Spark SQL

In this course, you’ll learn how to ingest data efficiently with Lakeflow Connect and how to manage that data. Topics include ingestion with built-in connectors for SaaS applications, databases and file sources, as well as ingestion from cloud object storage, and batch and streaming ingestion. We'll cover the new connector components, setting up the pipeline, validating the source and mapping to the destination for each type of connector. We'll also cover how to ingest data with Batch to Streaming ingestion into Delta tables, using the UI with Auto Loader, automating ETL with Lakeflow Declarative Pipelines or using the API.

This will prepare you to deliver the high-quality, timely data required for AI-driven applications by enabling scalable, reliable, and real-time data ingestion pipelines. Whether you're supporting ML model training or powering real-time AI insights, these ingestion workflows form a critical foundation for successful AI implementation.

Pre-requisites: Beginner familiarity with the Databricks Data Intelligence Platform (selecting clusters, navigating the Workspace, executing notebooks), cloud computing concepts (virtual machines, object storage, etc.), production experience working with data warehouses and data lakes, intermediate experience with basic SQL concepts (select, filter, groupby, join, etc.), beginner programming experience with Python (syntax, conditions, loops, functions), and beginner programming experience with the Spark DataFrame API (configure DataFrameReader and DataFrameWriter to read and write data, express query transformations using DataFrame methods and Column expressions, etc.).

Labs: No

Certification Path: Databricks Certified Data Engineer Associate
