Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

52

Filtering by: API

Top Speakers

Michael Armbrust (5), Ari Kaplan (4), Denny Lee (4), Fuat Can Efeoglu (4), Holly Smith (4), Jonathan Frankle (4), Justin DeBrabant (4), Michelle Leon (4), Youssef Mrini (4), Ajay GOLLAPALLI (3), Amit Pahwa (3), Arsalan Tavakoli-Shiraji (3)

Sessions & talks



Data Ingestion with Lakeflow Connect

2025-06-09

talk

AI/ML API Cloud Computing Databricks Delta ETL/ELT

In this course, you’ll learn how to ingest data efficiently with Lakeflow Connect and manage that data. Topics include ingestion with built-in connectors for SaaS applications, databases and file sources, as well as ingestion from cloud object storage, and batch and streaming ingestion. We'll cover the new connector components, setting up the pipeline, validating the source and mapping to the destination for each type of connector. We'll also cover how to ingest data with batch to streaming ingestion into Delta tables, using the UI with Auto Loader, automating ETL with Lakeflow Declarative Pipelines or using the API.

This will prepare you to deliver the high-quality, timely data required for AI-driven applications by enabling scalable, reliable, and real-time data ingestion pipelines. Whether you're supporting ML model training or powering real-time AI insights, these ingestion workflows form a critical foundation for successful AI implementation.

Pre-requisites: Beginner familiarity with the Databricks Data Intelligence Platform (selecting clusters, navigating the Workspace, executing notebooks), cloud computing concepts (virtual machines, object storage, etc.), production experience working with data warehouses and data lakes, intermediate experience with basic SQL concepts (select, filter, groupby, join, etc.), beginner programming experience with Python (syntax, conditions, loops, functions), beginner programming experience with the Spark DataFrame API (configure DataFrameReader and DataFrameWriter to read and write data, express query transformations using DataFrame methods and Column expressions, etc.).

Labs: No

Certification Path: Databricks Certified Data Engineer Associate

Machine Learning at Scale

2025-06-09

talk

AI/ML API Databricks Pandas Python Spark

This course equips professional-level machine learning practitioners with knowledge and hands-on experience in using Apache Spark™ for machine learning, including model fine-tuning. It also covers using the Pandas library for scalable machine learning tasks. The first section covers the fundamentals of Apache Spark™ and its machine learning capabilities. The second section covers fine-tuning models with the hyperopt library. The final section covers the Pandas API within Apache Spark™, including guidance on Pandas UDFs (User-Defined Functions) and the Functions API for model inference.

Pre-requisites: Familiarity with Databricks workspace and notebooks; knowledge of machine learning model development and deployment with MLflow (e.g. basic understanding of DS/ML concepts, common model metrics and Python libraries, as well as a basic understanding of scaling workloads with Spark).

Labs: Yes

Certification Path: Databricks Certified Machine Learning Professional

