talk-data.com talk-data.com

YouTube 2022-07-19 at 16:39

Moving to the Lakehouse: Fast & Efficient Ingestion with Auto Loader

Description

Auto loader, the most popular tool for incremental data ingestion from cloud storage to Databricks’ Lakehouse, is used in our biggest customers’ ingestion workflows. Auto Loader is our all-in-one solution for exactly-once processing offering efficient file discovery, schema inference and evolution, and fault tolerance.

In this talk, we want to delve into key features in Auto Loader, including: • Avro schema inference • Rescued column • Semi-structured data support • Incremental listing • Asynchronous backfilling • Native listing • File-level tracking and observability

Auto Loader is also used in other Databricks features such as Delta Live Tables. We will discuss the architecture, provide a demo, and feature an Auto Loader customer speaking about their experience migrating to Auto Loader.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/