In this course, you’ll learn how to use the features Databricks provides for business intelligence needs: AI/BI Dashboards and AI/BI Genie. As a Databricks Data Analyst, you will be tasked with creating AI/BI Dashboards and AI/BI Genie Spaces within the platform, managing access to these assets for stakeholders and other necessary parties, and maintaining them as they are edited, refreshed, or decommissioned over the course of their lifespan. This course teaches participants how to design dashboards for business insights, share them with collaborators and stakeholders, and maintain those assets within the platform. Participants will also learn how to use AI/BI Genie Spaces to support self-service analytics by creating and maintaining these environments, which are powered by the Databricks Data Intelligence Engine.
Pre-requisites: The content was developed for participants with these skills, knowledge, and abilities: a basic understanding of SQL for querying existing data tables in Databricks; prior experience or basic familiarity with the Databricks Workspace UI; a basic understanding of the purpose and use of statistical analysis results; and familiarity with the concepts around dashboards used for business intelligence.
Labs: Yes
In this course, you’ll learn how to define and schedule data pipelines that incrementally ingest and process data through multiple tables on the Data Intelligence Platform, using Lakeflow Declarative Pipelines in Spark SQL and Python. We’ll cover how to get started with Lakeflow Declarative Pipelines, how Lakeflow Declarative Pipelines tracks data dependencies in data pipelines, how to configure and run data pipelines using the Lakeflow Declarative Pipelines UI, how to use Python or Spark SQL to define data pipelines that ingest and process data through multiple tables using Auto Loader and Lakeflow Declarative Pipelines, how to use APPLY CHANGES INTO syntax to process Change Data Capture feeds, and how to review event logs and data artifacts created by pipelines and troubleshoot syntax. By streamlining and automating reliable data ingestion and transformation workflows, this course equips you with the foundational data engineering skills needed to help kickstart AI use cases. Whether you're preparing high-quality training data or enabling real-time AI-driven insights, this course is a key step in advancing your AI journey.
Pre-requisites: Beginner familiarity with the Databricks Data Intelligence Platform (selecting clusters, navigating the Workspace, executing notebooks); cloud computing concepts (virtual machines, object storage, etc.); production experience working with data warehouses and data lakes; intermediate experience with basic SQL concepts (select, filter, group by, join, etc.); beginner programming experience with Python (syntax, conditions, loops, functions); beginner programming experience with the Spark DataFrame API (configure DataFrameReader and DataFrameWriter to read and write data, express query transformations using DataFrame methods and Column expressions, etc.).
Labs: No
Certification Path: Databricks Certified Data Engineer Associate
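The course description above mentions APPLY CHANGES INTO for processing Change Data Capture feeds. For orientation only (not course material), here is a minimal sketch of the equivalent Python API inside a Lakeflow Declarative Pipelines (formerly Delta Live Tables) pipeline; the table names customers_bronze and customers_silver, the key customer_id, the ordering column event_ts, and the source path are all hypothetical.

```python
# Minimal sketch of CDC processing in a Lakeflow Declarative Pipelines (DLT) pipeline.
# Table, column, and path names are hypothetical assumptions.
import dlt
from pyspark.sql import functions as F


@dlt.table(comment="Raw CDC feed ingested with Auto Loader (hypothetical path).")
def customers_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/default/cdc_feed/")  # hypothetical volume path
    )


# Create the target streaming table, then apply the change feed to it.
dlt.create_streaming_table("customers_silver")

dlt.apply_changes(
    target="customers_silver",
    source="customers_bronze",
    keys=["customer_id"],           # primary key of each record
    sequence_by=F.col("event_ts"),  # ordering column that resolves out-of-order events
    stored_as_scd_type=1,           # keep only the latest value per key
)
```

In a SQL pipeline, the same pattern is expressed with CREATE OR REFRESH STREAMING TABLE followed by the APPLY CHANGES INTO statement covered in the course.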
In this course, you’ll learn how to apply patterns to securely store and delete personal information for data governance and compliance on the Data Intelligence Platform. We’ll cover storing sensitive data appropriately to simplify granting access and processing deletes, processing deletes to ensure compliance with the right to be forgotten, performing data masking, and configuring fine-grained access control to grant appropriate privileges on sensitive data.
Pre-requisites: Ability to perform basic code development tasks using the Databricks workspace (create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.); intermediate programming experience with SQL and PySpark (extract data from a variety of file formats and data sources, apply a number of common transformations to clean data, reshape and manipulate complex data using advanced built-in functions); intermediate programming experience with Delta Lake (create tables, perform complete and incremental updates, compact files, restore previous versions, etc.); beginner experience with Lakeflow Declarative Pipelines and streaming workloads.
Labs: Yes
Certification Path: Databricks Certified Data Engineer Professional
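As a rough sketch of the data masking and fine-grained access control topics named above, the snippet below registers a Unity Catalog column mask and grants limited read access. The catalog, schema, table, column, and group names are hypothetical, and the exact governance patterns taught in the course may differ.

```python
# Sketch: mask an email column for everyone outside a privileged group,
# then grant read access on the table. All object and group names are hypothetical.

# Masking function: members of `pii_readers` see the raw value, others see a redacted string.
spark.sql("""
    CREATE OR REPLACE FUNCTION main.governance.mask_email(email STRING)
    RETURNS STRING
    RETURN CASE
        WHEN is_account_group_member('pii_readers') THEN email
        ELSE '***REDACTED***'
    END
""")

# Attach the mask to the column (Unity Catalog column mask).
spark.sql("ALTER TABLE main.sales.customers ALTER COLUMN email SET MASK main.governance.mask_email")

# Fine-grained access: analysts can query the table but only ever see masked emails.
spark.sql("GRANT SELECT ON TABLE main.sales.customers TO `data_analysts`")
```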
This course is designed for data professionals who want to explore the data warehousing capabilities of Databricks. Assuming no prior knowledge of Databricks, it provides an introduction to leveraging Databricks as a modern cloud-based data warehousing solution. Learners will explore how to use the Databricks Data Intelligence Platform to ingest, transform, govern, and analyze data efficiently. Learners will also explore Genie, an innovative Databricks feature that simplifies data exploration through natural language queries. By the end of this course, participants will be equipped with the foundational skills to implement and optimize a data warehouse using Databricks.
Pre-requisites: Basic understanding of SQL and data querying concepts. General knowledge of data warehousing concepts, including tables, schemas, and ETL/ELT processes, is recommended. Some experience with BI and/or data visualization tools is helpful but not required.
Labs: Yes
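To make the data warehousing workflow above a little more concrete, here is a small, hypothetical example of the kind of curated table an analyst might build and then query from Databricks SQL, a dashboard, or a Genie space; the schema and column names are invented for illustration and are not taken from the course.

```python
# Hypothetical example: build a small curated (gold) table and query it.
# Catalog, schema, table, and column names are invented for illustration.
spark.sql("""
    CREATE OR REPLACE TABLE main.analytics.daily_revenue AS
    SELECT order_date,
           region,
           SUM(amount) AS total_revenue
    FROM main.sales.orders
    GROUP BY order_date, region
""")

# The resulting table can be queried from Databricks SQL, a dashboard, or a Genie space.
spark.sql("""
    SELECT region, SUM(total_revenue) AS revenue_last_30_days
    FROM main.analytics.daily_revenue
    WHERE order_date >= current_date() - INTERVAL 30 DAYS
    GROUP BY region
    ORDER BY revenue_last_30_days DESC
""").show()
```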
In this course, you’ll learn how to optimize workloads and physical layout with Spark and Delta Lake, and how to analyze the Spark UI to assess performance and debug applications. We’ll cover topics like streaming, liquid clustering, data skipping, caching, Photon, and more.
Pre-requisites: Ability to perform basic code development tasks using the Databricks workspace (create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.); intermediate programming experience with SQL and PySpark (extract data from a variety of file formats and data sources, apply a number of common transformations to clean data, reshape and manipulate complex data using advanced built-in functions); intermediate programming experience with Delta Lake (create tables, perform complete and incremental updates, compact files, restore previous versions, etc.).
Labs: Yes
Certification Path: Databricks Certified Data Engineer Professional
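As one small, hypothetical illustration of the layout optimizations listed above, the sketch below creates a Delta table with liquid clustering and then triggers clustering with OPTIMIZE; the table name and clustering keys are assumptions, not course code.

```python
# Sketch: Delta table with liquid clustering, then incremental clustering via OPTIMIZE.
# Table name and clustering columns are hypothetical.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.sales.events (
        event_id   BIGINT,
        user_id    BIGINT,
        event_date DATE,
        payload    STRING
    )
    CLUSTER BY (user_id, event_date)   -- liquid clustering keys that drive data skipping
""")

# OPTIMIZE incrementally clusters newly written data on the chosen keys.
spark.sql("OPTIMIZE main.sales.events")
```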
In this course, you’ll learn how to incrementally process data to power analytic insights with Structured Streaming and Auto Loader, and how to apply design patterns for workloads that perform ETL on the Data Intelligence Platform with Lakeflow Declarative Pipelines. First, we’ll cover topics including ingesting raw streaming data, enforcing data quality, implementing CDC, and exploring and tuning state information. Then, we’ll cover options for performing a streaming read on a source, requirements for end-to-end fault tolerance, options for performing a streaming write to a sink, and creating an aggregation and watermark on a streaming dataset.
Pre-requisites: Ability to perform basic code development tasks using the Databricks workspace (create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.); intermediate programming experience with SQL and PySpark (extract data from a variety of file formats and data sources, apply a number of common transformations to clean data, reshape and manipulate complex data using advanced built-in functions); intermediate programming experience with Delta Lake (create tables, perform complete and incremental updates, compact files, restore previous versions, etc.); beginner experience with streaming workloads and familiarity with Lakeflow Declarative Pipelines.
Labs: No
Certification Path: Databricks Certified Data Engineer Professional
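For orientation only, here is a minimal PySpark sketch of the streaming pattern described above: a streaming read with Auto Loader, a watermark, a windowed aggregation, and a checkpointed streaming write. The paths, schema location, column names, and target table are hypothetical.

```python
# Minimal Structured Streaming sketch: streaming read, watermark + windowed aggregation,
# and a checkpointed streaming write. Paths, columns, and table names are hypothetical.
from pyspark.sql import functions as F

events = (
    spark.readStream.format("cloudFiles")                      # Auto Loader incremental ingest
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/Volumes/main/default/checkpoints/events_schema")
    .load("/Volumes/main/default/raw_events/")
)

# Ensure the event-time column is a timestamp before applying the watermark.
events = events.withColumn("event_time", F.col("event_time").cast("timestamp"))

# Late data older than 10 minutes (relative to the latest event_time seen) is dropped,
# which bounds the state kept for the aggregation.
counts = (
    events.withWatermark("event_time", "10 minutes")
    .groupBy(F.window("event_time", "5 minutes"), "event_type")
    .count()
)

# Checkpointing provides the end-to-end fault tolerance mentioned in the course outline.
query = (
    counts.writeStream.outputMode("append")
    .option("checkpointLocation", "/Volumes/main/default/checkpoints/event_counts")
    .toTable("main.analytics.event_counts_5min")
)
```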
In this course, you’ll learn how to ingest data efficiently with Lakeflow Connect and manage that data. Topics include ingestion with built-in connectors for SaaS applications, databases, and file sources, as well as ingestion from cloud object storage and both batch and streaming ingestion. We'll cover the new connector components, setting up the pipeline, and validating the source and mapping it to the destination for each type of connector. We'll also cover how to ingest data into Delta tables in both batch and streaming modes, using the UI with Auto Loader, automating ETL with Lakeflow Declarative Pipelines, or using the API. This will prepare you to deliver the high-quality, timely data required for AI-driven applications by enabling scalable, reliable, and real-time data ingestion pipelines. Whether you're supporting ML model training or powering real-time AI insights, these ingestion workflows form a critical foundation for successful AI implementation.
Pre-requisites: Beginner familiarity with the Databricks Data Intelligence Platform (selecting clusters, navigating the Workspace, executing notebooks); cloud computing concepts (virtual machines, object storage, etc.); production experience working with data warehouses and data lakes; intermediate experience with basic SQL concepts (select, filter, group by, join, etc.); beginner programming experience with Python (syntax, conditions, loops, functions); beginner programming experience with the Spark DataFrame API (configure DataFrameReader and DataFrameWriter to read and write data, express query transformations using DataFrame methods and Column expressions, etc.).
Labs: No
Certification Path: Databricks Certified Data Engineer Associate
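As a loose illustration of the batch-versus-streaming ingestion idea above, the sketch below uses Auto Loader with an availableNow trigger so the same pipeline can run as an incremental batch job or, without the trigger, continuously. The source path and target table are hypothetical, and Lakeflow Connect's managed connectors are configured through the UI or API rather than code like this.

```python
# Sketch: incremental ingestion from cloud object storage into a Delta table with Auto Loader.
# Source path, checkpoint locations, and target table are hypothetical.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.schemaLocation", "/Volumes/main/default/checkpoints/orders_schema")
    .option("header", "true")
    .load("/Volumes/main/default/landing/orders/")
)

(
    stream.writeStream
    .option("checkpointLocation", "/Volumes/main/default/checkpoints/orders_ingest")
    .trigger(availableNow=True)   # process all available files, then stop (batch-style run)
    .toTable("main.bronze.orders")
)
# Dropping the trigger (or using a processingTime trigger) keeps the same pipeline running continuously.
```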
This course offers a deep dive into designing data models within the Databricks Lakehouse environment and understanding the data product lifecycle. Participants will learn to align business requirements with data organization and model design, leveraging Delta Lake and Unity Catalog to define data architectures, along with techniques for data integration and sharing.
Prerequisites: Foundational knowledge equivalent to Databricks Certified Data Engineer Associate and familiarity with many topics covered in Databricks Certified Data Engineer Professional. Experience with: basic SQL queries and table creation on Databricks; Lakehouse architecture fundamentals (medallion layers); Unity Catalog concepts (high-level). [Optional] Familiarity with data warehousing concepts (dimensional modeling, 3NF, etc.) is beneficial but not mandatory.
Labs: Yes
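To give a flavor of the modeling topics above, here is a small, hypothetical sketch of a dimension and fact table declared with Unity Catalog's informational primary- and foreign-key constraints; the star-schema design and all object names are assumptions for illustration, not material from the course.

```python
# Hypothetical star-schema sketch: a dimension and a fact table with informational
# PRIMARY KEY / FOREIGN KEY constraints in Unity Catalog (not enforced, but useful metadata).
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.model.dim_customer (
        customer_id BIGINT NOT NULL,
        customer_name STRING,
        region STRING,
        CONSTRAINT dim_customer_pk PRIMARY KEY (customer_id)
    )
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS main.model.fact_orders (
        order_id BIGINT NOT NULL,
        customer_id BIGINT,
        order_date DATE,
        amount DECIMAL(10, 2),
        CONSTRAINT fact_orders_pk PRIMARY KEY (order_id),
        CONSTRAINT fact_orders_customer_fk FOREIGN KEY (customer_id)
            REFERENCES main.model.dim_customer (customer_id)
    )
""")
```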
In this course, you will learn basic skills that will allow you to use the Databricks Data Intelligence Platform to perform a simple data engineering workflow and support data warehousing endeavors. You will be given a tour of the workspace and be shown how to work with objects in Databricks such as catalogs, schemas, volumes, tables, compute clusters, and notebooks. You will then follow a basic data engineering workflow to perform tasks such as creating and working with tables, ingesting data into Delta Lake, transforming data through the medallion architecture, and using Databricks Workflows to orchestrate data engineering tasks. You’ll also learn how Databricks supports data warehousing needs through the use of Databricks SQL, DLT, and Unity Catalog.
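As a minimal, hypothetical sketch of the medallion-style workflow described above, the snippet below lands a raw file in a bronze Delta table and derives a cleaned silver table; the paths, schema, and table names are invented for illustration.

```python
# Sketch of a simple bronze -> silver step in the medallion architecture.
# Paths, columns, and table names are hypothetical.
from pyspark.sql import functions as F

# Bronze: land the raw data as-is, with a little ingestion metadata.
raw = (
    spark.read.format("csv")
    .option("header", "true")
    .load("/Volumes/main/default/raw/orders.csv")
)
(
    raw.withColumn("_ingested_at", F.current_timestamp())
    .write.mode("append")
    .saveAsTable("main.bronze.orders")
)

# Silver: clean and conform the bronze data for downstream analytics.
silver = (
    spark.table("main.bronze.orders")
    .where(F.col("order_id").isNotNull())
    .withColumn("amount", F.col("amount").cast("double"))
    .dropDuplicates(["order_id"])
)
silver.write.mode("overwrite").saveAsTable("main.silver.orders")
```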