talk-data.com

Dustin Vannoy

Speaker · 2 talks

Sr. Specialist Solutions Architect, Databricks

Dustin Vannoy is a data engineer and solutions architect who helps solve business problems with analytics and big data solutions. He specializes in building data platforms and streaming data pipelines, guiding teams in migrating from legacy ETL to modern lakehouse architectures on cloud technologies. His focus includes Databricks, Apache Spark, Azure, Apache Kafka, Python, and Scala. He co-founded the Data Engineering San Diego meetup and mentors others through talks and YouTube tutorials.

Bio from: Data + AI Summit 2025

Talks & appearances

2 activities · Newest first

SQL-Based ETL: Options for SQL-Only Databricks Development

Using SQL for data transformation is a powerful way for an analytics team to build its own data pipelines. However, relying on SQL often comes with tradeoffs: limited functionality, hard-to-maintain stored procedures, and skipped best practices like version control and data tests. Databricks supports building high-performing SQL ETL workloads. Attend this session to hear how Databricks supports SQL for data transformation jobs as a core part of your Data Intelligence Platform. We will cover four options for creating Delta tables on Databricks with SQL syntax:

- Lakeflow Declarative Pipelines: a declarative ETL option that simplifies batch and streaming pipelines
- dbt: an open-source framework that applies engineering best practices to SQL-based data transformations
- SQLMesh: an open-core product for building high-quality, high-performance data pipelines
- SQL notebook jobs: parameterized SQL notebooks orchestrated with Databricks Workflows
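
To make the first option concrete, here is a minimal sketch of Lakeflow Declarative Pipelines SQL; the table names, storage path, and columns are hypothetical illustrations, not material from the session:

    -- Streaming bronze table: incrementally ingest raw JSON files from cloud storage.
    CREATE OR REFRESH STREAMING TABLE bronze_orders
    AS SELECT * FROM STREAM read_files('/Volumes/main/raw/orders', format => 'json');

    -- Silver materialized view with a data quality expectation that drops bad rows.
    CREATE OR REFRESH MATERIALIZED VIEW silver_orders (
      CONSTRAINT valid_order_id EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW
    )
    AS SELECT order_id, customer_id, CAST(amount AS DECIMAL(10, 2)) AS amount
    FROM bronze_orders;

Declarations like these live in ordinary source files, so the version control and data tests the abstract calls out come naturally, unlike with stored procedures.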

CI/CD for Databricks: Advanced Asset Bundles and GitHub Actions

This session is repeated. Databricks Asset Bundles (DABs) provide a way to use the command line to deploy and run a set of Databricks assets, like notebooks, Python code, Lakeflow Declarative Pipelines, and workflows. To automate deployments, you create a deployment pipeline that combines DABs with other validation steps to ensure high-quality deployments.

In this session you will learn how to automate CI/CD processes for Databricks while following best practices that keep deployments easy to scale and maintain. After a brief explanation of why Databricks Asset Bundles are a good option for CI/CD, we will walk through a working project covering advanced variables, target-specific overrides, linting, integration testing, and automatic deployment upon code review approval. You will leave the session clear on how to build your first GitHub Action using DABs.
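
As a preview, here is a minimal databricks.yml sketch of the variables and target-specific overrides the session expands on; the bundle name, host URLs, and notebook path are hypothetical placeholders:

    # databricks.yml -- defines the bundle and per-environment deployment targets
    bundle:
      name: demo_etl

    variables:
      catalog:
        description: Target catalog for output tables
        default: dev_catalog

    targets:
      dev:
        mode: development
        default: true
        workspace:
          host: https://adb-1111111111111111.11.azuredatabricks.net   # placeholder
      prod:
        mode: production
        workspace:
          host: https://adb-2222222222222222.22.azuredatabricks.net   # placeholder
        variables:
          catalog: prod_catalog   # target-specific override

    resources:
      jobs:
        nightly_etl:
          name: nightly_etl
          tasks:
            - task_key: transform
              notebook_task:
                notebook_path: ./notebooks/transform.ipynb

A GitHub Actions workflow would then typically run databricks bundle validate on each pull request and databricks bundle deploy -t prod once review is approved; both are standard Databricks CLI commands.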