talk-data.com

Colton Peltier

Speaker

3 talks

Staff Data Scientist, Databricks

I am a Staff Data Scientist at Databricks in Professional Services helping customers develop and deploy cutting-edge ML solutions.

Bio from: Data + AI Summit 2025

Talks & appearances

3 activities · Newest first

GenAI for SQL & ETL: Build Multimodal AI Workflows at Scale

Enterprises generate massive amounts of unstructured data, from support tickets and PDFs to emails and product images. But extracting insight from that data requires brittle pipelines and complex tools. Databricks AI Functions make this simpler. In this session, you’ll learn how to apply powerful language and vision models directly within your SQL and ETL workflows, with no endpoints, no infrastructure, and no rewrites. We’ll explore practical use cases and best practices for analyzing complex documents, classifying issues, translating content, and inspecting images, all in a way that’s scalable, declarative, and secure.

What you’ll learn:
- How to run state-of-the-art LLMs like GPT-4, Claude Sonnet 4, and Llama 4 on your data
- How to build scalable, multimodal ETL workflows for text and images
- Best practices for prompts, cost, and error handling in production
- Real-world examples of GenAI use cases powered by AI Functions

AT&T AutoClassify: Unified Multi-Head Binary Classification From Unlabeled Text

We present AT&T AutoClassify, a novel end-to-end system for automatic multi-head binary classification from unlabeled text data, built jointly by AT&T's Chief Data Office (CDO) and Databricks Professional Services. Our approach automates the creation of labeled datasets and the training of multi-head binary classifiers with minimal human intervention. Starting only from a corpus of unlabeled text and a list of desired labels, AT&T AutoClassify leverages advanced natural language processing techniques to automatically mine relevant examples from raw text, fine-tune embedding models, and train individual classifier heads for multiple true/false labels. This solution can reduce LLM classification costs by 1,000x, making it highly efficient to operate. The end result is a highly optimized, low-cost model that can be served in Databricks and that takes raw text and produces multiple binary classifications. An example use case using call transcripts will be examined.
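
To make the "shared embedding, one classifier head per label" idea concrete, here is a minimal sketch in plain Python. It is not the AT&T AutoClassify implementation: the abstract does not describe its internals. Everything below is an assumption for illustration, including the toy transcripts, the hand-assigned labels (the real system mines examples automatically), the bag-of-words `embed` function standing in for a fine-tuned embedding model, and the logistic-regression heads.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical toy call-transcript snippets. In the system described above,
# examples are mined automatically; here they are hand-labeled for illustration.
TEXTS = [
    "my bill is too high this month",
    "i was charged twice on my bill",
    "please cancel my service today",
    "i want to cancel because my bill is wrong",
    "the technician arrived on time",
    "how do i reset my router",
    "cancel my plan immediately",
    "wrong charge on my bill again",
]
# One list of 0/1 targets per label; each label gets its own binary head.
LABELS = {
    "billing_issue":        [1, 1, 0, 1, 0, 0, 0, 1],
    "cancellation_request": [0, 0, 1, 1, 0, 0, 1, 0],
}

Head = Tuple[List[float], float]  # (weights, bias)


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign each distinct whitespace token a column index."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def embed(text: str, vocab: Dict[str, int]) -> List[float]:
    """Toy stand-in for an embedding model: L2-normalized bag of words."""
    vec = [0.0] * len(vocab)
    for token in text.lower().split():
        if token in vocab:  # unseen tokens are ignored
            vec[vocab[token]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z: float) -> float:
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def score(head: Head, x: List[float]) -> float:
    weights, bias = head
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_head(vectors: List[List[float]], targets: List[int],
               epochs: int = 500, lr: float = 0.5) -> Head:
    """Fit one logistic-regression head with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(vectors[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, targets):
            grad = score((weights, bias), x) - y  # d(log loss)/d(logit)
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


def train_multi_head(texts: List[str], labels: Dict[str, List[int]]):
    """Embed every text once, then train an independent head per label."""
    vocab = build_vocab(texts)
    vectors = [embed(t, vocab) for t in texts]
    heads = {name: train_head(vectors, y) for name, y in labels.items()}
    return vocab, heads


def predict(model, text: str) -> Dict[str, bool]:
    """Return one true/false decision per label for a piece of raw text."""
    vocab, heads = model
    x = embed(text, vocab)  # single embedding shared by all heads
    return {name: score(head, x) >= 0.5 for name, head in heads.items()}


if __name__ == "__main__":
    model = train_multi_head(TEXTS, LABELS)
    print(predict(model, "i was charged twice on my bill"))
```

The point of the design is that the embedding is computed once per text and reused by every head, so adding another label only adds one small classifier rather than another pass through a large model.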

LLM in Practice: How to Productionize Your LLMs

Ask questions of a panel of data science experts who have deployed LLMs and AI models into production.

Talk by: David Talby, Conor Murphy, Cheng Yin Eng, Sam Raymond, and Colton Peltier
