RAG

From Days to Minutes - AI Transforms Audit at KPMG

2025-06-12 · Data + AI Summit 2025

talk

by David Tempelmann (Databricks), Mark Wallington (KPMG UK)

AI/ML Databricks GenAI LLM MLOps

Imagine performing complex regulatory checks in minutes instead of days. We made this a reality using GenAI on the Databricks Data Intelligence Platform. Join us for a deep dive into our journey from POC to a production-ready AI audit tool. Discover how we automated thousands of legal requirement checks in annual reports with remarkable speed and accuracy. Learn our blueprint for two areas. High-performance AI: building a scalable AI system, more than 90% accurate, with an optimized RAG pipeline that auditors praise. Robust productionization: achieving secure, governed deployment using Unity Catalog, MLflow, LLM-based evaluation, and MLOps best practices. This session provides actionable insights for deploying impactful, compliant GenAI in the enterprise.

Sponsored by: IBM | How to leverage unstructured data to build more accurate, trustworthy AI agents

AI/BI Genie: A Look Under the Hood of Everyone's Friendly, Neighborhood GenAI Product

2025-06-12 · Data + AI Summit 2025

talk

by Amir Hormati (Databricks), Alnur Ali (Databricks)

AI/ML BI GenAI LLM

Go beyond the user interface and explore the cutting-edge technology driving AI/BI Genie. This session breaks down the AI/BI Genie architecture, showcasing how LLMs, retrieval-augmented generation (RAG) and finely tuned knowledge bases work together to deliver fast, accurate responses. We’ll also explore how AI agents orchestrate workflows, optimize query performance and continuously refine their understanding. Ideal for those who want to geek out about the tech stack behind Genie, this session offers a rare look at the magic under the hood.

Sponsored by: Qubika | Agentic AI In Finance: How To Build Agents Using Databricks And LangGraph

PDF Document Ingestion Accelerator for GenAI Applications

2025-06-11 · Data + AI Summit 2025

talk

by Qian Yu (Databricks)

Databricks GenAI Spark Data Streaming

Databricks Financial Services customers in the GenAI space share a common use case: ingesting and processing unstructured documents (PDFs and images), then performing downstream GenAI tasks such as entity extraction and RAG-based knowledge Q&A. Customers report three pain points with this kind of workload. First, the quality of the PDF and image documents varies, since many older physical documents were scanned into electronic form. Second, the complexity of the documents varies, and many contain tables and images with embedded information, which require slower Tesseract OCR. Third, customers would like to streamline post-processing for downstream workloads. In this talk we will present an optimized Structured Streaming workflow for complex PDF ingestion. The key techniques include Apache Spark™ optimization, multi-threading, PDF object extraction, skew handling and automatic retry logic.
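Two of the techniques named in this abstract, multi-threading and automatic retry, can be illustrated in a few lines of plain Python. This is only a minimal sketch under stated assumptions, not the workflow presented in the talk: `with_retry`, `process_all` and the `extract` callback are hypothetical names, and the sketch uses a standard-library thread pool rather than Spark.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retry(fn, attempts=3, base_delay=0.0):
    """Wrap fn so a failing call is retried, up to `attempts` calls in total."""
    def wrapped(arg):
        for attempt in range(1, attempts + 1):
            try:
                return fn(arg)
            except Exception:
                if attempt == attempts:
                    raise  # out of attempts: surface the last error
                # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
                time.sleep(base_delay * 2 ** (attempt - 1))
    return wrapped


def process_all(paths, extract, workers=4):
    """Run `extract` (hypothetical per-document function) on a thread pool.

    Results come back in the same order as `paths`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(with_retry(extract), paths))
```

A caller would pass its own per-document function, for example an OCR step, as `extract`; a document that still fails after the final attempt raises its error to the caller instead of being silently dropped.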

Sponsored by: EY | Unlocking Value Through AI at Takeda Pharmaceuticals

AI Meets SQL: Leverage GenAI at Scale to Enrich Your Data

2025-06-10 · Data + AI Summit 2025

talk

by Sid Taneja (Databricks), Youngbin Kim (Databricks)

AI/ML Databricks GenAI LLM NLP SQL

This session is repeated. Integrating AI into existing data workflows can be challenging, often requiring specialized knowledge and complex infrastructure. In this session, we'll share how SQL users can leverage AI/ML to access large language models (LLMs) and traditional machine learning directly from within SQL, simplifying the process of incorporating AI into data workflows. We will demonstrate how to use Databricks SQL for natural language processing, traditional machine learning, retrieval augmented generation and more. You'll learn about best practices and see examples of solving common use cases such as opinion mining, sentiment analysis, forecasting and other common AI/ML tasks.

Moody's AI Screening Agent: Automating Compliance Decisions

2025-06-10 · Data + AI Summit 2025

talk

by Nishant Gurunath (Moody's)

AI/ML LLM React

The AI Screening Agent automates the Level 1 (L1) screening process, which is essential for Know Your Customer (KYC) and compliance due diligence during customer onboarding. The system aims to minimize false positives, significantly reducing human review time and costs. Beyond typical Retrieval-Augmented Generation (RAG) applications like summarization and chat-with-your-data (CWYD), the AI Screening Agent employs a ReAct architecture with intelligent tools, enabling it to perform complex compliance decision-making with human-like accuracy and greater consistency. In this talk, I will explore the screening agent architecture, demonstrating its ability to meet evolving client policies. I will discuss evaluation and configuration management using MLflow LLM-as-judge and Unity Catalog, as well as challenges such as data fidelity and customization. This session underscores the transformative potential of AI agents in compliance workflows, emphasizing their adaptability, accuracy, and consistency.

Gen AI Deployment and Monitoring

2025-06-10 · Data + AI Summit 2025

talk

AI/ML Data Lakehouse Databricks GenAI Vector DB

This course introduces learners to deploying, operationalizing, and monitoring generative artificial intelligence (AI) applications. First, learners will develop knowledge and skills in deploying generative AI applications using tools like Model Serving. Next, the course will discuss operationalizing generative AI applications following modern LLMOps best practices and recommended architectures. Finally, learners will be introduced to the idea of monitoring generative AI applications and their components using Lakehouse Monitoring.

Pre-requisites: Familiarity with prompt engineering and retrieval-augmented generation (RAG) techniques, including data preparation, embeddings, vectors, and vector databases, and a foundational knowledge of Databricks Data Intelligence Platform tools for evaluation and governance (particularly Unity Catalog).

Labs: Yes

Certification Path: Databricks Certified Generative AI Engineer Associate

Databricks Apps: Turning Data and AI Into Practical, User-Friendly Applications

2025-06-10 · Data + AI Summit 2025

talk

by Nic Heier (Databricks), Justin DeBrabant (Databricks)

AI/ML Databricks

This session is repeated. In this session, we present an overview of the GA release of Databricks Apps, the new app hosting platform that integrates all the Databricks services necessary to build production-ready data and AI applications. With Apps, data and developer teams can build new interfaces into the data intelligence platform, further democratizing the transformative power of data and AI across the organization. We'll cover common use cases, including RAG chat apps, interactive visualizations and custom workflow builders, and look at several best practices and design patterns for building apps. Finally, we'll share the vision, strategy and roadmap for the year ahead.

Sponsored by: Neo4j | Get Your Data AI-Ready: Knowledge Graphs & GraphRAG for GenAI Success

Beyond Simple RAG: Unlocking Quality, Scale and Cost-Efficient Retrieval With Mosaic AI Vector Search

2025-06-10 · Data + AI Summit 2025

talk

by Ankit Vij (Databricks), Adam Gurary (Databricks)

AI/ML Databricks

This session is repeated. Mosaic AI Vector Search is powering high-accuracy retrieval systems in production across a wide range of use cases, including RAG applications, entity resolution, recommendation systems and search. Fully integrated with the Databricks Data Intelligence Platform, it eliminates pipeline maintenance by automatically syncing data from source to index. Over the past year, customers have asked for greater scale, better quality out-of-the-box and cost-efficient performance. This session delivers on those needs, showcasing best practices for implementing high-quality retrieval systems and revealing major product advancements that improve scalability, efficiency and relevance. What you’ll learn: how to optimize Vector Search with hybrid retrieval and reranking for better out-of-the-box results; best practices for managing vector indexes with minimal operational overhead; and real-world examples of how organizations have scaled and improved their search and recommendation systems.
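For readers unfamiliar with hybrid retrieval, the idea is to merge a keyword-based ranking and a vector-based ranking into one result list. The sketch below uses reciprocal rank fusion (RRF), one common way to do this; it is an illustration only and is not claimed to be how Mosaic AI Vector Search combines results. The function name, the document IDs and the constant `k=60` (a conventional default for RRF) are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Higher total score ranks first; ties are
    broken by document ID so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["d1", "d2", "d3"]
vector_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['d1', 'd3', 'd2', 'd4']
```

Documents that appear in both lists (`d1`, `d3`) rise above those found by only one retriever. A reranking step, if used, would then reorder the top of this fused list with a more expensive relevance model.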

Sponsored by: Firebolt | The Power of Low-latency Data for AI Apps

Composing High-Accuracy AI Systems With SLMs and Mini-Agents

2025-06-10 · Data + AI Summit 2025

talk

by Sharon Zhou (Lamini)

AI/ML LLM SQL

This session is repeated. For most companies, building compound AI systems remains aspirational. LLMs are powerful but imperfect, and their non-deterministic nature makes steering them to high accuracy a challenge. In this session, we’ll demonstrate how to build compound AI systems using SLMs and highly accurate mini-agents that can be integrated into agentic workflows. You'll learn about two breakthrough techniques: memory RAG, an embedding algorithm that reduces hallucinations by using embed-time compute to generate contextual embeddings, improving indexing and retrieval; and memory tuning, a finetuning algorithm that reduces hallucinations by using a Mixture of Memory Experts (MoME) to specialize models with proprietary data. We’ll also share real-world examples (text-to-SQL, factual reasoning, function calling, code analysis and more) across various industries. With these building blocks, we’ll demonstrate how to create high-accuracy mini-agents that can be composed into larger AI systems.

Advanced RAG Overview — Thawing Your Frozen RAG Pipeline

2025-06-10 · Data + AI Summit 2025

talk

by James Lin (Experian), Jason Li (Experian)

Data Lakehouse Databricks

Most RAG systems in use today are frozen: they rely on a single embedding model and a single vector index. We’ve achieved a modicum of success with that design, but it can only go so far in increasing accuracy for production systems. In this session we will explore how to move from frozen systems to adaptive RAG systems, which produce more tailored outputs with higher accuracy. Databricks services: Lakehouse, Unity Catalog, Mosaic, Sweeps, Vector Search, Agent Evaluation, Managed Evaluation, Inference Tables

Gen AI Evaluation and Governance

2025-06-10 · Data + AI Summit 2025

talk

AI/ML Databricks GenAI Cyber Security Vector DB

This course introduces learners to evaluating and governing GenAI (generative artificial intelligence) systems. First, learners will explore the meaning behind and motivation for building evaluation and governance/security systems. Next, the course will connect evaluation and governance systems to the Databricks Data Intelligence Platform. Third, learners will be introduced to a variety of evaluation techniques for specific components and types of applications. Finally, the course will conclude with an analysis of evaluating entire AI systems with respect to performance and cost.

Pre-requisites: Familiarity with prompt engineering, experience with the Databricks Data Intelligence Platform, and knowledge of retrieval-augmented generation (RAG) techniques, including data preparation, embeddings, vectors, and vector databases.

Labs: Yes

Certification Path: Databricks Certified Generative AI Engineer Associate

Gen AI Application Development

2025-06-09 · Data + AI Summit 2025

talk

AI/ML Databricks GenAI LLM NLP Vector DB

This course provides participants with information and practical experience in building advanced LLM (Large Language Model) applications using multi-stage reasoning LLM chains and agents. In the initial section, participants will learn how to decompose a problem into its components and select the most suitable model for each step to enhance business use cases. Following this, participants will construct a multi-stage reasoning chain utilizing LangChain and HuggingFace transformers. Finally, participants will be introduced to agents and will design an autonomous agent using generative models on Databricks.

Pre-requisites: Solid understanding of natural language processing (NLP) concepts, familiarity with prompt engineering and prompt engineering best practices, experience with the Databricks Data Intelligence Platform, and experience with retrieval-augmented generation (RAG) techniques, including data preparation, building RAG architectures, and concepts like embeddings, vectors, and vector databases.

Labs: Yes

Certification Path: Databricks Certified Generative AI Engineer Associate

Gen AI Solution Development

2025-06-09 · Data + AI Summit 2025

talk

AI/ML Databricks GenAI Vector DB

This course is designed to introduce participants to contextual GenAI (generative artificial intelligence) solutions using the retrieval-augmented generation (RAG) method. First, participants will be introduced to the RAG architecture and the significance of contextual information using Mosaic AI Playground. Next, the course will demonstrate how to prepare data for GenAI solutions and connect this process with building a RAG architecture. Finally, participants will explore concepts related to context embedding, vectors, vector databases, and the use of the Mosaic AI Vector Search product.

Pre-requisites: Familiarity with embeddings, prompt engineering best practices, and experience with the Databricks Data Intelligence Platform.

Labs: Yes

Certification Path: Databricks Certified Generative AI Engineer Associate

Sponsored by: Monte Carlo | The Illusion of Done: Why the Real Work for AI Starts in Production
