talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

715

Sessions & talks

Showing 176–200 of 715 · Newest first

MLOps That Ships: Accelerating AI Deployment at Vizient

2025-06-11 Watch
talk
Adam Hasham (Vizient), Ram Radhakrishnan (Vizient Inc.)

Deploying AI models efficiently and consistently is a challenge many organizations face. This session will explore how Vizient built a standardized MLOps stack using Databricks and Azure DevOps to streamline model development, deployment and monitoring. Attendees will gain insights into how Databricks Asset Bundles were leveraged to create reproducible, scalable pipelines and how Infrastructure-as-Code principles accelerated onboarding for new AI projects. The talk will cover: end-to-end MLOps stack setup, ensuring efficiency and governance; CI/CD pipeline architecture, automating model versioning and deployment; standardizing AI model repositories, reducing development and deployment time; and lessons learned, including challenges and best practices. By the end of this session, participants will have a roadmap for implementing a scalable, reusable MLOps framework that enhances operational efficiency across AI initiatives.

Scaling Data Engineering Pipelines: Preparing Credit Card Transactions Data for Machine Learning

2025-06-11 Watch
talk
Luke Garzia (Mastercard), Brandon DeShon (Mastercard)

We discuss two real-world use cases in big data engineering, focusing on constructing stable pipelines and managing storage at petabyte scale. The first use case highlights the implementation of Delta Lake to optimize data pipelines, resulting in an 80% reduction in query time and a 70% reduction in storage space. The second use case demonstrates the effectiveness of the Workflows ‘ForEach’ operator in executing compute-intensive pipelines across multiple clusters, reducing processing time from months to days. This approach involves a reusable design pattern that isolates notebooks into units of work, enabling data scientists to test and develop independently.
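
The "units of work" pattern mentioned above can be illustrated locally. This is a minimal sketch, not Mastercard's implementation and not the Databricks Workflows API: `process_partition`, `run_for_each` and the month keys are hypothetical names, and a thread pool stands in for the ForEach task fanning work out across clusters.

```python
from concurrent.futures import ThreadPoolExecutor


def process_partition(month: str) -> dict:
    """One unit of work. Everything it needs arrives as a parameter,
    so it can be tested on its own with a single input."""
    # Placeholder for the real transformation (hypothetical).
    return {"month": month, "status": "ok"}


def run_for_each(inputs, unit, max_concurrency=4):
    """Apply the same unit of work to every input, a few at a time.
    Results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(unit, inputs))


months = ["2024-01", "2024-02", "2024-03"]  # hypothetical partition keys
results = run_for_each(months, process_partition)
```

Because the unit of work takes only explicit parameters, a data scientist can call `process_partition("2024-01")` directly while developing, and the orchestration layer only decides how many copies run at once.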

Scaling Success: How Banks are Unlocking Growth With Data and AI

2025-06-11 Watch
talk
Tony Qui (EY), David Sabow (HSBC), Ricardo Portilla (Databricks), Felipe Cobucci (PicPay), Chris D’Agostino (FIS Global)

Growth in banking isn’t just about keeping pace; it’s about setting the pace. This session explores how leading banks leverage Databricks’ Data Intelligence Platform to uncover new revenue opportunities, deepen customer relationships, and expand market reach. Hear from industry leaders who have transformed their growth strategies by harnessing the power of advanced analytics and machine learning. Learn how personalized customer experiences, predictive insights and unified data platforms are driving innovation and helping banks scale faster than ever. Key takeaways: proven strategies for identifying untapped growth opportunities using data-driven approaches; real-world examples of banks creating personalized customer journeys that boost retention and loyalty; and tools and techniques to accelerate innovation while maintaining operational efficiency. Join us to discover how data intelligence is redefining growth in banking and helping banks thrive through uncertainty.

Schiphol Group’s Transformation to Unity Catalog

2025-06-11 Watch
talk
Suvadeep Sinha (Databricks), Jelle Katsman (Royal Schiphol Group), Shasidhar Eranti (Databricks)

Discover how Europe’s third-busiest airport, Schiphol Group, is elevating its data operations by transitioning from a standard Databricks setup to the advanced capabilities of Unity Catalog. In this session, we will share the motivations, obstacles and strategic decisions behind executing a seamless migration in a large-scale environment — one that spans hundreds of workspaces and demands continuous availability. Gain insights into planning and governance, learn how to safeguard data integrity and maintain operational flow, and understand the process of integrating Unity Catalog’s enhanced security and governance features. Attendees will leave with practical lessons from our hands-on experience, proven methods for similar migrations, and a clear perspective on the benefits this transition offers for complex, rapidly evolving organizations.

Sponsored by: Accenture & Avanade | How data strategy powers mission-critical work at the Gates Foundation

2025-06-11 Watch
talk
Brice Jaggars (Avanade), Thushan Wijesinghe (Gates Foundation)

There’s never been a more critical time to ensure data and analytics foundations can deliver the value and efficiency needed to accelerate and scale AI. What are the most difficult challenges that organizations face with data transformation, and what technologies, processes and decisions overcome these barriers to success? Join this session featuring executives from the Gates Foundation, the nonprofit leading change in communities around the globe, and Avanade, the joint venture between Accenture and Microsoft, in a discussion about impactful data strategy. Learn about the Gates Foundation’s approach to its enterprise data platform to ensure trusted insights at the speed of today’s business. We’ll also share lessons learned from Avanade’s work helping organizations around the globe build with Databricks and seize the AI opportunity.

Sponsored by: Deloitte | Analyzing Geospatial Data at Scale in Databricks for Environment & Agriculture

2025-06-11 Watch
talk
Luke Teacy (Deloitte)

Analyzing geospatial data has become a cornerstone of tackling many of today’s pressing challenges, from climate change to resource management. However, storing and processing such data can be complex and hard to scale using common GIS packages. This talk explores how Deloitte and Databricks enable horizontally scalable geospatial analysis using Delta Lake, H3 integration and support for geospatial vector and raster data. We demonstrate how we have leveraged these capabilities for real-world applications in environmental monitoring and agriculture. In doing so, we cover end-to-end processing from ingestion, transformation and analysis to production of geospatial data products accessible by scientists and decision makers through standard GIS tools.

Sponsored by: KPMG | Enhancing Regulatory Compliance through Data Quality and Traceability

2025-06-11 Watch
talk
Thomas Haslam (KPMG)

In highly regulated industries like financial services, maintaining data quality is an ongoing challenge. Reactive measures often fail to prevent regulatory penalties, causing inaccuracies in reporting and inefficiencies due to poor data visibility. Regulators closely examine the origins and accuracy of reporting calculations to ensure compliance. A robust system for data quality and lineage is crucial. Organizations are utilizing Databricks to proactively improve data quality through rules-based and AI/ML-driven methods. This fosters complete visibility across IT, data management, and business operations, facilitating rapid issue resolution and continuous data quality enhancement. The outcome is quicker, more accurate, transparent financial reporting. We will detail a framework for data observability and offer practical examples of implementing quality checks throughout the data lifecycle, specifically focusing on creating data pipelines for regulatory reporting.

Sponsored by: LTIMindtree | 4 Strategies to Maximize SAP Data Value with Databricks and AI

2025-06-11 Watch
talk
Benjamin Mathew (Databricks), Manas Ranjan Nayak (LTIMindtree)

As enterprises strive to become more data-driven, SAP continues to be central to their operational backbone. However, traditional SAP ecosystems often limit the potential of AI and advanced analytics due to fragmented architectures and legacy tools. In this session, we explore four strategic options for unlocking greater value from SAP data by integrating with Databricks and cloud-native platforms. Whether you're on ECC, S/4HANA, or transitioning from BW, learn how to modernize your data landscape, enable real-time insights, and power AI/ML at scale. Discover how SAP Business Data Cloud and SAP Databricks can help you build a unified, future-ready data and analytics ecosystem without compromising on scalability, flexibility, or cost-efficiency.

Sponsored by: Monte Carlo | Cleared for Takeoff: How American Airlines Builds Data Trust

2025-06-11 Watch
talk
Andrew Machen (American Airlines), Shane Murray (Monte Carlo)

American Airlines, one of the largest airlines in the world, processes a tremendous amount of data every single minute. With a data estate of this scale, accountability for the data goes beyond the data team; the business organization has to be equally invested in championing the quality, reliability, and governance of data. In this session, Andrew Machen, Senior Manager, Data Engineering at American Airlines, will share how his team maximizes resources to deliver reliable data at scale. He'll also outline his strategy for aligning business leadership with an investment in data reliability, and how leveraging Monte Carlo's data + AI observability platform enabled his team to reduce time spent resolving data reliability issues from 10 weeks to 2 days, saving millions of dollars and driving valuable trust in the data.

Stop Guessing, Spend Where It Counts: Data-Driven Decisions for High-Impact Investments on Databricks

2025-06-11 Watch
talk
Clara MacAvoy (Databricks), Bruce Wong (Databricks)

Struggling with runaway cloud costs as your organization grows? Join us for an inside look at how Databricks’ own Data Platform team tackled escalating spend in some of the world’s largest workspaces — saving millions of dollars without sacrificing performance or user experience. We’ll share how we harnessed powerful features like System Tables, Workflows, Unity Catalog, and Photon to monitor and optimize resource usage, all while using data-driven decisions to improve efficiency and ensure we invest in the areas that truly drive business impact. You’ll hear about the real-world challenges we faced balancing governance with velocity and discover the custom tooling and best practices we developed to keep costs in check. By the end of this session, you’ll walk away with a proven roadmap for leveraging Databricks to control cloud spend at scale.

The AI Regulation Dilemma: Spur Innovation, or Guardrails? — Where Are We and the Impact of Trump 2

2025-06-11 Watch
talk
Scott Starbird (Databricks)

The Trump 2 AI agenda prioritizes US AI leadership by opposing AI regulation on bias and frontier AI risks, favoring innovation and AI expansion. With comprehensive federal AI regulation unlikely, states are advancing AI laws addressing bias, harmful content, transparency, frontier model risk and other risks. Meanwhile, the EU AI Act effectively imposes global obligations. The emerging patchwork of state rules will burden US companies more than would a unified federal approach, seemingly undermining White House deregulatory goals. So, ironically, the Trump team AI agenda may accelerate disparate state-level regulation and impede AI innovation. US companies therefore face a fragmented landscape similar to privacy regulation where the EU AI Act — in the role of GDPR — has set the stage, and the states are asserting themselves with various incremental requirements. Other recent developments covered will include the finalization of the EU GPAI Code of Practice, certain newly enacted state laws, and a quick overview of AI regulation outside the U.S. and EU.

The Full Stack of Innovation: Building Data and AI Products With Databricks Apps

2025-06-11 Watch
talk
Giran Moodley (Databricks), Ivan Trusov (Databricks)

In this deep-dive technical session, Ivan Trusov (Sr. SSA @ Databricks) and Giran Moodley (SA @ Databricks) will explore the full-stack development of Databricks Apps, covering everything from frameworks to deployment. We’ll walk through essential topics, including: frameworks and tooling (Pythonic options such as Dash, Streamlit and Gradio vs. a JS + Python stack); development lifecycle (debugging, issue resolution and best practices); testing (unit, integration and load testing strategies); CI/CD and deployment (automating with Databricks Asset Bundles); and monitoring and observability (OpenTelemetry, metrics collection and analysis). Expect a highly practical session with several live demos, showcasing the development loop, testing workflows and CI/CD automation. Whether you’re building internal tools or AI-powered products, this talk will equip you with the knowledge to ship robust, scalable Databricks Apps.

Use External Models in Databricks: Connecting to Azure, AWS, Google Cloud, Anthropic and More

2025-06-11 Watch
talk
Ina Koleva (Databricks)

In this session you will learn how to leverage a wide set of GenAI models in Databricks, including external connections to cloud vendors and other model providers. We will cover establishing connections to externally served models via Mosaic AI Gateway, showcasing connections to Azure, AWS and Google Cloud models, as well as to model vendors like Anthropic, Cohere, AI21 Labs and more. You will also discover best practices for model comparison, governance and cost control on those model deployments.

Summit Live: OLTP for the Lakehouse

2025-06-11 Watch
talk
Dave Nettleton (Databricks)

Analytical and operational use cases are starting to converge, and AI-assisted applications are accelerating the trend. Most applications require a transactional, OLTP database to power data. Hear from a Databricks expert on the latest developments and our strategy for operational data integrated into the lakehouse.

Building AI models of human cell: Tahoe Therapeutics on Databricks

2025-06-11 Watch
lightning_talk
Nima Alidoust (Tahoe Therapeutics)

Discover how Tahoe Therapeutics (formerly Vevo) is generating gigascale single-cell data that map how drugs interact with cells from cancer patients. They use that data to find better therapeutics and to build AI models on Databricks that can predict drug-patient interactions. Their technology enabled the landmark Tahoe-100M atlas, the world’s largest dataset of drug responses, profiling 100 million cells across 60,000 conditions. Learn how we use Databricks to process this massive data, enabling AI models that predict drug efficacy and resistance at the cellular level. Recognized as the Grand Prize Winner of the Databricks Generative AI Startup Challenge, Tahoe sets a new standard for scalable, data-driven drug discovery.

Exploring Data and AI With Databricks Community Edition

2025-06-11 Watch
lightning_talk
Will Valori (Databricks)

Join this session to see how you can get started with data and AI using Databricks Community Edition—free, and built for learners like you. You’ll get a first look at a unified environment where you can work with professional-grade tools to load and explore data, build notebooks, and train simple models.

Fine-Grained Access Control for Unstructured Data With Volume Permissions

2025-06-11 Watch
lightning_talk
Adrian Ionescu (Databricks), Lianne Zelsman (Databricks)

Unstructured data — images, documents, videos and more — is growing in importance with AI and ML. Yet managing access control at scale is challenging. Unity Catalog Volumes offer a secure foundation, but access control has remained volume-level until now. This session introduces Volume Path Permissions, a new feature enabling fine-grained access within volumes. Expanding on Unity Catalog’s robust permission model, they let you grant privileges to users and groups based on path prefixes. We’ll cover the governance model, share examples and demonstrate how to enforce least-privilege access. By the end, you’ll know how to manage file-level access with Unity Catalog’s flexibility and control.
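
The idea of granting privileges by path prefix can be sketched in a few lines. This is an illustrative model only, not the Unity Catalog API or its actual evaluation rules: the `PathGrant` type, the `READ` privilege name, the group names and the volume paths are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class PathGrant:
    """Hypothetical grant: `principal` holds `privilege` under `prefix`."""
    principal: str
    privilege: str
    prefix: str


def is_under(path: str, prefix: str) -> bool:
    """True if `path` equals `prefix` or lies beneath it.
    Comparison is by path component, so '/a/b2' is not under '/a/b'.
    Note: '..' segments are not resolved here; a real system must normalize."""
    path_parts = PurePosixPath(path).parts
    prefix_parts = PurePosixPath(prefix).parts
    return path_parts[: len(prefix_parts)] == prefix_parts


def is_allowed(grants, principals, privilege, path) -> bool:
    """Allow if any grant held by one of the caller's principals covers the path."""
    return any(
        g.principal in principals
        and g.privilege == privilege
        and is_under(path, g.prefix)
        for g in grants
    )


# Hypothetical grants on a hypothetical volume.
grants = [
    PathGrant("radiology-team", "READ", "/Volumes/main/imaging/scans/radiology"),
    PathGrant("ml-engineers", "READ", "/Volumes/main/imaging/scans"),
]
```

With these grants, a member of `radiology-team` can read files under `scans/radiology` but not under `scans/cardiology`, which is the least-privilege behavior the abstract describes. Matching on whole path components rather than raw string prefixes avoids accidentally covering a sibling directory such as `scans/radiology2`.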

Generative AI Merchant Matching

2025-06-11 Watch
lightning_talk
Tomáš Drietomský (Mastercard)

Our project demonstrates building enterprise AI systems cost-effectively, focusing on matching merchant descriptors to known businesses. Using fine-tuned LLMs and advanced search, we created a solution rivaling alternatives at minimal cost. The system works in three steps: (1) a fine-tuned Llama 3 8B model parses merchant descriptors into standardized components; (2) a hybrid search system uses these components to find candidate matches in our database; (3) a Llama 3 70B model then evaluates top candidates, with an AI judge reviewing results for hallucination. We achieved a 400% latency improvement while maintaining accuracy and keeping costs low, with each fine-tuning round costing hundreds of dollars. Through careful optimization and a simple architecture that balances cost, speed and accuracy, we show that small teams with modest budgets can tackle complex problems effectively using this technology. We share key insights on prompt engineering, fine-tuning and cost and latency management.
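
The parse, search and evaluate flow described above can be sketched with simple stand-ins. This is not the speakers' system: a regular expression replaces the fine-tuned Llama 3 8B parser, fuzzy string similarity replaces the vector half of the hybrid search, and a score threshold replaces the Llama 3 70B evaluator and AI judge. The merchant list, function names, weights and threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Made-up reference data; the real system searches a merchant database.
KNOWN_MERCHANTS = ["Starbucks", "Shell", "Walmart Supercenter", "Star Market"]


def parse_descriptor(raw: str) -> list[str]:
    """Step 1 stand-in: strip digits and punctuation, lowercase, tokenize."""
    return re.sub(r"[^A-Za-z ]+", " ", raw).lower().split()


def hybrid_score(tokens: list[str], candidate: str) -> float:
    """Step 2 scoring: average of keyword overlap and fuzzy similarity.
    Fuzzy similarity is an assumed substitute for embedding similarity."""
    cand_tokens = candidate.lower().split()
    keyword = len(set(tokens) & set(cand_tokens)) / len(cand_tokens)
    fuzzy = SequenceMatcher(None, " ".join(tokens), candidate.lower()).ratio()
    return 0.5 * keyword + 0.5 * fuzzy


def find_candidates(tokens: list[str], top_k: int = 2) -> list[str]:
    """Step 2: return the top_k merchants by hybrid score."""
    ranked = sorted(KNOWN_MERCHANTS, key=lambda m: hybrid_score(tokens, m), reverse=True)
    return ranked[:top_k]


def judge(tokens: list[str], candidates: list[str], threshold: float = 0.6):
    """Step 3 stand-in: accept the best candidate only if its score clears
    the threshold; otherwise report no match instead of guessing."""
    best = candidates[0]
    return best if hybrid_score(tokens, best) >= threshold else None


def match_merchant(raw: str):
    tokens = parse_descriptor(raw)
    return judge(tokens, find_candidates(tokens))
```

Calling `match_merchant("STARBUCKS #1234 SEATTLE WA")` walks a raw descriptor through all three stages. Keeping the stages as separate functions mirrors the structure in the abstract, where a small model does cheap parsing and the expensive evaluation runs only on a short candidate list.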

How Serverless Empowered Nationwide to Build Cost-Efficient and World Class BI

2025-06-11 Watch
lightning_talk
Ananya Ghosh (Nationwide)

Databricks’ Serverless compute streamlines infrastructure setup and management, delivering unparalleled performance and cost optimization for Data and BI workflows. In this presentation, we will explore how Nationwide is leveraging Databricks’ serverless technology and unified governance through Unity Catalog to build scalable, world-class BI solutions. Key features like AI/BI Dashboards, Genie, Materialized Views, Lakehouse Federation and Lakehouse Apps, all powered by serverless, have empowered business teams to deliver faster, scalable and smarter insights. We will show how Databricks’ serverless technology is enabling Nationwide to unlock new levels of efficiency and business impact, and how other organizations can adopt serverless technology to realize similar benefits.

Race to Real-Time: Low-Latency Streaming ETL Meets Next-Gen Databricks OLTP-DB

2025-06-11 Watch
lightning_talk
Irfan Elahi (Databricks)

In today’s digital economy, real-time insights and rapid responsiveness are paramount to delivering exceptional user experiences and lowering TCO. In this session, discover a pioneering approach that leverages a low-latency streaming ETL pipeline built with Spark Structured Streaming and Databricks’ new OLTP-DB, a serverless, managed Postgres offering designed for transactional workloads. Validated in a live customer scenario, this architecture achieves sub-2-second end-to-end latency by seamlessly ingesting streaming data from Kinesis and merging it into OLTP-DB. This breakthrough not only enhances performance and scalability but also provides a replicable blueprint for transforming data pipelines across various verticals. Join us as we delve into the advanced optimization techniques and best practices that underpin this innovation, demonstrating how Databricks’ next-generation solutions can revolutionize real-time data processing and unlock a myriad of new use cases in the data landscape.

Spark Right-Sizing: Saving Thousands of PBHrs of Compute at LinkedIn

2025-06-11 Watch
lightning_talk
Shreyesh Arangath (LinkedIn)

At LinkedIn, we manage over 400,000 daily Spark applications consuming 200+ PBHrs of compute daily. To address the challenges posed by manual configuration of Spark's memory tuning options, which led to low memory utilization and frequent OOM errors, we developed an automated Spark executor memory right-sizing system. Our approach, utilizing a policy-based system with nearline and real-time feedback loops, automates memory tuning, leading to more efficient resource allocation, improved user productivity and increased job reliability. By leveraging historical data and real-time error classification, we dynamically adjust memory, significantly narrowing the gap between allocated and utilized resources while reducing failures. This initiative has achieved a 13% increase in memory utilization and a 90% drop in OOM-related job failures, saving us 1000s of PBHrs of compute every year.
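
The two feedback loops mentioned above, one driven by historical usage and one by real-time error classification, can be illustrated with a toy policy. The abstract does not describe LinkedIn's actual rules, so everything here is an assumption: the function names, the 20% headroom, the 1.5x bump on OOM, and the floor and ceiling values.

```python
def recommend_memory_mb(peak_usage_mb, headroom_pct=20, floor_mb=1024, ceiling_mb=65536):
    """Nearline loop (assumed policy): size executor memory from the highest
    observed peak plus headroom, clamped to a floor and ceiling."""
    if not peak_usage_mb:
        return None  # no history yet: leave the existing setting alone
    # Integer arithmetic with rounding up, to avoid float surprises.
    target = (max(peak_usage_mb) * (100 + headroom_pct) + 99) // 100
    return min(max(target, floor_mb), ceiling_mb)


def on_failure(current_mb, error_class, bump_pct=150, ceiling_mb=65536):
    """Real-time loop (assumed policy): only failures classified as OOM
    raise memory for the retry; other failures leave it unchanged."""
    if error_class != "OOM":
        return current_mb
    return min((current_mb * bump_pct + 99) // 100, ceiling_mb)
```

For example, `recommend_memory_mb([3000, 3500, 4000])` sizes the executor just above the worst observed peak instead of relying on a manually chosen value, and `on_failure(4800, "OOM")` raises it for the retry. Classifying the error first matters because raising memory after a failure in user code would waste resources without fixing anything.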

Sponsored by: Cognizant | How Cognizant Helped RJR Transform Market Intelligence with GenAI

2025-06-11 Watch
lightning_talk
Vijay Gandapodi (Reynolds American)

Cognizant developed a GenAI-driven market intelligence chatbot for RJR using Dash UI. This chatbot leverages Databricks Vector Search for vector embeddings and semantic search, along with the DBRX-Instruct LLM, to provide accurate and contextually relevant responses to user queries. The implementation involved loading prepared metadata into a Databricks vector database using the GTE model to create vector embeddings, indexing these embeddings for efficient semantic search, and integrating the DBRX-Instruct LLM into the chat system with prompts to guide it in understanding and responding to user queries. The chatbot also generated responses containing URL links to dashboards with requested numerical values, enhancing user experience and productivity by reducing report navigation and discovery time by 30%. This project stands out due to its innovative AI application, advanced reasoning techniques, user-friendly interface, and seamless integration with MicroStrategy.

Sponsored by: Infosys | Beyond Hype: Scale & Democratize Agentic AI across enterprise to realize business outcomes.

2025-06-11 Watch
lightning_talk
Rajan Padmanabhan (Infosys)

Agentic AI and multimodal data are the next frontiers for realizing intelligent and autonomous business systems. Learn how Infosys innovates with Databricks to accelerate the data-to-AI-agent journey at scale across an enterprise. Hear about our pragmatic, capability-driven approach (rather than a use-case-based one) to bringing the data universe, AI foundations, agent management, data and AI governance and collaboration under unified management.

Sponsored by: Salesforce | Getting down to viz-ness: How Databricks + Tableau supports your organization’s data & analytics needs

2025-06-11
lightning_talk
Aaron Frein (Tableau)

The explosion of AI has helped make the enterprise data landscape more important, and complex, than ever before. Join us to learn how Databricks’ and Tableau’s platforms come together to empower users of all kinds to see, understand, and act on your data in a secure, governed, and performant way.

Sponsored by: EY | Women in Data + AI

2025-06-11
talk
Lisa Cohen (Anthropic), Ellen Sulcs (T-Mobile), Kate Ostbye (Pfizer), Barbara Latulippe (Takeda Pharmaceuticals - USA), Robin Sutara (Databricks), Traci Gusher (E&Y (HQ))

How do top leaders stay ahead in a field moving at lightning speed? Join us for a dynamic panel featuring senior women leaders in data and AI as they share reflections on their inspiring career journeys and how they’re navigating the next wave of technological transformation. These leaders are charting the course for how AI and other emerging technologies are reshaping entire industries. The conversation will offer both practical insights and personal reflections—followed by a networking reception designed to connect and further build this global community. Sponsored by EY.