talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

178

Filtering by: Analytics

Sessions & talks

Showing 1–25 of 178 · Newest first

Daft and Unity Catalog: A Multimodal/AI-Native Lakehouse

2025-06-12
talk
Jay Chia (Eventual)

Modern data organizations have moved beyond big data analytics to also incorporate advanced AI/ML data workloads. These workflows often involve multimodal datasets containing documents, images, long-form text, embeddings, URLs and more. Unity Catalog is an ideal solution for organizing and governing this data at scale. When paired with the Daft open source data engine, you can build a truly multimodal, AI-ready data lakehouse. In this session, we’ll explore how Daft integrates with Unity Catalog’s core features (such as volumes and functions) to enable efficient, AI-driven data lakehouses. You will learn how to ingest and process multimodal data (images, text and videos), run AI/ML transformations and feature extractions at scale, and maintain full control and visibility over your data with Unity Catalog’s fine-grained governance.

Iceberg Geo Type: Transforming Geospatial Data Management at Scale

2025-06-12
talk
Szehon Ho (Databricks), Jia Yu (Wherobots Inc.)

The Apache Iceberg™ community is introducing native geospatial type support, addressing key challenges in managing geospatial data at scale, including fragmented formats and inefficiencies in storing large spatial datasets. This talk will delve into the origins of the Iceberg geo type, its specification design and future goals. We will examine its impact on both the geospatial and Iceberg communities: it introduces a standard data warehouse storage layer to the geospatial community and enables optimized geospatial analytics for Iceberg users. We will also present a live demonstration of the Iceberg geo data type with Apache Sedona™ and Apache Spark™, showcasing how it simplifies and accelerates geospatial analytics workflows and queries. Finally, we will provide an in-depth look at its current capabilities, outline the roadmap for future developments, and offer a perspective on its role in advancing geospatial data management in the industry.

Lakeflow Observability: From UI Monitoring to Deep Analytics

2025-06-12
talk
Saad Ansari (Databricks), Theresa Hammer (Databricks)

Monitoring data pipelines is key to reliability at scale. In this session, we’ll dive into the observability experience in Lakeflow, Databricks’ unified data engineering solution, from intuitive UI monitoring to advanced event analysis, cost observability and custom dashboards. We’ll walk through the revamped UX for Lakeflow observability, showing how to: monitor runs and task states, dependencies and retry behavior in the UI; set up alerts for job and pipeline outcomes and failures; use pipeline and job system tables for historical insights; explore run events and event logs for root cause analysis; analyze metadata to understand and optimize pipeline spend; and build custom dashboards using system tables to track performance, data quality, freshness, SLAs and failure trends, and drive automated alerting based on real-time signals. This session will help you unlock full visibility into your data workflows.

Latest Innovations in AI/BI Dashboards and Genie

2025-06-12
talk
Miranda Luna (Databricks), Chao Cai (Databricks)

Discover how the latest innovations in Databricks AI/BI Dashboards and Genie are transforming self-service analytics. This session offers a high-level tour of new capabilities that empower business users to ask questions in natural language, generate insights faster and make smarter decisions. Whether you're a long-time Databricks user or just exploring what's possible with AI/BI, you'll walk away with a clear understanding of how these tools are evolving — and how to leverage them for greater business impact.

Low-Emission Oil & Gas: Engineering the Balance Between Clean and Reliable

2025-06-12
talk
Krishanu Roy (bp), Jay Yoon (NOV), Srinivas Chandolu (BP), Ali Marzban (NOV)

Join two energy industry leaders as they showcase groundbreaking applications of AI and data solutions in modern oil and gas operations. NOV demonstrates how their Generative AI pipeline revolutionized drilling mud report processing, automating the analysis of 300 reports daily with near-perfect accuracy and real-time analytics capabilities. BP shares how Unity Catalog has transformed their enterprise-wide data strategy, breaking down silos while maintaining robust governance and security. Together, these case studies illustrate how AI and advanced analytics are enabling cleaner, more efficient energy operations while maintaining the reliability demanded by today's market.

Revolutionizing Insurance: How to Drive Growth and Innovation

2025-06-12
talk
Anindita Mahapatra (Databricks), Porter Orr (The Standard Insurance Company), Kranthi Nekkalapu (Suncorp), Adrien de Nazelle (Oliver Wyman)

The insurance industry is rapidly evolving as advances in data and artificial intelligence (AI) drive innovation, enabling more personalized customer experiences, streamlined operations, and improved efficiencies. With powerful data analytics and AI-driven solutions, insurers can automate claims processing, enhance risk management, and make real-time decisions. Leveraging insights from large and complex datasets, organizations are delivering more customer-centric products and services than ever before. Key takeaways: real-world applications of data and AI in claims automation, underwriting, and customer engagement; how predictive analytics and advanced data modeling help anticipate risks and meet customer needs; and personalization of policies, optimized pricing, and more efficient workflows for greater ROI. Discover how data and AI are fueling growth, improving protection, and shaping the future of the insurance industry!

Sponsored by: e6data, Inc. | Hybrid Lakehouses with Unity Governance, Local Execution and Egress Control

2025-06-12
lightning_talk
Vishnu Vasanth (e6data)

Data residency laws and legal mandates are driving the need for lakehouses across public and private clouds. This sprawl threatens centralized governance and compliance, while impacting cost, performance, and analytics/AI functionality. This session shows how e6data extends Unity Catalog across hybrid environments for consistent policy enforcement and query execution—regardless of data location—with guarantees around network egress, entitlements, performance, scalability, and cost. Learn how e6data’s “zero-data movement” philosophy powers a cost- and latency-optimized, location-aware architecture. We’ll cover onboarding strategies for hybrid fleets that enforce data movement restrictions and stay close to the data for better performance and lower cost. Discover how a location-aware compute strategy enables hybrid lakehouses with four key value metrics: cross-platform functionality, governed access, low latency, and total cost of ownership.

Accelerating Growth in Capital Markets: Data-Driven Strategies for Success

2025-06-12
talk
Bobby Grubert (RBC Capital Markets), Antoine Amend (Databricks), Raul Chavarria (B3 - Bolsa, Brasil e Balcão), Jimmy Kozlow (Northern Trust)

Growth in capital markets thrives on innovation, agility and real-time insights. This session highlights how leading firms use Databricks’ Data Intelligence Platform to uncover opportunities, optimize trading strategies and deliver personalized client experiences. Learn how advanced analytics and AI help organizations expand their reach, improve decision-making and unlock new revenue streams. Industry leaders share how unified data platforms break down silos, deepen insights and drive success in a fast-changing market. Key takeaways: predictive analytics and machine learning strategies for growth; real-world examples of optimized trading and enhanced client engagement; and tools to innovate while ensuring operational efficiency. Discover how data intelligence empowers capital markets firms to thrive in today’s competitive landscape!

Cutting Costs, Not Performance: Optimizing Databricks at Scale

2025-06-12
talk
Pedro Ferreira (NTT DATA)

As Databricks transforms data processing, analytics and machine learning, managing platform costs has become crucial for organizations aiming to maximize value while staying within budget. While Databricks offers unmatched scalability and performance, inefficient usage can lead to unexpected cost overruns. This presentation will explore common challenges organizations face in controlling Databricks costs and provide actionable best practices for optimizing resource allocation, preventing over-provisioning and eliminating underutilization. Drawing from NTT DATA’s experience, I'll share how we reduced Databricks costs by up to 50% through strategies like choosing the right compute resource, leveraging managed tables and using Unity Catalog features, such as system tables, to monitor consumption. Join this session to gain practical insights and tools that will empower your team to optimize Databricks without overspending.

Databricks Lakeflow: the Foundation of Data + AI Innovation for Your Industry

2025-06-12
talk
Sam Sawyer (Databricks), Ori Zohar (Databricks)

Every analytics, BI and AI project relies on high-quality data. This is why data engineering, the practice of building reliable data pipelines that ingest and transform data, is consequential to the success of these projects. In this session, we'll show how you can use Lakeflow to accelerate innovation in multiple parts of the organization. We'll review real-world examples of Databricks customers using Lakeflow in different industries such as automotive, healthcare and retail. We'll touch on how the foundational data engineering capabilities Lakeflow provides help power initiatives that improve customer experiences, make real-time decisions and drive business results.

Eliminate Hops in Your Streaming Architecture with Zerobus, Part of Lakeflow Connect

2025-06-12
talk
Victoria Bukta (Databricks), Nikola Obradovic (Databricks)

In this session, we’ll introduce Zerobus Direct Write API, part of Lakeflow Connect, which enables you to push data directly to your lakehouse and simplify ingestion for IoT, clickstreams, telemetry, and more. We’ll start with an overview of the ingestion landscape to date. Then, we'll cover how you can “shift left” with Zerobus, embedding data ingestion into your operational systems to make analytics and AI a core component of the business, rather than an afterthought. The result is a significantly simpler architecture that scales your operations, using this new paradigm to skip unnecessary hops. We'll also highlight one of our early customers, Joby Aviation, and how they use Zerobus. Finally, we’ll provide a framework to help you understand when to use Zerobus versus other ingestion offerings, and we’ll wrap up with a live Q&A so that you can hit the ground running with your own use cases.

IQVIA’s Serverless Journey: Enabling Data and AI in a Regulated World

2025-06-12
talk
Alex Esibov (Databricks), Matthew Schwartz (IQVIA)

Your data and AI use-cases are multiplying. At the same time, there is increased focus and scrutiny to meet sophisticated security and regulatory requirements. IQVIA utilizes serverless use-cases across data engineering, data analytics, and ML and AI, to empower their customers to make informed decisions, support their R&D processes and improve patient outcomes. By leveraging native controls on the platform, serverless enables them to streamline their use cases while maintaining a strong security posture, top performance and optimized costs. This session will go over IQVIA’s journey to serverless, how they met their security and regulatory requirements, and the latest and upcoming enhancements to the Databricks Platform.

Scaling AI/BI Genie: Best Practices for Curating and Managing Production Spaces

2025-06-12
talk
Shah Amini (Databricks), Hanlin Sun (Databricks)

Unlock Genie's full potential with best practices for curating, deploying and monitoring Genie spaces at scale. This session offers a deep dive into the latest enhancements and provides practical guidance on designing high-quality spaces, streamlining deployment workflows and implementing robust monitoring to ensure accuracy and performance in production. Ideal for teams aiming to scale conversational analytics, you’ll leave with actionable strategies to keep your Genie spaces efficient, reliable and aligned with business outcomes.

Tech Industry Session: Building Collaborative Ecosystems With Openness and Portability

2025-06-12
talk
Matthew Houser (Tealium), Bob Pisani (Addepar), Adrian Bolosan (Databricks), Davis Matson (Health Catalyst)

Join us to discover how leading tech companies accelerate growth using open ecosystems and built-on solutions to foster collaboration, accelerate innovation and create scalable data products. This session will explore how organizations use Databricks to securely share data, integrate with partners and enable teams to build impactful applications powered by AI and analytics. Topics include: using Delta Sharing for secure, real-time data collaboration across teams and partners; embedding analytics and creating marketplaces to extend product capabilities; and building with open standards and governance frameworks to ensure compliance without sacrificing agility. Hear real-world examples of how open ecosystems empower organizations to widen the aperture on collaboration, driving better business outcomes. Walk away with insights into how open data sharing and built-on solutions can help your teams innovate faster at scale.

What's New and What's Next: Building Impactful AI/BI Dashboards

2025-06-12
talk
Eason Gao (Databricks), Rory Jacobs (Databricks)

Ready to take your AI/BI dashboards to the next level? This session dives into the latest capabilities in Databricks AI/BI Dashboards and how to maximize impact across your organization. Learn how data authors can tailor visualizations for different audiences, optimize performance and seamlessly integrate with Genie for a unified analytics experience. We’ll also share practical tips on how business users and data teams can better collaborate — ensuring insights are accessible, actionable and aligned to business goals.

AI-Assisted BI: Everything You Need to Know

2025-06-12
lightning_talk
Chung Wu (Databricks), Alex Lichen (Databricks)

Explore how AI is transforming business intelligence and data analytics across the Databricks platform. This session offers a comprehensive overview of AI-assisted capabilities, from generating dashboards and visualizations to integrating Genie on dashboards for conversational analytics. Whether you’re a data engineer, analyst or BI developer, this session will equip you to leverage AI with BI for better, smarter decisions.

Sponsored by: Anomalo | Reconciling IoT, Policy, and Insurer Data to Deliver Better Customer Discounts

2025-06-12
lightning_talk
Michael Randall (Nationwide)

As insurers increasingly leverage IoT data to personalize policy pricing, reconciling disparate datasets across devices, policies, and insurers becomes mission-critical. In this session, learn how Nationwide transitioned from prototype workflows in Dataiku to a hardened data stack on Databricks, enabling scalable data governance and high-impact analytics. Discover how the team orchestrates data reconciliation across Postgres, Oracle, and Databricks to align customer driving behavior with insurer and policy data—ensuring more accurate, fair discounts for policyholders. With Anomalo’s automated monitoring layered on top, Nationwide ensures data quality at scale while empowering business units to define custom logic for proactive stewardship. We’ll also look ahead to how these foundations are preparing the enterprise for unstructured data and GenAI initiatives.

Summit Live: Data Sharing and Collaboration

2025-06-12
talk
Zaheera Valani (Databricks)

Hear more on the latest in data collaboration, which is paramount to unlocking business success. Delta Sharing is an open-source approach to share and govern data, AI models, dashboards, and notebooks across clouds and platforms - without the costly need for replication. Databricks Clean Rooms provide safe hosting environments for data collaboration across companies, also without the costly duplication of data. And the Databricks Marketplace is the open marketplace for all your data, analytics, and AI needs.

Bridging BI Tools: Deep Dive Into AI/BI Dashboards for Power BI Practitioners

2025-06-12
talk
Marius-Cristian Panga (Databricks), Wasim Ahmad (Databricks)

In the rapidly evolving field of data analytics, AI/BI dashboards and Power BI stand out as two formidable approaches, each offering unique strengths and catering to specific use cases. Power BI has earned its reputation for delivering user-friendly, highly customisable visualisations and reports for data analysis. AI/BI dashboards, on the other hand, have gained traction due to their seamless integration with the Databricks platform, making them an attractive option for data practitioners. This session will compare the two tools, highlighting their respective features, strengths and potential limitations. Understanding the nuances between these tools is crucial for organizations aiming to make informed decisions about their data analytics strategy. This session will equip participants with the knowledge needed to select the most appropriate tool or combination of tools to meet their data analysis requirements and drive data-informed decision-making processes.

Databricks in Action: Azure’s Blueprint for Secure and Cost-Effective Operations

2025-06-12
talk
Oliver Schluga (Erste Group), Vukola Milenkovic (Erste Group)

Erste Group's transition to Azure Databricks marked a significant upgrade from a legacy system to a secure, scalable and cost-effective cloud platform. The initial architecture, characterized by a complex hub-spoke design and stringent compliance regulations, was replaced with a more efficient solution. The phased migration addressed high network costs and operational inefficiencies, resulting in a 60% reduction in networking costs and a 30% reduction in compute costs for the central team. This transformation, completed over a year, now supports real-time analytics, advanced machine learning and GenAI while ensuring compliance with European regulations. The new platform features Unity Catalog, separate data catalogs and dedicated workspaces, demonstrating a successful shift to a cloud-based machine learning environment with significant improvements in cost, performance and security.

From 10 Hours to 10 Minutes: Unleashing the Power of Lakeflow Declarative Pipelines

2025-06-12
talk
Sidney Cardoso (Michelin), Yash Joshi (Accenture)

How do you transform a data pipeline from sluggish 10-hour batch processing into a real-time powerhouse that delivers insights in just 10 minutes? This was the challenge we tackled at one of France's largest manufacturing companies, where data integration and analytics were mission-critical for supply chain optimization. Power BI dashboards needed to refresh every 15 minutes, but our team struggled with legacy Azure Data Factory batch pipelines. These outdated processes couldn’t keep up, delaying insights and generating up to three daily incident tickets. We identified Lakeflow Declarative Pipelines and Databricks SQL as the game-changing solution to modernize our workflow, implement quality checks, and reduce processing times. In this session, we’ll dive into the key factors behind our success: pipeline modernization with Lakeflow Declarative Pipelines, improving scalability; data quality enforcement, for clean, reliable datasets; and seamless BI integration, using Databricks SQL to power fast, efficient queries in Power BI.

Get the Most of Your Delta Lake

2025-06-12
lightning_talk
Youssef Mrini (Databricks)

Unlock the full potential of Delta Lake, the open-source storage framework for Apache Spark, with this session focused on its latest and most impactful features. Discover how capabilities like Time Travel, Column Mapping, Deletion Vectors, Liquid Clustering, UniForm interoperability, and Change Data Feed (CDF) can transform your data architecture. Learn not just what these features do, but when and how to use them to maximize performance, simplify data management, and enable advanced analytics across your lakehouse environment.

Leveling Up Gaming Analytics: How Supercell Evolved Player Experiences With Snowplow and Databricks

2025-06-12
lightning_talk
Alex Dean (Snowplow)

In the competitive gaming industry, understanding player behavior is key to delivering engaging experiences. Supercell, creators of Clash of Clans and Brawl Stars, faced challenges with fragmented data and limited visibility into user journeys. To address this, they partnered with Snowplow and Databricks to build a scalable, privacy-compliant data platform for real-time insights. By leveraging Snowplow’s behavioral data collection and Databricks’ Lakehouse architecture, Supercell achieved: cross-platform data unification (a unified view of player actions across web, mobile and in-game); real-time analytics (streaming event data into Delta Lake for dynamic game balancing and engagement); scalable infrastructure (supporting terabytes of data during launches and live events); and AI and ML use cases (churn prediction and personalized in-game recommendations). This session explores Supercell’s data journey and AI-driven player engagement strategies.

Optimizing Smart Meter IIoT Data in Databricks for At-Scale Interactive Electrical Load Analytics

2025-06-12
talk
David Gibbon (Plotly)

Octave is a Plotly Dash application used daily by about 1,000 Hydro-Québec technicians and engineers to analyze smart meter load and voltage data from 4.5M meters across the province. As adoption grew, Octave’s back end was migrated to Databricks to address increasingly massive scale (>1T data points), governance and security requirements. This talk will summarize how Databricks was optimized to support performant, at-scale interactive Dash application experiences while managing complex back-end ETL processes in parallel. The talk will outline optimizations aimed at further improving query latency and user concurrency, along with plans to increase data update frequency. It will also review non-technical success factors, including the value of subject matter expertise, operational autonomy, code quality for long-term maintainability and proactive vendor technical support.

Powering Secure and Scalable Data Governance at PepsiCo With Unity Catalog Open APIs

2025-06-12
talk
Dipankar Kushari (Databricks), Sudipta Das (PepsiCo)

PepsiCo, given its scale, has numerous teams leveraging different tools and engines to access data and perform analytics and AI. To streamline governance across this diverse ecosystem, PepsiCo unifies its data and AI assets under an open and enterprise-grade governance framework with Unity Catalog. In this session, we'll explore real-world examples of how PepsiCo extends Unity Catalog’s governance to all its data and AI assets, enabling secure collaboration even for teams outside Databricks. Learn how PepsiCo architects permissions using service principals and service accounts to authenticate with Unity Catalog, building a multi-engine architecture with seamless and open governance. Attendees will gain practical insights into designing a scalable, flexible data platform that unifies governance across all teams while embracing openness and interoperability.