talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

52

Filtering by: Data Governance

Sessions & talks

Showing 26–50 of 52 · Newest first

Sponsored by: Atlan | Domain-driven Data Governance in the AI Era: A Conversation with General Motors and Atlan

2025-06-11 Watch
lightning_talk

Now the largest automaker in the United States, selling more than 2.7 million vehicles in 2024, General Motors is setting a bold vision for its future, with software-defined vehicles and AI as a driving force. With data as a crucial asset, a transformation of this scale calls for a modern approach to Data Governance. Join Sherri Adame, Enterprise Data Governance Leader at General Motors, to learn about GM’s novel governance approach, supported by technologies like Atlan and Databricks. Hear how Sherri and her team are shifting governance to the left with automation, implementing data contracts, and accelerating data product discovery across domains, creating a cultural shift that emphasizes data as a competitive advantage.

Sponsored by: Hexaware | Global Data at Scale: Powering Front Office Transformation with Databricks

2025-06-11 Watch
lightning_talk
Bindu Birur (KPMG)

Join KPMG for an engaging session on how we transformed our data platform and built a cutting-edge Global Data Store (GDS), a game-changing data hub for our Front Office Transformation (FOT). Discover how we seamlessly unified data from various member firms, turning it into a dynamic engine that lets our business leverage our Front Office ecosystem for smarter analytics and decision-making. Learn about our unique approach that rapidly integrates diverse datasets into the GDS and our hub-and-spoke model, which connects member firms’ data lakes and enables secure, high-speed collaboration via Delta Sharing. Hear how we are leveraging Unity Catalog to help ensure data governance, compliance, and straightforward data lineage. We’ll share strategies for risk management, security (fine-grained access, encryption), and scaling a cloud-based data ecosystem.

Agent Bricks: Building Multi-Agent Systems for Structured and Unstructured Information

2025-06-11 Watch
talk
Elise Gonzales (Databricks)

Learn how to build sophisticated systems that enable natural language interactions with both your structured databases and unstructured document collections. This session explores advanced techniques for creating unified and governed AI systems that can seamlessly interpret questions, retrieve relevant information and generate accurate answers across your entire data ecosystem. Key takeaways include:
- Strategies for combining vector search over unstructured documents with retrieval from structured databases
- Techniques for optimizing unstructured data processing through effective parsing, metadata enrichment and intelligent chunking
- Methods for integrating different retrieval mechanisms while ensuring consistent data governance and security
- Practical approaches for evaluating and improving KBQA system quality through automated and human feedback
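As a rough illustration of the "metadata enrichment and intelligent chunking" takeaway, the sketch below splits a document into overlapping word windows and attaches provenance metadata to each chunk. It is not taken from the session: the `Chunk` class, the `chunk_document` function, the window sizes and the metadata keys are all hypothetical, and a real pipeline would likely chunk on document structure rather than raw word counts.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One retrievable unit of text plus the metadata used for filtering."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_document(text, source, max_words=50, overlap=10):
    """Split text into overlapping word windows and tag each with metadata.

    All names and defaults here are illustrative assumptions.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append(Chunk(
            text=" ".join(window),
            # Hypothetical metadata keys; governance tags could be added here.
            metadata={"source": source, "start_word": start, "n_words": len(window)},
        ))
        if start + max_words >= len(words):
            break  # the window already reached the end of the document
    return chunks
```

The overlap keeps sentences that straddle a boundary retrievable from either neighbor, and the `source` field lets a retriever filter or audit results per document.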

Franchise IP and Data Governance at Krafton: Driving Cost Efficiency and Scalability

2025-06-11 Watch
lightning_talk
hwaeium yeom (KRAFTON)

Join us as we explore how KRAFTON optimized data governance for PUBG IP, enhancing cost efficiency and scalability. KRAFTON operates a massive data ecosystem, processing tens of terabytes daily. As real-time analytics demands increased, traditional batch-based processing faced scalability challenges. To address this, we redesigned data pipelines and governance models, improving performance while reducing costs:
- Transitioned to real-time pipelines (batch to streaming)
- Optimized workload management (reducing all-purpose clusters, increasing Jobs usage)
- Cut costs by tens of thousands monthly (up to 75%)
- Enhanced data storage efficiency (lower S3 costs, Delta Tables)
- Improved pipeline stability (Medallion Architecture)
Gain insights into how KRAFTON scaled data operations, leveraging real-time analytics and cost optimization for high-traffic games. Learn more: https://www.databricks.com/customers/krafton

Hands-on Learning: AI-Powered Data Engineering with Lakeflow: Techniques for Modern Data Professionals

2025-06-11
talk
Frank Munz (Databricks)

This introductory workshop caters to data engineers seeking hands-on experience and data architects looking to deepen their knowledge. The workshop is structured to provide a solid understanding of the following data engineering and streaming concepts:
- Introduction to Lakeflow and the Data Intelligence Platform
- Getting started with Lakeflow Declarative Pipelines for declarative data pipelines in SQL using Streaming Tables and Materialized Views
- Mastering Databricks Workflows with advanced control flow and triggers
- Understanding serverless compute
- Data governance and lineage with Unity Catalog
- Generative AI for Data Engineers: Genie and Databricks Assistant
We believe you can only become an expert if you work on real problems and gain hands-on experience. Therefore, we will equip you with your own lab environment in this workshop and guide you through practical exercises like using GitHub, ingesting data from various sources, creating batch and streaming data pipelines, and more.

Advanced Data Access Control for the Exabyte Era: Scaling with Purpose

2025-06-11 Watch
talk
Arpan Ghosh (Databricks), Shuting Zhang (Databricks)

As data-driven companies scale from small startups to global enterprises, managing secure data access becomes increasingly complex. Traditional access control models fall short at enterprise scale, where dynamic, purpose-driven access is essential. In this talk, we explore how our “Just-in-Time” Purpose-Based Access Control (PBAC) platform addresses the evolving challenges of data privacy and compliance, maintaining least privilege while ensuring productivity. Using features like Unity Catalog, Delta Sharing & Databricks Apps, the platform delivers real-time, context-aware data governance. Leveraging JIT PBAC keeps your data secure, your engineers productive, your legal & security teams happy and your organization future-proof in the ever-evolving compliance landscape.
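The idea of just-in-time, purpose-based access can be sketched in a few lines: a grant is issued only when the stated purpose is approved for the table, and it expires on its own. This is a minimal illustration, not the platform described in the talk. The `Grant` and `PurposeBasedAccess` names, the policy shape and the one-hour default lifetime are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Grant:
    """A time-boxed permission tied to a declared purpose."""
    user: str
    table: str
    purpose: str
    expires_at: datetime


class PurposeBasedAccess:
    """Hypothetical in-memory policy engine, for illustration only."""

    def __init__(self, allowed_purposes):
        # Assumed policy shape: table name -> set of approved purposes.
        self.allowed_purposes = allowed_purposes
        self.grants = []

    def request(self, user, table, purpose, now, ttl=timedelta(hours=1)):
        """Issue a short-lived grant if the purpose is approved, else None."""
        if purpose not in self.allowed_purposes.get(table, set()):
            return None
        grant = Grant(user, table, purpose, now + ttl)
        self.grants.append(grant)
        return grant

    def can_read(self, user, table, now):
        """Least privilege: access exists only while an unexpired grant does."""
        return any(
            g.user == user and g.table == table and now < g.expires_at
            for g in self.grants
        )
```

Because every grant carries its purpose and expiry, the grant list doubles as an audit trail of who accessed what, why, and for how long.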

Transforming Data Governance for Multimodal Data at Amgen With Databricks

2025-06-11 Watch
talk
Jaison Dominic (Amgen), Jinesh Kunjumon (Amgen)

Amgen is advancing its Enterprise Data Fabric to securely manage sensitive multimodal data, such as imaging and research data, across formats. Databricks is already the de facto standard for governance on structured data, and Amgen seeks to extend it to unstructured multimodal data too. This approach will also allow Amgen to standardize its GenAI projects on Databricks. Key priorities include:
- Centralized data access: establishing a unified, secure access control system
- Enhanced traceability: implementing detailed processes for transparency and accountability
- Consistent access standards: ensuring a uniform data access privilege experience
- User support: providing flexible access for diverse stakeholders
- Comprehensive auditing: enabling thorough permission audits and data usage tracking
Learn strategies for implementing a comprehensive multimodal data governance framework using Databricks, as we share our experience of standardizing data governance for GenAI use cases.

Doordash Customer 360 Data Store and its Evolution to Become an Entity Management Framework

2025-06-10 Watch
lightning_talk
Gowri Shankar (DoorDash), Chao Wang (DoorDash)

The "Doordash Customer 360 Data Store", built on Delta Lake, represents a foundational step in centralizing and managing customer profiles to enable targeting and personalized customer experiences. This presentation will explore the initial goals and architecture of the Customer 360 Data Store, its journey to becoming a robust entity management framework, and the challenges and opportunities encountered along the way. We will discuss how the evolution addressed scalability, data governance and integration needs, enabling the system to support dynamic and diverse use cases, including customer lifecycle analytics and marketing campaign targeting using segmentation. Attendees will gain insight into key design principles, technical innovations and strategic decisions that transformed the system into a flexible platform for entity management, positioning it as a critical enabler of data-driven growth at Doordash. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.

Managing the Governed Cloud

2025-06-10 Watch
talk
Sherri Adame (GM), Johnathan Powell (General Motors)

As organizations increasingly adopt Databricks as a unified platform for analytics and AI, ensuring robust data governance becomes critical for compliance, security, and operational efficiency. This presentation will explore the end-to-end framework for governing the Databricks cloud, covering key use cases, foundational governance principles, and scalable automation strategies. We will discuss best practices for metadata, data access, catalog, classification, quality, and lineage, while leveraging automation to streamline enforcement. Attendees will gain insights into best practices and real-world approaches to building a governed data cloud that balances innovation with control.

Streaming Meets Governance: Building AI-Ready Tables With Confluent Tableflow and Unity Catalog

2025-06-10 Watch
talk
Kasun Indrasiri Gamage (Confluent), Victoria Bukta (Databricks)

Learn how Databricks and Confluent are simplifying the path from real-time data to governed, analytics- and AI-ready tables. This session will cover how Confluent Tableflow automatically materializes Kafka topics into Delta tables and registers them with Unity Catalog — eliminating the need for custom streaming pipelines. We’ll walk through how this integration helps data engineers reduce ingestion complexity, enforce data governance and make real-time data immediately usable for analytics and AI.

IQVIA's Analytics for Patient Support Services: Transforming Scalability, Performance and Governance

2025-06-10 Watch
talk
Dmytro Kobryn (Customertimes), Sudha Ragothaman (IQVIA)

This presentation will explore the transformation of IQVIA's decade-old patient support platform through the implementation of Databricks Data Intelligence Platform. Facing scalability challenges, performance bottlenecks and rising costs, the existing platform required significant redesign to handle growing data volumes and complex analytics. Key issues included static metrics limiting workflow optimization, fragmented data governance and heightened compliance and security demands. By partnering with Customertimes (a Databricks Partner) and adopting Databricks' centralized, scalable analytics solution with enhanced self-service capabilities, IQVIA achieved improved query performance, cost efficiency and robust governance, ensuring operational effectiveness and regulatory compliance in an increasingly complex environment.

Unifying Human-Curated Data Ingestion and Real-Time Updates with Databricks Lakeflow Declarative Pipelines, Protobuf and BSR

2025-06-10
talk
Dwight Whitlock (Clinician Nexus)

Red Stapler is a streaming-native system on Databricks that merges file-based ingestion and real-time user edits into a single Lakeflow Declarative Pipelines pipeline for near real-time feedback. Protobuf definitions, managed in the Buf Schema Registry (BSR), govern schema and data-quality rules, ensuring backward compatibility. All records, valid or not, are stored in an SCD Type 2 table, capturing every version for full history and immediate quarantine views of invalid data. This unified approach boosts data governance, simplifies auditing and streamlines error fixes. Running on Lakeflow Declarative Pipelines Serverless and the Kafka-compatible Bufstream keeps costs low by scaling down to zero when idle. Red Stapler’s configuration-driven Protobuf logic adapts easily to evolving survey definitions without risking production. The result is consistent validation, quick updates and a complete audit trail, all critical for trustworthy, flexible data pipelines.
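The pattern of keeping every record version, valid or not, in an SCD Type 2 table and deriving a quarantine view from it can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Red Stapler's implementation: `VersionedRecord`, `upsert_scd2`, `quarantine_view` and the integer timestamps are hypothetical, and the real system applies Protobuf-defined rules inside a streaming pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionedRecord:
    """One row of a simplified SCD Type 2 history table."""
    key: str
    payload: dict
    is_valid: bool
    valid_from: int
    valid_to: Optional[int] = None  # None marks the current version


def upsert_scd2(history, key, payload, ts, validate):
    """Close the current version of `key` (if changed) and append a new one.

    `validate` is any caller-supplied rule returning True or False; invalid
    records are stored too, just flagged, so nothing is silently dropped.
    """
    for row in history:
        if row.key == key and row.valid_to is None:
            if row.payload == payload:
                return history  # nothing changed, keep the current version
            row.valid_to = ts  # close the superseded version
    history.append(VersionedRecord(key, payload, validate(payload), ts))
    return history


def quarantine_view(history):
    """Current versions that failed validation and still need a fix."""
    return [r for r in history if r.valid_to is None and not r.is_valid]
```

Once a corrected edit for the same key arrives, the invalid version is closed rather than deleted, so it leaves the quarantine view but stays in the history for auditing.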

Dealing With Sensitive Data on Databricks at Natura

2025-06-10 Watch
lightning_talk
Daniel Shimura (Natura)

Ensuring the protection of sensitive data within a Databricks environment requires robust mechanisms to prevent unauthorized access, even by high-privileged roles such as Databricks Administrators: Account Console Admins, Workspace Admins, and Unity Catalog Admins. To address this, a comprehensive data governance and access control strategy can be implemented, leveraging encryption, secret scopes, column masks, fine-grained access controls on tables and auditing capabilities.
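To make the column-mask idea concrete, the sketch below shows the general logic in plain Python: the value is revealed only to members of a dedicated group, so holding an admin role alone does not unmask it. This is an illustration of the concept only, not Natura's setup and not Unity Catalog syntax. The function names and the `pii_readers` group are hypothetical.

```python
def mask_email(value, groups, allowed_group="pii_readers"):
    """Return the raw value only for members of the allowed group.

    Membership in admin groups is deliberately not checked: being an
    administrator does not, by itself, reveal the sensitive value.
    """
    if allowed_group in groups:
        return value
    _local, _, domain = value.partition("@")
    return "***@" + domain if domain else "***"


def apply_column_mask(rows, column, groups, mask=mask_email):
    """Return copies of the rows with `column` masked for the caller's groups."""
    return [{**row, column: mask(row[column], groups)} for row in rows]
```

Keying the decision on a purpose-specific group rather than on a role keeps the masking rule independent of who administers the platform.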

Sponsored by: Informatica | Extending Unity Catalog to Govern the Data Estate With Informatica Cloud Data Governance & Catalog

2025-06-10 Watch
lightning_talk
Ajay GOLLAPALLI (Informatica)

Join this 20-minute session to learn how Informatica CDGC integrates with and leverages Unity Catalog metadata to provide end-to-end governance and security across an enterprise data landscape. Topics covered will include:
- Comprehensive data lineage that provides complete data transformation visibility across multicloud and hybrid environments
- Broad data source support to facilitate holistic cataloging and a centralized governance framework
- Centralized access policy management and data stewardship to enable compliance with regulatory standards
- Rich data quality to ensure data is cleansed, validated and trusted for analytics and AI

Trust You Can Measure: Data Quality Standards in The Lakehouse

2025-06-10 Watch
talk
Amit Pahwa (Databricks), Sergiy Kanyshchev (Databricks)

Do you trust your data? If you’ve ever struggled to figure out which datasets are reliable, well-governed, or safe to use, you’re not alone. At Databricks, our own internal lakehouse faced the same challenge—hundreds of thousands of tables, but no easy way to tell which data met quality standards. In this talk, the Databricks Data Platform team shares how we tackled this problem by building the Data Governance Score—a way to systematically measure and surface trust signals across the entire lakehouse. You’ll learn how we leverage Unity Catalog, governed tags, and enforcement to drive better data decisions at scale. Whether you're a data engineer, platform owner, or business leader, you’ll leave with practical ideas on how to raise the bar for data quality and trust in your own data ecosystem.

Scaling Data Governance: How Unity Catalog is Empowering Picpay's Data Governance Strategy

2025-06-10 Watch
talk
Lucas Morelato (PicPay), Gustavo Tadao Okida (PicPay)

With massive data volume and complexity, scaling data governance became a significant challenge. Centralizing metadata management, ensuring regulatory compliance and controlling data access across multiple platforms turned out to be critical to maintaining efficiency and trust.

Sponsored by: Deloitte | Accelerating Biopharmaceutical Breakthroughs with an Innovative Enterprise Data Strategy

2025-06-10 Watch
lightning_talk
Shri Chary (Deloitte)

In the rapidly evolving life sciences and healthcare industry, leveraging data-as-a-product is crucial for driving innovation and achieving business objectives. Join us to explore how Deloitte is revolutionizing data strategy solutions by overcoming challenges such as data silos, poor data quality, and lack of real-time insights with the Databricks Data Intelligence Platform. Learn how effective data governance, seamless data integration, and scalable architectures support personalized medicine, regulatory compliance, and operational efficiency. This session will highlight how these strategies enable biopharma companies to transform data into actionable insights, accelerate breakthroughs and enhance life sciences outcomes.

How Nubank improves Governance, Security and User Experience with Unity Catalog

2025-06-10 Watch
talk

At Nubank, we successfully migrated to Unity Catalog, addressing the needs of our large-scale data environment with 3k active users, over 4k notebooks and jobs and 1.1 million tables, including sensitive PII data. Our primary objectives were to enhance data governance, security and user experience. Key points:
- Comprehensive data access monitoring and control implementation
- Enhanced security measures for handling PII and sensitive data
- Efficient migration of 4,000+ notebooks and jobs to the new system
- Improved cataloging and governance for 1.1 million tables
- Implementation of robust access controls and permissions model
- Optimized user experience and productivity through centralized data management
This migration significantly improved our data governance capabilities, enhanced security measures and provided a more user-friendly experience for our large user base, ultimately leading to better control and utilization of our vast data resources.

Reimagining Data Governance and Access at Atlassian

2025-06-10 Watch
lightning_talk
Gerald Nakhle (Atlassian)

Atlassian is rebuilding its central lakehouse from the ground up to deliver a more secure, flexible and scalable data environment. In this session, we’ll share how we leverage Unity Catalog for fine-grained governance and supplement it with Immuta for dynamic policy management, enabling row and column level security at scale. By shifting away from broad, monolithic access controls toward a modern, agile solution, we’re empowering teams to securely collaborate on sensitive data without sacrificing performance or usability. Join us for an inside look at our end-to-end policy architecture, from how data owners declare metadata and author policies to the seamless application of access rules across the platform. We’ll also discuss lessons learned on streamlining data governance, ensuring compliance, and improving user adoption. Whether you’re a data architect, engineer or leader, walk away with actionable strategies to simplify and strengthen your own governance and access practices.

Unleash the Power of Automated Data Governance: Classify, Tag and Protect Your Data — Effortlessly

2025-06-10 Watch
talk
Zeashan Pappa (Databricks), Kristen Wilder (Databricks)

Struggling to keep up with data governance at scale? Join us to explore how automated data classification, tag policies and ABAC streamline access control while enhancing security and compliance. Get an exclusive look at the new Governance Hub, built to give your teams deeper visibility into data usage, access patterns and metadata — all in one place. Whether you're managing thousands or millions of assets, discover how to classify, tag and protect your data estate effortlessly with the latest advancements in Unity Catalog.

Comprehensive Data Management and Governance With Azure Data Lake Storage

2025-06-10 Watch
talk
James Baker (Microsoft), Santhosh Pillai (Microsoft Corporation)

Given that data is the new oil, it must be treated as such. Organizations that pursue greater insight into their businesses and their customers must manage, govern, protect and observe the use of the data that drives these insights in an efficient, cost-effective, compliant and auditable manner without degrading access to that data. Azure Data Lake Storage offers many features that allow customers to apply such controls and protections to their critical data assets. Understanding how these features behave, their granularity, cost and scale implications, and the degree of control or protection they apply is essential to implementing a data lake that reflects the value contained within. This session will discuss the various data protection, governance and management capabilities available now and upcoming in ADLS. This will include how deep integration with Azure Databricks can provide more comprehensive, end-to-end coverage of these concerns, yielding a highly efficient and effective data governance solution.

Revolutionizing Data Insights and the Buyer Experience at GM Financial with Cloud Data Modernization

2025-06-10 Watch
talk
Latha Subramanian (GM Financial), Rick Whitford (Deloitte Consulting, LLP)

Deloitte and GM (General Motors) Financial have collaborated to design and implement a cutting-edge cloud analytics platform, leveraging Databricks. In this session, we will explore how we overcame challenges including dispersed and limited data capabilities, high-cost hardware and outdated software, with a strategic and comprehensive approach. With the help of Deloitte and Databricks, we were able to develop a unified Customer360 view, integrate advanced AI-driven analytics, and establish robust data governance and cyber security measures. Attendees will gain valuable insights into the benefits realized, such as cost savings, enhanced customer experiences, and broad employee upskilling opportunities. Unlock the impact of cloud data modernization and advanced analytics in the automotive finance industry and beyond with Deloitte and Databricks.

Laying Data and AI Foundations for the Agentic Future at P&G

2025-06-10
talk
Alfredo Colas (Procter & Gamble)

In today's rapidly evolving digital landscape, organizations must prioritize robust data architectures and AI strategies to remain competitive. In this session, we will explore how Procter & Gamble (P&G) has embarked on a transformative journey to digitize its operations via scalable data, analytics and AI platforms, establishing a strong foundation for data-driven decision-making and the emergence of agentic AI. Join us as we delve into the comprehensive architecture and platform initiatives undertaken at P&G to create scalable and agile data platforms that unleash BI/AI value. We will discuss our approach to implementing data governance and semantics, ensuring data integrity and accessibility across the organization. By leveraging advanced analytics and Business Intelligence (BI) tools, we will illustrate how P&G harnesses data to generate actionable insights at scale, all while maintaining security and speed.

Leveraging Databricks Unity Catalog for Enhanced Data Governance in Unipol

2025-06-10 Watch
talk
Beniamino Del Pizzo (Unipol S.p.A.), Giovanni Cinquepalmi (Data Reply)

In the contemporary landscape of data management, organizations increasingly face the challenges of data segregation, governance and permission management, particularly when operating within complex structures such as holding companies with multiple subsidiaries. Unipol comprises seven subsidiary companies, each with a diverse array of workgroups, which adds up to multiple operational groups overall. This intricate organizational structure necessitates a meticulous approach to data management, particularly regarding the segregation of data and the assignment of precise read-and-write permissions tailored to each workgroup. The challenge lies in ensuring that sensitive data remains protected while enabling seamless access for authorized users. This talk demonstrates how Unity Catalog emerges as a pivotal tool in the daily use of the data platform, offering a unified governance solution that supports data management across diverse AWS environments.

Databricks Without Disruption: A Deep Dive on Catalog Federation with Hive Metastore, Glue, and Snowflake

2025-06-10
talk
John Spencer (Databricks), Milos Stojanovic (Databricks)

You shouldn’t have to sacrifice data governance just to leverage the tools your business needs. In this session, we will give practical tips on how you can cut through the data sprawl and get a unified view of your data estate in Unity Catalog without disrupting existing workloads. We will walk through how to set up federation with Glue, Hive Metastore, and other catalogs like Snowflake, and show you how powerful new tools help you adopt Databricks at your own pace with no downtime and full interoperability.