talk-data.com

Event

Data + AI Summit 2025

2025-06-09 – 2025-06-13 Databricks Summit

Activities tracked

178

Filtering by: Analytics

Sessions & talks

Showing 76–100 of 178 · Newest first

Intelligent Document Processing: Building AI, BI, and Analytics Systems on Unstructured Data

2025-06-11 Watch
talk
Adam Gurary (Databricks) , Jason Ping (Product) (Databricks)

Most enterprise data is trapped in unstructured formats (documents, PDFs, scanned images and tables), making it difficult to access, analyze and use. This session shows how to unlock that hidden value by building intelligent document processing workflows on the Databricks Data Intelligence Platform. You’ll learn how to ingest unstructured content using Lakeflow Connect, extract structured data with AI Parse, even from complex tables and scanned documents, and apply analytics or AI to this newly structured data.
What you’ll learn:
• How to build scalable pipelines that transform unstructured documents into structured tables
• Techniques for automating document workflows with Databricks tools
• Strategies for maintaining quality and governance with Unity Catalog
• Real-world examples of AI applications built with intelligent document processing

Serverless as the New "Easy Button": How HP Inc. Used Serverless to Turbocharge Their Data Pipeline

2025-06-11 Watch
talk
Matthew Wright (Zahlen Solutions LLC) , Jason Hart (Zahlen Solutions)

How do you wrangle over 8TB of granular “hit-level” website analytics data with hundreds of columns, all while eliminating the overhead of cluster management, decreasing runtime and saving money? In this session, we’ll dive into how we helped HP Inc. use Databricks serverless compute and Lakeflow Declarative Pipelines to streamline Adobe Analytics data ingestion while making it faster, cheaper and easier to operate. We’ll walk you through our full migration story — from managing unwieldy custom-defined AWS-based Apache Spark™ clusters to spinning up Databricks serverless pipelines and workflows with on-demand scalability and near-zero overhead. If you want to simplify infrastructure, optimize performance and get more out of your Databricks workloads, this session is for you.

Sponsored by: Informatica | Modernize analytics and empower AI in Databricks with trusted data using Informatica

2025-06-11 Watch
talk
Rik Tamm-Daniels (Informatica) , Ajay GOLLAPALLI (Informatica)

As enterprises continue their journey to the cloud, data warehouse and data management modernization is essential to optimize analytics and drive business outcomes. Minimizing modernization timelines is important for reducing risk and shortening time to value, and ensuring enterprise data is clean, curated and governed is imperative to enable analytics and AI initiatives. In this session, learn how Informatica's Intelligent Data Management Cloud (IDMC) empowers analytics and AI on Databricks by helping data teams:
· Develop no-code/low-code data pipelines that ingest, transform and clean data at enterprise scale
· Improve data quality and extend enterprise governance with Informatica Cloud Data Governance and Catalog (CDGC) and Unity Catalog
· Accelerate pilot-to-production with Mosaic AI

Tech Industry Session: Optimizing Costs and Controls to Democratize Data and AI

2025-06-11 Watch
talk
Miranda Luna (Databricks) , Anup Segu (YipitData) , Vivek Srivastava (OT Technology, LLC)

Join us for this session focused on how leading tech companies are enabling data intelligence across their organizations while maintaining cost efficiency and governance. Hear the successes and challenges that arise when Databricks empowers thousands of users, from engineers to business teams, by providing scalable tools for AI, BI and analytics.
Topics include:
• Combining AI/BI and Lakehouse Apps to streamline workflows and accelerate insights
• Implementing system tables, tagging and governance frameworks for granular control
• Democratizing data access while optimizing costs for large-scale analytical workloads
Hear from customers and Databricks experts, followed by a customer panel featuring industry leaders. Gain insights into how Databricks helps tech innovators scale their platforms while maintaining operational excellence.

The Upcoming Apache Spark 4.1: The Next Chapter in Unified Analytics

2025-06-11 Watch
talk
DB Tsai (Databricks) , Xiao Li (Databricks)

Apache Spark has long been recognized as the leading open-source unified analytics engine, combining a simple yet powerful API with a rich ecosystem and top-notch performance. In the upcoming Spark 4.1 release, the community reimagines Spark to excel at both massive cluster deployments and local laptop development. We’ll start with new single-node optimizations that make PySpark even more efficient for smaller datasets. Next, we’ll delve into a major “Pythonizing” overhaul — simpler installation, clearer error messages and Pythonic APIs. On the ETL side, we’ll explore greater data source flexibility (including the simplified Python Data Source API) and a thriving UDF ecosystem. We’ll also highlight enhanced support for real-time use cases, built-in data quality checks and the expanding Spark Connect ecosystem — bridging local workflows with fully distributed execution. Don’t miss this chance to see Spark’s next chapter!

Accelerating Data Transformation: Best Practices for Governance, Agility and Innovation

2025-06-11 Watch
lightning_talk
Kevin Wilson (NCS Australia)

In this session, we will share NCS’s approach to implementing a Databricks Lakehouse architecture, focusing on key lessons learned and best practices from our recent implementations. By integrating Databricks SQL Warehouse, the DBT Transform framework and our innovative test automation framework, we’ve optimized performance and scalability, while ensuring data quality. We’ll dive into how Unity Catalog enabled robust data governance, empowering business units with self-serve analytical workspaces to create insights while maintaining control. Through the use of solution accelerators, rapid environment deployment and pattern-driven ELT frameworks, we’ve fast-tracked time-to-value and fostered a culture of innovation. Attendees will gain valuable insights into accelerating data transformation, governance and scaling analytics with Databricks.

Bridging Big Data and AI: Empowering PySpark With Lance Format for Multi-Modal AI Data Pipelines

2025-06-11 Watch
lightning_talk
Allison Wang (Databricks) , LU QIU (LanceDB)

PySpark has long been a cornerstone of big data processing, excelling in data preparation, analytics and machine learning tasks within traditional data lakes. However, the rise of multimodal AI and vector search introduces challenges beyond its capabilities. Spark’s new Python data source API enables integration with emerging AI data lakes built on the multimodal Lance format. Lance offers zero-copy schema evolution and robust support for data with large record sizes (e.g., images, tensors and embeddings), simplifying multimodal data storage. Its advanced indexing for semantic and full-text search, combined with rapid random access, brings high-performance AI data analytics to the level of SQL. By unifying PySpark's robust processing capabilities with Lance's AI-optimized storage, data engineers and scientists can efficiently manage and analyze the diverse data types required for cutting-edge AI applications within a familiar big data framework.

Driving Trusted Insights With AI/BI and Unity Catalog Metric Views

2025-06-11 Watch
lightning_talk
Fuat Can Efeoglu (Databricks)

Deliver trusted, high-performance insights by incorporating Unity Catalog metric views and business semantics into your AI/BI workflows. This session dives into the architecture and best practices for defining reusable metrics, implementing governance and enhancing query performance in AI/BI Dashboards and Genie. Learn how to manage business semantics effectively to ensure data consistency while empowering business users with governed, self-service analytics. Ideal for teams looking to streamline analytics at scale, this session provides practical strategies for driving data accuracy and governance.

Sponsored by: Accenture & Avanade | Reinventing State Services with Databricks: AI-Driven Innovations in Health and Transportation

2025-06-11 Watch
lightning_talk
Ajali Sen (Accenture)

One of the largest and most trailblazing U.S. states is setting a new standard for how governments can harness data and AI to drive large-scale impact. In this session, we will explore how we are using the Databricks Data Intelligence Platform to address two of the state's most pressing challenges: public health and transportation. From vaccine tracking powered by intelligent record linkage and a service-oriented analytics architecture, to Gen AI-driven insights that reduce traffic fatalities and optimize infrastructure investments, this session reveals how scalable, secure, and real-time data solutions are transforming state operations. Join us to learn how data-driven governance is delivering better outcomes for millions and paving the way for an AI-enabled, data-driven and more responsive government.

Sponsored by: Hexaware | Global Data at Scale: Powering Front Office Transformation with Databricks

2025-06-11 Watch
lightning_talk
Bindu Birur (KPMG)

Join KPMG for an engaging session on how we transformed our data platform and built a cutting-edge Global Data Store (GDS), a game-changing data hub for our Front Office Transformation (FOT). Discover how we seamlessly unified data from various member firms, turning it into a dynamic engine that lets our business leverage our Front Office ecosystem for smarter analytics and decision-making. Learn about our unique approach that rapidly integrates diverse datasets into the GDS and our hub-and-spoke model, which connects member firms’ data lakes and enables secure, high-speed collaboration via Delta Sharing. Hear how we are leveraging Unity Catalog to help ensure data governance, compliance, and straightforward data lineage. We’ll share strategies for risk management, security (fine-grained access, encryption), and scaling a cloud-based data ecosystem.

Sponsored by: Tiger Analytics | Data-Driven Transformation to Hypercharge Predictive and Diagnostic Supply Chain Intelligence

2025-06-11 Watch
lightning_talk
Vishal Puri (Tiger Analytics)

Manufacturers today need efficient, accurate, and flexible integrated planning across supply, demand, and finance. A leading industrial manufacturer is pursuing a competitive edge in Integrated Business Planning through data and AI. Their strategy: a connected, real-time data foundation with democratized access across silos. Using Databricks, we’re building business-centric data products to enable near real-time, collaborative decisions and scaled AI. Unity Catalog ensures data reliability and adoption. Increased data visibility is driving better on-time delivery, inventory optimization, and forecasting, resulting in measurable financial impact. In this session, we’ll share our journey to the north star of “driving from the windshield, not the rearview,” including key data, organization, and process challenges in enabling data democratization; architectural choices for Integrated Business Planning as a data product; and core capabilities delivered with Tiger’s Accelerator.

Achieving AI Success with a Solid Data Foundation

2025-06-11 Watch
talk
Santosh Kudva (GE Vernova) , Kevin Tollison (EY)

Join us for an insightful presentation on creating a robust data architecture to drive business outcomes in the age of Generative AI. Santosh Kudva, GE Vernova Chief Data Officer, and Kevin Tollison, EY AI Consulting Partner, will share their expertise on transforming data strategies to unleash the full potential of AI. Learn how GE Vernova, a dynamic enterprise born from the 2024 spin-off of GE, revamped its diverse landscape. They will provide a look into how they integrated the pre-spin-off Finance Data Platform into the GE Vernova Enterprise Data & Analytics ecosystem, utilizing Databricks to enable high-performance AI-led analytics.
Key insights include:
• Incorporating Generative AI into your overarching strategy
• Leveraging comprehensive analytics to enhance data quality
• Building a resilient data framework adaptable to continuous evolution
Don't miss this opportunity to hear from industry leaders and gain valuable insights to elevate your data strategy and AI success.

Building a Self-Service Data Platform With a Small Data Team

2025-06-11 Watch
talk
Gleb Lesnikov (Dodo Brands) , Evgenii Dobrynin (Dodo Brands)

Discover how Dodo Brands, a global pizza and coffee business with over 1,200 retail locations and 40k employees, revolutionized their analytics infrastructure by creating a self-service data platform. This session explores the approach to empowering analysts, data scientists and ML engineers to independently build analytical pipelines with minimal involvement from data engineers. By leveraging Databricks as the backbone of their platform, the team developed automated tools like a "job-generator" that uses Jinja templates to streamline the creation of data jobs. This approach minimized manual coding and enabled non-data engineers to create over 1,420 data jobs, 90% of which were auto-generated from user configurations. The platform supports thousands of weekly active users via tools like Apache Superset. This session provides actionable insights for organizations seeking to scale their analytics capabilities efficiently without expanding their data engineering teams.
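As a rough illustration of the config-driven job generation described above, the sketch below renders job definitions from user configurations. It is not Dodo Brands' implementation: it uses Python's standard-library string.Template as a stand-in for Jinja so that it runs without extra dependencies, and the template fields (name, schedule, source_table, target_table), the sample configs and the function names are all hypothetical.

```python
from string import Template

# Hypothetical job template. A real generator would likely use Jinja;
# string.Template is swapped in here to keep the sketch stdlib-only.
JOB_TEMPLATE = Template(
    "job_name: $name\n"
    "schedule: $schedule\n"
    "query: SELECT * FROM $source_table\n"
    "target: $target_table\n"
)


def render_job(config: dict) -> str:
    """Render one job definition from a user-supplied config.

    Raises KeyError if the config is missing a field the template needs.
    """
    return JOB_TEMPLATE.substitute(config)


def render_jobs(configs: list) -> dict:
    """Render every config and index the results by job name."""
    return {config["name"]: render_job(config) for config in configs}


# Hypothetical user configurations, e.g. loaded from files analysts maintain.
CONFIGS = [
    {
        "name": "daily_sales",
        "schedule": "0 3 * * *",
        "source_table": "raw.sales",
        "target_table": "analytics.daily_sales",
    },
    {
        "name": "weekly_orders",
        "schedule": "0 4 * * 1",
        "source_table": "raw.orders",
        "target_table": "analytics.weekly_orders",
    },
]

jobs = render_jobs(CONFIGS)

if __name__ == "__main__":
    print(jobs["daily_sales"])
```

The point of the pattern is that analysts edit only the configuration entries, while the template, owned by the platform team, fixes the shape of every generated job.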

Franchise IP and Data Governance at Krafton: Driving Cost Efficiency and Scalability

2025-06-11 Watch
lightning_talk
hwaeium yeom (KRAFTON)

Join us as we explore how KRAFTON optimized data governance for PUBG IP, enhancing cost efficiency and scalability. KRAFTON operates a massive data ecosystem, processing tens of terabytes daily. As real-time analytics demands increased, traditional batch-based processing faced scalability challenges. To address this, we redesigned data pipelines and governance models, improving performance while reducing costs.
• Transitioned to real-time pipelines (batch to streaming)
• Optimized workload management (reducing all-purpose clusters, increasing Jobs usage)
• Cut costs by tens of thousands monthly (up to 75%)
• Enhanced data storage efficiency (lower S3 costs, Delta Tables)
• Improved pipeline stability (Medallion Architecture)
Gain insights into how KRAFTON scaled data operations, leveraging real-time analytics and cost optimization for high-traffic games. Learn more: https://www.databricks.com/customers/krafton

From Prediction to Prevention: Transforming Risk Management in Insurance

2025-06-11 Watch
talk
Sebastien Gignac (Intact Financial Corp) , Dylani Herath (Thrivent Financial) , Marcela Granados (Databricks) , Michael Ban (Nationwide)

Protecting insurers against emerging threats is critical. This session reveals how leading companies use Databricks’ Data Intelligence Platform to transform risk management, enhance fraud detection, and ensure compliance. Learn how advanced analytics, AI, and machine learning process vast data in real time to identify risks and mitigate threats. Industry leaders will share strategies for building resilient operations that protect against financial losses and reputational harm.
Key takeaways:
• AI-powered fraud prevention using anomaly detection and predictive analytics
• Real-time risk assessment models integrating IoT, behavioral, and external data
• Strategies for robust compliance and governance with operational efficiency
Discover how data intelligence is revolutionizing insurance risk management and safeguarding the industry’s future.

Lakebase: Fully Managed Postgres for the Lakehouse

2025-06-11 Watch
talk
Abbey Russell (Databricks) , Dave Nettleton (Databricks)

Lakebase is a new Postgres-compatible OLTP database designed to support intelligent applications. Lakebase eliminates custom ETL pipelines with built-in lakehouse table synchronization, supports sub-10ms latency for high-throughput workloads, and offers full Postgres compatibility, so you can build applications more quickly. In this session, you’ll learn how Lakebase enables faster development, production-level concurrency, and simpler operations for data engineers and application developers building modern, data-driven applications. We'll walk through key capabilities, example use cases, and how Lakebase simplifies infrastructure while unlocking new possibilities for AI and analytics.

Lakeflow Connect: Seamless Data Ingestion From Enterprise Apps

2025-06-11
talk
Manish Dalwadi (Databricks) , Andreas Maier (Porsche Informatik GmbH)

Lakeflow Connect enables you to easily and efficiently ingest data from enterprise applications like Salesforce, ServiceNow, Google Analytics, SharePoint, NetSuite, Dynamics 365 and more. In this session, we’ll dive deep on using connectors for the most popular SaaS applications, reviewing common use cases such as analyzing consumer behavior, predicting churn and centralizing HR analytics. You'll also hear from an early customer about how Lakeflow Connect helped unify their customer data to drive an improved automotive experience. We’ll wrap up with a Q&A so you have the opportunity to learn from our experts.

Modernizing Critical Infrastructure: AI and Data-Driven Solutions in Nuclear and Utility Operations

2025-06-11 Watch
talk
Lou Martinez Sancho (Westinghouse Electric Company) , Shane Powell (Alabama Power) , Nick Whatley (Southern Company) , Amar Sethi (Databricks)

This session showcases how both Westinghouse Electric and Alabama Power Company are leveraging cloud-based tools, advanced analytics, and machine learning to transform operational resilience across the energy sector. In the first segment, we'll explore AI's crucial role in enhancing safety, efficiency, and compliance in nuclear operations through technologies like HiVE and Bertha, focusing on the unique reliability and credibility requirements of the regulated nuclear industry. We’ll then highlight how Alabama Power Company has modernized its grid management and storm preparedness by using Databricks to develop SPEAR and RAMP—applications that combine real-time data and predictive analytics to improve reliability, efficiency, and customer service.

Retail Genie: No-Code AI Apps for Empowering BI Users to be Self-Sufficient

2025-06-11 Watch
talk
Harish Rajagopalan (Databricks) , Siddhesh Pore (Databricks)

Explore how Databricks AI/BI Genie revolutionizes retail analytics, empowering business users to become self-reliant data explorers. This session highlights no-code AI apps that create a conversational interface for retail data analysis. Genie spaces harness NLP and generative AI to convert business questions into actionable insights, bypassing complex SQL queries. We'll showcase retail teams effortlessly analyzing sales trends, inventory and customer behavior through Genie's intuitive interface. Witness real-world examples of AI/BI Genie's adaptive learning, enhancing accuracy and relevance over time. Learn how this technology democratizes data access while maintaining governance via Unity Catalog integration. Discover Retail Genie's impact on decision-making, accelerating insights and cultivating a data-driven retail culture. Join us to see the future of accessible, intelligent retail analytics in action.

Revolutionizing Banking Data, Analytics and AI: Building an Enterprise Data Hub With Databricks

2025-06-11 Watch
talk
Shailender Sidhu (Deloitte) , Mohan Sankararaman (First Horizon Bank) , Jamie Cosgrove (Databricks)

Explore the transformative journey of a regional bank as it modernizes its enterprise data infrastructure amidst the challenges of legacy systems and past mergers and acquisitions. The bank is creating an Enterprise Data Hub using Deloitte's industry experience and the Databricks Data Intelligence Platform to drive growth, efficiency and Large Financial Institution readiness needs. This session will showcase how the new data hub will be a one-stop-shop for LOB and enterprise needs, while unlocking the advanced analytics and GenAI possibilities. Discover how this initiative is going to empower the ambitions of a regional bank to realize their “big bank muscle, small bank hustle.”

Self-Service Assortment and Space Analytics at Walmart Scale

2025-06-11 Watch
talk
Alexandro Arreola-Garcia (Walmart) , Nikit Shah (Databricks)

Assortment and space analytics optimizes product selection and shelf allocation to boost sales, improve inventory management and enhance customer experience. However, challenges like evolving demand, data accuracy and operational alignment hinder success. Older approaches struggled due to siloed tools, slow performance and poor governance. The Databricks unified platform resolved these issues, enabling seamless data integration, high-performance analytics and governed sharing. The innovative AI/BI Genie interface empowered self-service analytics, driving adoption among non-technical users. This solution helped Walmart cut time to value by 90% and saved $5.6M annually in FTE hours, leading to increased productivity. Looking ahead, AI agents will let store managers and merchants execute decisions via conversational interfaces, streamlining operations and enhancing accessibility. This transformation positions retailers to thrive in a competitive, customer-centric market.

Sponsored by: AWS | Ripple: Well-Architected Data & AI Platforms - AWS and Databricks in Harmony

2025-06-11 Watch
talk
Priyanka Adhia (Ripple) , Hari Rajendran (Ripple) , Rudy Chetty (AWS)

Join us as we explore the well-architected framework for modern data lakehouse architecture, where AWS's comprehensive data, AI, and infrastructure capabilities align with Databricks' unified platform approach. Building upon core principles of Operational Excellence, Security, Reliability, Performance, and Cost Optimization, we'll demonstrate how Data and AI Governance alongside Interoperability and Usability enable organizations to build robust, scalable platforms. Learn how Ripple modernized its data infrastructure by migrating from a legacy Hadoop system to a scalable, real-time analytics platform using Databricks on AWS. This session covers the challenges of high operational costs, latency, and peak-time bottlenecks—and how Ripple achieved 80% cost savings and 55% performance improvements with Photon, Graviton, Delta Lake, and Structured Streaming.

Sponsored by: Capgemini | Unlocking Business Value With SAP Business Data Cloud and Databricks: Real-World Use Cases

2025-06-11
talk
Thorsten Leiduck (SAP) , Frank Gundlich (Capgemini)

Discover how SAP Business Data Cloud and Databricks can transform your business by unifying SAP and non-SAP data for advanced analytics and AI. In this session, we’ll highlight the Optimizing Cash Flow with AI use case, which integrates diverse data sources with AI algorithms to enable accurate cash flow forecasting, helping businesses identify trends, prevent bottlenecks, and improve liquidity. You’ll also learn about the importance of high-quality, well-governed data as the foundation for reliable AI models and actionable reporting.
Key Takeaways:
• How to integrate and leverage SAP and external data in Databricks
• Using AI for predictive analytics and better decision-making
• Building a trusted data foundation to drive business performance
Leave this session with actionable strategies to optimize your data, enhance efficiency, and unlock new growth opportunities.

Sponsored by: Firebolt | 10ms Queries on Iceberg: Turbocharging Your Lakehouse for Interactive Experiences with Firebolt

2025-06-11 Watch
talk
Benjamin Wagner (Firebolt)

Open table formats such as Apache Iceberg or Delta Lake have transformed the data landscape. For the first time, we’re seeing a real open storage ecosystem emerging across database vendors. So far, open table formats have found little adoption powering low-latency, high-concurrency analytics use-cases. Data stored in open formats often gets transformed and ingested into closed systems for serving. The reason for this is simple: most modern query engines don’t properly support these workloads. In this talk we take a look under the hood of Firebolt and dive into the work we’re doing to support low-latency and high concurrency on Iceberg: caching of data and metadata, adaptive object storage reads, subresult reuse, and multi-dimensional scaling. After this session, you will know how you can build low-latency data applications on top of Iceberg. You’ll also have a deep understanding of what it takes for modern high-performance query engines to do well on these workloads.
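To make one of the techniques named above more concrete, here is a minimal sketch of subresult reuse: an intermediate result is cached under a fingerprint of the subquery that produced it, so a repeated subquery skips recomputation. This is an illustration under simplifying assumptions, not Firebolt's implementation. The SubresultCache class, the text-based fingerprint and the sample data are hypothetical, and a real engine would fingerprint query plans rather than SQL text and would invalidate entries when the underlying table changes.

```python
import hashlib


class SubresultCache:
    """Caches intermediate results keyed by a normalized subquery fingerprint."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def fingerprint(subquery: str) -> str:
        # Simplification: lowercase and collapse whitespace. This would also
        # alter string literals, which a real engine must not do.
        normalized = " ".join(subquery.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_compute(self, subquery: str, compute):
        """Return the cached result, or run compute() once and remember it."""
        key = self.fingerprint(subquery)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = compute()
        self._store[key] = result
        return result


# Hypothetical in-memory stand-in for a table scan over (region, amount) rows.
ROWS = [("eu", 10), ("us", 20), ("eu", 5)]


def totals_by_region() -> dict:
    """The expensive aggregation whose result is worth reusing."""
    totals = {}
    for region, amount in ROWS:
        totals[region] = totals.get(region, 0) + amount
    return totals


cache = SubresultCache()
first = cache.get_or_compute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region", totals_by_region
)
# Same subquery with different casing and spacing: served from the cache.
second = cache.get_or_compute(
    "select region,  SUM(amount) from sales  group by region", totals_by_region
)

if __name__ == "__main__":
    print(first, cache.hits, cache.misses)
```

In a high-concurrency serving workload, many dashboard queries share the same aggregation, which is why reusing the intermediate result can matter more than speeding up any single scan.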

Sponsored by: Informatica | Power Analytics and AI on Databricks With Master (Golden) Record Data

2025-06-11 Watch
lightning_talk
Ajay GOLLAPALLI (Informatica)

Supercharge advanced analytics and AI insights on Databricks with accurate and consistent master data. This session explores how Informatica’s Master Data Management (MDM) integrates with Databricks to provide high-quality, integrated golden record data like customer, supplier, product 360 or reference data to support downstream analytics, Generative AI and Agentic AI. Enterprises can accelerate and de-risk the process of creating a golden record via a no-code/low-code interface, allowing data teams to quickly integrate siloed data and create a complete and consistent record that improves decision-making speed and accuracy.