talk-data.com

Topic: Databricks
Tags: big_data, analytics, spark
509 tagged activities

Activity Trend: 515 peak/qtr (2020-Q1 to 2026-Q1)

Activities

Filtering by: Data + AI Summit 2025
Three Big Unlocks to AI Interoperability with Databricks

The ability for different AI systems to collaborate is more critical than ever. From traditional ML development to fine-tuning GenAI models, Databricks delivers the stability, cost optimization and productivity Expedia Group (EG) needs. Learn how to unlock the full potential of AI interoperability with Databricks:

- AI acceleration: Discover how Databricks acts as a central hub, helping to scale AI model training and prediction generation to deliver high-quality insights for customers.
- Cross-platform synergy: Learn how EG seamlessly integrated Databricks' powerful features into its ecosystem, streamlining workflows and accelerating time to market.
- Scalable deployment: Understand how Databricks' stability and reliability increased efficiency in prototyping and running scalable production workloads.

Join Shiyi Pickrell to understand the future of AI interoperability, how it's generating business value and driving the next generation of AI-powered travel experiences.

Unleash Your Content: AI-Powered Metadata for Targeting, Personalization and Brand Safety

In an era of skyrocketing content volumes, companies are sitting on huge libraries of video, images and audio just waiting to be leveraged to power targeted advertising and recommendations, as well as reinforce brand safety. Coactive AI will show how fast and accurate AI-driven metadata enrichment, combined with Databricks Unity Catalog and the lakehouse, is accelerating and optimizing media workflows. Learn how leading brands are using content metadata to:

- Unlock new revenue through contextual advertising
- Drive personalization at scale
- Enhance brand safety with detailed, scene-level analysis
- Build unified taxonomies that fuel cross-functional insights
- Transform content from a static asset into a dynamic engine for growth, engagement and compliance

Unlocking Enterprise Potential: Key Insights from P&G's Deployment of Unity Catalog at Scale

This session will explore P&G's implementation of Databricks Unity Catalog (UC) to enhance data governance, reduce data redundancy and improve the developer experience through the enablement of a lakehouse architecture. The presentation will cover:

- The distinction between data treated as a product and standard application data, highlighting how UC's structure maximizes the value of data in P&G's data lake
- Real-life examples from two years of using Unity Catalog, demonstrating benefits such as improved governance, reduced waste and enhanced data discovery
- Challenges related to disaster recovery and external data access, along with our collaboration with Databricks to address these issues

Sharing our experience can provide valuable insights for organizations planning to adopt Unity Catalog at enterprise scale.

Adobe’s Security Lakehouse: OCSF, Data Efficiency and Threat Detection at Scale

This session will explore how Adobe uses a sophisticated data security architecture built on the Databricks Data Intelligence Platform, along with the Open Cybersecurity Schema Framework (OCSF), to enable scalable, real-time threat detection across more than 10 PB of security data. We'll compare different approaches to OCSF implementation and demonstrate how Adobe processes massive security datasets efficiently, reducing query times by 18%, maintaining 99.4% SLA compliance, and supporting 286 security users across 17 teams with over 4,500 daily queries. By using the Databricks Platform's serverless compute, scalable architecture and LLM-powered recommendations, Adobe has significantly improved processing speed and efficiency, resulting in substantial cost savings. We'll also highlight how OCSF enables advanced cross-tool analytics and automation, streamlining investigations. Finally, we'll introduce Databricks' new open-source OCSF toolkit for scalable security data normalization and invite the community to contribute.
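The normalization step at the heart of an OCSF pipeline can be sketched in a few lines. This is an illustrative mapping, not Adobe's implementation: the raw record layout and product name are invented, and only a handful of OCSF Authentication-class fields are shown.

```python
# Illustrative sketch: normalizing a vendor-specific auth log into an
# OCSF-style Authentication event. Field names loosely follow the OCSF
# schema; a real mapping would cover many more attributes.
from datetime import datetime, timezone

def to_ocsf_auth_event(raw: dict) -> dict:
    """Map a hypothetical vendor log record to OCSF-like fields."""
    ts = datetime.fromisoformat(raw["ts"]).replace(tzinfo=timezone.utc)
    return {
        "class_uid": 3002,  # Authentication class in OCSF
        "activity_id": 1 if raw["action"] == "login" else 2,  # 1=Logon, 2=Logoff
        "time": int(ts.timestamp() * 1000),  # OCSF uses epoch milliseconds
        "status": "Success" if raw["ok"] else "Failure",
        "user": {"name": raw["user"]},
        "src_endpoint": {"ip": raw["src_ip"]},
        "metadata": {"product": {"name": raw["product"]}},
    }

raw = {"ts": "2025-06-01T12:00:00", "action": "login", "ok": True,
       "user": "alice", "src_ip": "10.0.0.5", "product": "acme-vpn"}
print(to_ocsf_auth_event(raw)["status"])  # Success
```

Once every tool's logs land in this shared shape, the cross-tool queries the session mentions become straightforward joins on common fields.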

Build AI-Powered Applications Natively on Databricks

Discover how to build and deploy AI-powered applications natively on the Databricks Data Intelligence Platform. This session introduces best practices and a standard reference architecture for developing production-ready apps using popular frameworks like Dash, Shiny, Gradio, Streamlit and Flask. Learn how to leverage agents for orchestration and explore primary use cases supported by Databricks Apps, including data visualization, AI applications, self-service analytics and data quality monitoring. With serverless deployment and built-in governance through Unity Catalog, Databricks Apps enables seamless integration with your data and AI models, allowing you to focus on delivering impactful solutions without the complexities of infrastructure management. Whether you're a data engineer or an app developer, this session will equip you with the knowledge to create secure, scalable and efficient applications within a Databricks environment.

Data Strategy in Motion: What Successful Organizations Get Right

Join Robin Sutara, Field CDO for Databricks, as she discusses creating a robust data strategy for organizational change in an ecosystem that is under constant transformation. Attendees will learn best practices from Databricks customers for successful data strategy, including business alignment, people and culture, democratization, governance, and measurement as vital strategic aspects. Understanding these elements will help you drive more data and AI transformation success within your organization.

Delta Sharing in Action: Architecture and Best Practices

Delta Sharing is revolutionizing how enterprises share live data and AI assets securely, openly and at scale. As the industry's first open data-sharing protocol, it empowers organizations to collaborate seamlessly across platforms and with any partner, whether inside or outside the Databricks ecosystem. In this deep-dive session, you'll learn best practices and real-world use cases that show how Delta Sharing helps accelerate collaboration and fuel AI-driven innovation. We'll also unveil the latest advancements, including:

- Managed network configurations for easier, secure setup
- OIDC identity federation for trusted, open sharing
- Expanded asset types, including dynamic views, materialized views, federated tables, read clones and more

Whether you're a data engineer, architect or data leader, you'll leave with practical strategies to future-proof your data-sharing architecture. Don't miss the live demos, expert guidance and an exclusive look at what's next in data collaboration.

This hands-on lab guides participants through the complete customer data analytics journey on Databricks, leveraging leading partner solutions: Fivetran, dbt Cloud and Sigma. Attendees will learn how to:

- Seamlessly connect to Fivetran, dbt Cloud and Sigma using Databricks Partner Connect
- Ingest data using Fivetran, transform and model data with dbt Cloud, and create interactive dashboards in Sigma, all on top of the Databricks Data Intelligence Platform
- Empower teams to make faster, data-driven decisions by streamlining the entire analytics workflow using an integrated, scalable and user-friendly platform

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics, with up to 12x better price/performance than traditional cloud data warehouses. Here's what we'll cover:

- How Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics
- How to manage and monitor compute resources, data access and users across your lakehouse infrastructure
- How to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations
- How to use AI to increase productivity when querying, completing code or building dashboards

Ask your questions during this hands-on lab, and the Databricks experts will guide you.

Data is the backbone of modern decision-making, but centralizing it is only the tip of the iceberg. Entitlements, secure sharing and just-in-time availability are critical challenges to any large-scale platform. Join Goldman Sachs as we reveal how our Legend Lakehouse, coupled with Databricks, overcomes these hurdles to deliver high-quality, governed data at scale. By leveraging an open table format (Apache Iceberg) and open catalog format (Unity Catalog), we ensure platform interoperability and vendor neutrality. Databricks Unity Catalog then provides a robust entitlement system that aligns with our data contracts, ensuring consistent access control across producer and consumer workspaces. Finally, Legend functions, integrating with Databricks User Defined Functions (UDF), offer real-time data enrichment and secure transformations without exposing raw datasets. Discover how these components unite to streamline analytics, bolster governance and power innovation.

Manufacturing Cleaner: How Data Intelligence Cuts Carbon, Not Profits

Join industry leaders from Dow and Michelin as they reveal how data intelligence is revolutionizing sustainable manufacturing without compromising profitability. Dow demonstrates how their implementation of Databricks' Data Intelligence Platform has transformed their ability to track and reduce carbon footprints while driving operational efficiencies, resulting in significant cost savings through optimized maintenance and reduced downtime. Michelin follows with their ambitious strategy to achieve 3% energy consumption reduction by 2026, leveraging Databricks to turn this environmental challenge into operational excellence. Together, these manufacturing giants showcase how modern data architecture and AI are creating a new paradigm where sustainability and profitability go hand-in-hand.

PDF Document Ingestion Accelerator for GenAI Applications

Databricks Financial Services customers in the GenAI space have a common use case: ingesting and processing unstructured documents (PDFs and images), then performing downstream GenAI tasks such as entity extraction and RAG-based knowledge Q&A. The pain points for these types of use cases are:

- The quality of the PDF/image documents varies, since many older physical documents were scanned into electronic form
- The complexity of the documents varies, and many contain tables and images with embedded information, which require slower Tesseract OCR
- Customers would like to streamline post-processing for downstream workloads

In this talk we will present an optimized structured streaming workflow for complex PDF ingestion. The key techniques include Apache Spark™ optimization, multi-threading, PDF object extraction, skew handling and automatic retry logic.
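Two of the techniques the abstract names, multi-threading and automatic retry, can be sketched independently of Spark. Everything here is illustrative: `ocr_page` is a stand-in for a real OCR call (e.g. Tesseract on one PDF page), and the retry parameters are arbitrary.

```python
# Sketch: fan PDF pages out across threads, wrapping the flaky OCR step
# in retry-with-backoff. Threads help because OCR work is typically
# I/O- or subprocess-bound rather than pure Python.
import time
from concurrent.futures import ThreadPoolExecutor

def with_retry(fn, attempts=3, backoff=0.01):
    """Call fn, retrying with exponential backoff on failure."""
    for i in range(attempts):
        try:
            return fn()
        except RuntimeError:
            if i == attempts - 1:
                raise
            time.sleep(backoff * 2 ** i)

def ocr_page(page_id: int) -> str:
    # Stand-in for a slow, occasionally failing OCR call on one page.
    return f"text-of-page-{page_id}"

def ingest_pdf(page_ids):
    # pool.map preserves input order, so page text comes back in sequence.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda p: with_retry(lambda: ocr_page(p)), page_ids))

print(ingest_pdf(range(3)))  # ['text-of-page-0', 'text-of-page-1', 'text-of-page-2']
```

In the streaming workflow the talk describes, the same idea would run inside each Spark task, with skew handling deciding how pages are batched across tasks.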

Revolutionizing PepsiCo BI Capabilities: From Traditional BI to Next-Gen Analytics Powerhouse

This session will provide an in-depth overview of how PepsiCo, a global leader in food and beverage, transformed its outdated data platform into a modern, unified and centralized data and AI-enabled platform using the Databricks SQL serverless environment. Through three distinct implementations that transpired at PepsiCo in 2024, we will demonstrate how the PepsiCo Data Analytics & AI Group unlocked pivotal capabilities that facilitated the delivery of diverse data-driven insights to the business, reduced operational expenses and enhanced overall performance through the newly implemented platform.

Securing the Future: How Banks are Reducing Risk With Data and AI
talk
by Nitin Kulkarni (Nationwide Building Society), Gordon Wilson (Sumitomo Mitsui Banking Corporation), Thomas Sawyer (Sumitomo Mitsui Banking Corp.), Cyril Cymbler (Databricks)

Today, executives are focused on managing regulatory scrutiny and emerging threats. Banks worldwide are leveraging the Databricks Data Intelligence Platform to enhance fraud prevention, ensure compliance and protect sensitive data while improving operational efficiency. This session will highlight how leading banks are implementing AI-driven risk management to identify vulnerabilities, streamline governance and enhance resilience. By utilizing unified data platforms, these institutions can effectively tackle threats and foster trust without hindering growth. Key takeaways:

- Fraud detection: Best practices for using machine learning to combat fraud
- Regulatory compliance: Insights on navigating complex regulations
- Secure operations: Strategies for scalable operations that protect assets and support growth

Join us to see how data intelligence is reshaping the banking industry and enabling success in uncertain times!

Sponsored by: Google Cloud | Unlock price-performance and efficiency on Google Cloud: Databricks & Axion in Action

Maximize the performance of your Databricks Platform with innovations on Google Cloud. Discover how Google's Arm-based Axion C4A virtual machines (VMs) deliver breakthrough price-performance and efficiency for Databricks, supercharging Databricks Photon engine. Gain actionable strategies to optimize your Databricks deployments on Google Cloud.

Take it to the Limit: Art of the Possible in AI/BI

Think you know everything AI/BI can do? Think again. This session explores the art of the possible with Databricks AI/BI Dashboards and Genie, going beyond traditional analytics to unleash the full power of the lakehouse. From incorporating AI into dashboards to handling large-scale data with ease to delivering insights seamlessly to end users — we’ll showcase creative approaches that unlock insights and real business outcomes. Perfect for adventurous data professionals looking to push limits and think outside the box.

Transforming Bio-Pharma Manufacturing: Eli Lilly's Data-Driven Journey With Databricks

Eli Lilly and Company, a leading bio-pharma company, is revolutionizing manufacturing with next-gen fully digital sites. Lilly and Tredence have partnered to establish a Databricks-powered Global Manufacturing Data Fabric (GMDF), laying the groundwork for transformative data products used by various personas at sites and globally. By integrating data from various manufacturing systems into a unified data model, GMDF has delivered actionable insights across several use cases such as batch release by exception, predictive maintenance, anomaly detection, process optimization and more. Our serverless architecture leverages Databricks Auto Loader for real-time data streaming, PySpark for automation and Unity Catalog for governance, ensuring seamless data processing and optimization. This platform is the foundation for data-driven processes, self-service analytics, AI and more. This session will provide details on the data architecture and strategy and share a few use cases delivered.
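The incremental pattern behind Auto Loader, processing only files that have not been seen before, can be mimicked in plain Python. This is a conceptual sketch with an in-memory checkpoint; the real feature uses Spark's `cloudFiles` source with durable checkpointing and scalable file discovery.

```python
# Sketch of incremental file ingestion: a checkpoint records which files
# have already been processed, so each run picks up only new arrivals.
def incremental_load(files, checkpoint: set):
    """Return files not yet processed and record them in the checkpoint."""
    new = [f for f in sorted(files) if f not in checkpoint]
    checkpoint.update(new)
    return new

seen = set()
print(incremental_load(["a.json", "b.json"], seen))            # ['a.json', 'b.json']
print(incremental_load(["a.json", "b.json", "c.json"], seen))  # ['c.json']
```

Because the checkpoint is the source of truth, reruns are idempotent: re-listing the same directory produces no duplicate work.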

Unity Catalog Deep Dive: Practitioner's Guide to Best Practices and Patterns

Join this deep dive session for practitioners on Unity Catalog, Databricks’ unified data governance solution, to explore its capabilities for managing data and AI assets across workflows. Unity Catalog provides fine-grained access control, automated lineage tracking, quality monitoring and policy enforcement and observability at scale. Whether your focus is data pipelines, analytics or machine learning and generative AI workflows, this session offers actionable insights on leveraging Unity Catalog’s open interoperability across tools and platforms to boost productivity and drive innovation. Learn governance best practices, including catalog configurations, access strategies for collaboration and controls for securing sensitive data. Additionally, discover how to design effective multi-cloud and multi-region deployments to ensure global compliance.

Unleashing Data Governance at iFood: Harnessing System Tables and Lineage for Dynamic Tag Propagation

With regulations like LGPD (Brazil's General Data Protection Law) and GDPR, managing sensitive data access is critical. This session demonstrates how to leverage Databricks Unity Catalog system tables and data lineage to dynamically propagate classification tags, empowering organizations to monitor governance and ensure compliance. The presentation covers practical steps, including system table usage, data normalization, ingestion with Lakeflow Declarative Pipelines and classification tag propagation to downstream tables. It also explores permission monitoring with alerts to proactively address governance risks. Designed for advanced audiences, this session offers actionable strategies to strengthen data governance, prevent breaches and avoid regulatory fines while building scalable frameworks for sensitive data management.
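The propagation step the session describes can be modeled as a graph traversal: given upstream-to-downstream edges (of the kind exposed by Unity Catalog's lineage system tables) and classification tags on source tables, push each tag to every reachable downstream table. A minimal sketch with invented table names:

```python
# Sketch: propagate classification tags along lineage edges until no
# table gains a new tag (a breadth-first fixed point).
from collections import deque

def propagate_tags(edges, tags):
    """edges: {table: [downstream tables]}; tags: {table: {tag, ...}}."""
    result = {t: set(v) for t, v in tags.items()}
    queue = deque(tags)
    while queue:
        t = queue.popleft()
        for child in edges.get(t, []):
            child_tags = result.setdefault(child, set())
            if not result[t] <= child_tags:   # child is missing some tags
                child_tags |= result[t]
                queue.append(child)           # re-check its own downstreams
    return result

edges = {"raw.users": ["silver.users"], "silver.users": ["gold.report"]}
tags = {"raw.users": {"pii"}}
print(propagate_tags(edges, tags)["gold.report"])  # {'pii'}
```

In practice the edges would be read from lineage system tables and the resulting tags applied back to the catalog, with the permission-monitoring alerts the session mentions flagging any newly tagged table with overly broad access.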