This hands-on lab guides participants through the complete customer data analytics journey on Databricks, leveraging leading partner solutions: Fivetran, dbt Cloud, and Sigma. Attendees will learn how to:
- Seamlessly connect to Fivetran, dbt Cloud, and Sigma using Databricks Partner Connect
- Ingest data using Fivetran, transform and model data with dbt Cloud, and create interactive dashboards in Sigma, all on top of the Databricks Data Intelligence Platform
- Empower teams to make faster, data-driven decisions by streamlining the entire analytics workflow using an integrated, scalable, and user-friendly platform
Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics — with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover:
- How Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics
- How to manage and monitor compute resources, data access and users across your lakehouse infrastructure
- How to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations (see the sketch after this list)
- How to use AI to increase productivity when querying, completing code or building dashboards
Ask your questions during this hands-on lab, and the Databricks experts will guide you.
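For a taste of querying the lakehouse from your own tools, here is a minimal sketch using the open source databricks-sql-connector Python package; the hostname, HTTP path, token and table name are placeholders you would swap for your own workspace values, not part of the session material.

```python
# Minimal sketch: query a Databricks SQL warehouse from Python.
# Hostname, HTTP path, token, and table name are hypothetical placeholders.
from databricks import sql  # pip install databricks-sql-connector

with sql.connect(
    server_hostname="your-workspace.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/your-warehouse-id",
    access_token="your-personal-access-token",
) as connection:
    with connection.cursor() as cursor:
        # Any SQL the warehouse understands can run here.
        cursor.execute(
            "SELECT region, SUM(amount) AS revenue "
            "FROM main.sales.orders GROUP BY region"
        )
        for row in cursor.fetchall():
            print(row)
```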
Data is the backbone of modern decision-making, but centralizing it is only the tip of the iceberg. Entitlements, secure sharing and just-in-time availability are critical challenges to any large-scale platform. Join Goldman Sachs as we reveal how our Legend Lakehouse, coupled with Databricks, overcomes these hurdles to deliver high-quality, governed data at scale. By leveraging an open table format (Apache Iceberg) and open catalog format (Unity Catalog), we ensure platform interoperability and vendor neutrality. Databricks Unity Catalog then provides a robust entitlement system that aligns with our data contracts, ensuring consistent access control across producer and consumer workspaces. Finally, Legend functions, integrating with Databricks User Defined Functions (UDF), offer real-time data enrichment and secure transformations without exposing raw datasets. Discover how these components unite to streamline analytics, bolster governance and power innovation.
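The abstract describes a pattern rather than code, but the UDF-based secure transformation it mentions can be illustrated with a small PySpark sketch; the table, column names and masking rule below are invented for illustration and are not Goldman Sachs' actual Legend functions.

```python
# Illustrative only: a PySpark UDF that applies a secure transformation
# (masking) so consumers get enriched output without the raw identifier.
# Table, column names and masking rule are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

@udf(StringType())
def mask_account(account_number: str) -> str:
    # Keep only the last four characters; hide the rest.
    if account_number is None:
        return None
    return "*" * max(len(account_number) - 4, 0) + account_number[-4:]

trades = spark.table("finance.trading.trades")  # hypothetical governed table
(trades.withColumn("account_masked", mask_account("account_number"))
       .drop("account_number")
       .show())
```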
Join industry leaders from Dow and Michelin as they reveal how data intelligence is revolutionizing sustainable manufacturing without compromising profitability. Dow demonstrates how their implementation of Databricks' Data Intelligence Platform has transformed their ability to track and reduce carbon footprints while driving operational efficiencies, resulting in significant cost savings through optimized maintenance and reduced downtime. Michelin follows with their ambitious strategy to achieve 3% energy consumption reduction by 2026, leveraging Databricks to turn this environmental challenge into operational excellence. Together, these manufacturing giants showcase how modern data architecture and AI are creating a new paradigm where sustainability and profitability go hand-in-hand.
Databricks Financial Services customers in the GenAI space share a common use case: ingesting and processing unstructured documents — PDFs and images — and then performing downstream GenAI tasks such as entity extraction and RAG-based knowledge Q&A. The pain points for these use cases are:
- The quality of the PDF/image documents varies, since many older physical documents were scanned into electronic form
- The complexity of the PDF/image documents varies, and many contain tables — images with embedded information — which require slower Tesseract OCR
- Customers would like to streamline postprocessing for downstream workloads
In this talk we will present an optimized structured streaming workflow for complex PDF ingestion (sketched below). The key techniques include Apache Spark™ optimization, multi-threading, PDF object extraction, skew handling and auto-retry logic.
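The abstract names the techniques but shows no code; the streaming skeleton might look like the following minimal sketch, assuming Auto Loader's binaryFile source and the pypdf library for extraction. The paths, table name and retry parameters are hypothetical stand-ins for the talk's optimized workflow.

```python
# Minimal sketch of streaming PDF ingestion with Auto Loader's binaryFile
# source; extract_text_with_retry, paths and table names are hypothetical.
import time
from io import BytesIO
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

@udf(StringType())
def extract_text_with_retry(content: bytes) -> str:
    # Stand-in for the talk's auto-retry logic around flaky extractions.
    from pypdf import PdfReader
    for attempt in range(3):
        try:
            reader = PdfReader(BytesIO(content))
            return "\n".join(page.extract_text() or "" for page in reader.pages)
        except Exception:
            time.sleep(2 ** attempt)  # back off, then retry
    return None

raw = (spark.readStream
       .format("cloudFiles")
       .option("cloudFiles.format", "binaryFile")
       .load("/Volumes/finserv/raw/pdfs/"))  # hypothetical landing path

(raw.withColumn("text", extract_text_with_retry("content"))
    .writeStream
    .option("checkpointLocation", "/Volumes/finserv/chk/pdf_ingest/")
    .toTable("finserv.bronze.pdf_text"))
```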
This session will provide an in-depth overview of how PepsiCo, a global leader in food and beverage, transformed its outdated data platform into a modern, unified and centralized data and AI-enabled platform using the Databricks SQL serverless environment. Through three distinct implementations delivered at PepsiCo in 2024, we will demonstrate how the PepsiCo Data Analytics & AI Group unlocked pivotal capabilities that facilitated the delivery of diverse data-driven insights to the business, reduced operational expenses and enhanced overall performance through the newly implemented platform.
Today, executives are focused on managing regulatory scrutiny and emerging threats. Banks worldwide are leveraging the Databricks Data Intelligence Platform to enhance fraud prevention, ensure compliance and protect sensitive data while improving operational efficiency. This session will highlight how leading banks are implementing AI-driven risk management to identify vulnerabilities, streamline governance and enhance resilience. By utilizing unified data platforms, these institutions can effectively tackle threats and foster trust without hindering growth. Key takeaways:
- Fraud detection: Best practices for using machine learning to combat fraud
- Regulatory compliance: Insights on navigating complex regulations
- Secure operations: Strategies for scalable operations that protect assets and support growth
Join us to see how data intelligence is reshaping the banking industry and enabling success in uncertain times!
Maximize the performance of your Databricks Platform with innovations on Google Cloud. Discover how Google's Arm-based Axion C4A virtual machines (VMs) deliver breakthrough price-performance and efficiency for Databricks, supercharging the Databricks Photon engine. Gain actionable strategies to optimize your Databricks deployments on Google Cloud.
Think you know everything AI/BI can do? Think again. This session explores the art of the possible with Databricks AI/BI Dashboards and Genie, going beyond traditional analytics to unleash the full power of the lakehouse. From incorporating AI into dashboards to handling large-scale data with ease to delivering insights seamlessly to end users — we’ll showcase creative approaches that unlock insights and real business outcomes. Perfect for adventurous data professionals looking to push limits and think outside the box.
Eli Lilly and Company, a leading bio-pharma company, is revolutionizing manufacturing with next-gen fully digital sites. Lilly and Tredence have partnered to establish a Databricks-powered Global Manufacturing Data Fabric (GMDF), laying the groundwork for transformative data products used by various personas at sites and globally. By integrating data from various manufacturing systems into a unified data model, GMDF has delivered actionable insights across several use cases such as batch release by exception, predictive maintenance, anomaly detection, process optimization and more. Our serverless architecture leverages Databricks Auto Loader for real-time data streaming, PySpark for automation and Unity Catalog for governance, ensuring seamless data processing and optimization. This platform is the foundation for data-driven processes, self-service analytics, AI and more. This session will provide details on the data architecture and strategy and share a few of the use cases delivered.
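The abstract names Auto Loader, PySpark and Unity Catalog without showing how they fit together; a generic Auto Loader pattern of the kind referenced might look like the sketch below. The paths, schemas and table names are hypothetical, not Lilly's actual GMDF pipeline.

```python
# Generic Auto Loader pattern (hypothetical paths and tables, not GMDF's
# code): continuously ingest new files into a Unity Catalog-governed table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

stream = (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation", "/Volumes/mfg/chk/schemas/")
          .load("/Volumes/mfg/raw/sensor_events/"))

(stream.writeStream
       .option("checkpointLocation", "/Volumes/mfg/chk/sensor_events/")
       .option("mergeSchema", "true")   # tolerate evolving source schemas
       .trigger(availableNow=True)      # drain new files, then stop
       .toTable("mfg.bronze.sensor_events"))
```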
Join this deep dive session for practitioners on Unity Catalog, Databricks’ unified data governance solution, to explore its capabilities for managing data and AI assets across workflows. Unity Catalog provides fine-grained access control, automated lineage tracking, quality monitoring, policy enforcement and observability at scale. Whether your focus is data pipelines, analytics or machine learning and generative AI workflows, this session offers actionable insights on leveraging Unity Catalog’s open interoperability across tools and platforms to boost productivity and drive innovation. Learn governance best practices, including catalog configurations, access strategies for collaboration and controls for securing sensitive data. Additionally, discover how to design effective multi-cloud and multi-region deployments to ensure global compliance.
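As a concrete flavor of the catalog configurations and access strategies mentioned above, here is a small sketch of group-based grants in Unity Catalog; the catalog, schema and group names are hypothetical examples, not content from the session.

```python
# Sketch of Unity Catalog access configuration (names are hypothetical):
# a three-level namespace with group-based, schema-scoped read access.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("CREATE CATALOG IF NOT EXISTS analytics")
spark.sql("CREATE SCHEMA IF NOT EXISTS analytics.finance")

# Grant an analyst group read access, scoped to a single schema.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data-analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA analytics.finance TO `data-analysts`")
spark.sql("GRANT SELECT ON SCHEMA analytics.finance TO `data-analysts`")
```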
With regulations like LGPD (Brazil's General Data Protection Law) and GDPR, managing sensitive data access is critical. This session demonstrates how to leverage Databricks Unity Catalog system tables and data lineage to dynamically propagate classification tags, empowering organizations to monitor governance and ensure compliance. The presentation covers practical steps, including system table usage, data normalization, ingestion with Lakeflow Declarative Pipelines and classification tag propagation to downstream tables. It also explores permission monitoring with alerts to proactively address governance risks. Designed for advanced audiences, this session offers actionable strategies to strengthen data governance, prevent breaches and avoid regulatory fines while building scalable frameworks for sensitive data management.
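The abstract stops short of code; a simplified version of lineage-driven tag propagation, assuming the documented system.access.table_lineage system table and Unity Catalog SQL tag support, could look like the sketch below. The tag name and value are illustrative.

```python
# Simplified sketch of lineage-driven tag propagation. Assumes access to
# system.access.table_lineage and information_schema.table_tags; the tag
# name and value below are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Find downstream tables fed by tables already tagged as sensitive.
downstream = spark.sql("""
    SELECT DISTINCT l.target_table_full_name
    FROM system.access.table_lineage AS l
    JOIN system.information_schema.table_tags AS t
      ON l.source_table_full_name =
         concat_ws('.', t.catalog_name, t.schema_name, t.table_name)
    WHERE t.tag_name = 'classification'
      AND t.tag_value = 'sensitive'
      AND l.target_table_full_name IS NOT NULL
""")

for row in downstream.collect():
    # Propagate the classification tag to each downstream table.
    spark.sql(
        f"ALTER TABLE {row.target_table_full_name} "
        "SET TAGS ('classification' = 'sensitive')"
    )
```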
Maximize the value of your company’s marketing efforts with Data Intelligence for Marketing. Databricks provides seamless, out-of-the-box integration with your ecosystem, empowering every marketer with self-serve insights. And with an AI-driven CDP, you get a complete view of customers and campaigns.
Databricks SQL is the fastest-growing data warehouse on the market, with over 10,000 organizations, thanks to its price/performance and AI innovations. See the best practices and common architectural challenges of migrating your legacy data warehouse, including reference architectures. Learn how to migrate easily with the recently acquired Lakebridge migration tool and through our partners.
Stuck on a treadmill of endless report building requests? Wondering how you can ship reliable AI products to internal users and even customers? Omni is a BI and embedded analytics platform on Databricks that lets users answer their own data questions – sometimes with a little AI help. No magic, no miracles – just smart tooling that cuts through the noise and leverages well-known concepts (semantic layer, anyone?) to improve accuracy and delight users. This talk is your blueprint for getting reliable AI use cases into production and reaching the promised land of contagious self-service.
Data sharing doesn’t have to be complicated. In this session, we’ll take a practical look at Delta Sharing in Databricks — what it is, how it works and how it fits into your organization’s data ecosystem. The focus will be on giving an overview of the different ways to share data using Databricks, from direct sharing setups to broader distribution via the Databricks Marketplace and more collaborative approaches like Clean Rooms. This talk is meant for anyone curious about modern, secure data sharing — whether you're just getting started or looking to expand your use of Databricks. Attendees will walk away with a clearer picture of what’s possible, what’s required to get started and how to choose the right sharing method for the right scenario.
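To make the direct-sharing option concrete, a minimal provider-side Delta Sharing setup can be expressed in a few SQL statements, shown here from PySpark; the share, recipient and table names are placeholders rather than session content.

```python
# Minimal provider-side Delta Sharing sketch (share, recipient and table
# names are placeholders). Requires appropriate sharing privileges.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# 1. Create a share and add a table to it.
spark.sql("CREATE SHARE IF NOT EXISTS quarterly_metrics")
spark.sql("ALTER SHARE quarterly_metrics ADD TABLE sales.gold.kpi_summary")

# 2. Create a recipient and grant it access to the share.
spark.sql("CREATE RECIPIENT IF NOT EXISTS partner_co")
spark.sql("GRANT SELECT ON SHARE quarterly_metrics TO RECIPIENT partner_co")
```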
Nestlé USA, a division of the world’s largest food and beverage company, Nestlé S.A., has embarked on a transformative journey to unlock GenAI capabilities on their data platform. Deloitte, Databricks, and Nestlé have collaborated on a data platform modernization program to address gaps in Nestlé’s existing data platform. This joint effort introduces new possibilities and capabilities, ranging from developing advanced machine learning models to implementing Unity Catalog and adopting Lakehouse Federation, all while adhering to confidentiality protocols. With help from Deloitte and Databricks, Nestlé USA is now able to meet its advanced enterprise analytics and AI needs with the Databricks Data Intelligence Platform.
Domo's Databricks integration seamlessly connects business users to both Delta Lake data and AI/ML models, eliminating technical barriers while maximizing performance. Domo's Cloud Amplifier optimizes data processing through pushdown SQL, while the Domo AI Services layer enables anyone to leverage both traditional ML and large language models directly from Domo. During this session, we’ll explore an AI solution around fraud detection to demonstrate the power of leveraging Domo on Databricks.
In the rapidly evolving landscape of pharmaceuticals, the integration of AI and GenAI is transforming how organizations operate and deliver value. We will explore the profound impact of the AI program at Takeda Pharmaceuticals and the central role of Databricks. We will delve into eight pivotal AI/GenAI use cases that enhance operational efficiency across commercial, R&D, manufacturing, and back-office functions, including these capabilities:
- Responsible AI Guardrails: Scanners that validate and enforce responsible AI controls on GenAI solutions
- Reusable Databricks Native Vectorization Pipeline: A scalable solution enhancing data processing with quality and governance (see the sketch after this list)
- One-Click Deployable RAG Pattern: Simplifying deployment for AI applications, enabling rapid experimentation and innovation
- AI Asset Registry: A repository for foundational models, vector stores, and APIs, promoting reuse and collaboration
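The vectorization pipeline itself is not shown in the abstract; one common shape for such a pipeline on Databricks is to embed document chunks with a pandas UDF and land the vectors in a governed Delta table, as in the sketch below. The model choice, tables and column names are hypothetical, not Takeda's implementation.

```python
# Illustrative vectorization pipeline sketch (not Takeda's implementation):
# embed document chunks with a pandas UDF and write them to a Delta table.
# Model choice, table and column names are hypothetical.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import ArrayType, FloatType

spark = SparkSession.builder.getOrCreate()

@pandas_udf(ArrayType(FloatType()))
def embed(texts: pd.Series) -> pd.Series:
    # Loaded per batch for simplicity; any embedding model would do here.
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("all-MiniLM-L6-v2")
    return pd.Series(model.encode(texts.tolist()).tolist())

chunks = spark.table("rnd.silver.doc_chunks")           # hypothetical source
(chunks.withColumn("embedding", embed("chunk_text"))
       .write.mode("overwrite")
       .saveAsTable("rnd.gold.doc_embeddings"))         # governed target
```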
To meet the growing internal demand for accessible, reliable data, TradeStation migrated from fragmented, spreadsheet-driven workflows to a scalable, self-service analytics framework powered by Sigma on Databricks. This transition enabled business and technical users alike to interact with governed data models directly on the lakehouse, eliminating data silos and manual reporting overhead. In brokerage trading operations, the integration supports robust risk management, automates key operational workflows, and centralizes collaboration across teams. By leveraging Sigma’s intuitive interface on top of Databricks’ scalable compute and unified data architecture, TradeStation has accelerated time-to-insight, improved reporting consistency, and empowered teams to operationalize data-driven decisions at scale.