talk-data.com

Topic: Databricks
Tags: big_data, analytics, spark
1286 activities tagged

Activity Trend: 515 peak/qtr (2020-Q1 to 2026-Q1)

Activities

1286 activities · Newest first

Transforming Data at Rheem: From Silos to Scalable Data Lakehouse With Databricks and Unity Catalog

Rheem's journey from a fragmented data landscape to a robust, scalable data platform powered by Databricks showcases the power of data modernization. In just 1.5 years, Rheem evolved from siloed reporting to 30+ certified data products, integrated with 20+ source systems, including MDM. This transformation has unlocked significant business value across sales, procurement, service and operations, enhancing decision-making and operational efficiency. This session will delve into Rheem's implementation of Databricks, highlighting how it has become the cornerstone of rapid data product development and efficient data sharing across the organization. We will also explore the upcoming enhancements with Unity Catalog, including the full migration from HMS to UC. Attendees will gain insights into best practices for building a centralized data platform, enhancing developer experience, improving governance capabilities as well as tips and tricks for a successful UC migration and enablement.

Unifying Customer Data to Drive a New Automotive Experience With Lakeflow Connect

The Databricks Data Intelligence Platform and Lakeflow Connect have transformed how Porsche manages and uses its customer data. By opting to use Lakeflow Connect instead of building a custom solution, the company has reaped the benefits of both operational efficiency and cost management. Internally, teams at Porsche now spend less time managing data integration processes. “Lakeflow Connect has enabled our dedicated CRM and Data Science teams to be more productive as they can now focus on their core work to help innovate, instead of spending valuable time on the data ingestion integration with Salesforce,” says Gruber. This shift in focus is aligned with broader industry trends, where automotive companies are redirecting significant portions of their IT budgets toward customer experience innovations and digital transformation initiatives. This story was also shared as part of a Databricks Success Story — Elise Georis, Giselle Goicochea.

Unity Catalog Implementation & Evolution at Edward Jones

This presentation outlines the evolution of Databricks and its integration with cloud analytics at Edward Jones. It focuses on the transition from Cloud V1.x to Cloud V2.0, highlighting the challenges faced with the initial setup and Unity Catalog implementation, as well as the improvements planned for the future, particularly around data cataloging, architecture and disaster recovery. Highlights: the cloud analytics journey; the current setup (Cloud V1.x), which uses a medallion architecture customized to Edward Jones' needs, along with the challenges and limitations identified around integration, limited catalogs and disaster recovery; the Cloud V2.0 enhancements, including modifications to storage and compute in the medallion layers, next-level integration with enterprise suites and disaster recovery readiness; and the future outlook.

Breaking Silos: Using SAP Business Data Cloud and Delta Sharing for Seamless Access to SAP Data in Databricks

We’re excited to share with you how SAP Business Data Cloud supports Delta Sharing to share SAP data securely and seamlessly with Databricks—no complex ETL or data duplication required. This enables organizations to securely share SAP data for analytics and AI in Databricks while also supporting bidirectional data sharing back to SAP. In this session, we’ll demonstrate the integration in action, followed by a discussion of how the global beauty group, Natura, will leverage this solution. Whether you’re looking to bring SAP data into Databricks for advanced analytics or build AI models on top of trusted SAP datasets, this session will show you how to get started — securely and efficiently.
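
As a rough illustration of the consumption side, the sketch below shows how a Databricks recipient might mount a Delta Share as a catalog and query the shared SAP data. The provider, share, schema and table names are hypothetical, and the exact setup with SAP Business Data Cloud may differ.

```python
# Hypothetical sketch: querying SAP data exposed via Delta Sharing from Databricks.
# Assumes a Unity Catalog-enabled workspace and that a provider named `sap_bdc`
# has already shared a share called `sales_share`; all names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Mount the incoming share as a catalog (names are illustrative).
spark.sql("CREATE CATALOG IF NOT EXISTS sap_sales USING SHARE sap_bdc.sales_share")

# Query the shared SAP table like any other Unity Catalog table -- no ETL, no copies.
df = spark.sql("""
    SELECT customer_id, SUM(net_amount) AS total_net
    FROM sap_sales.erp.sales_orders
    GROUP BY customer_id
""")
df.show()
```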

Busting Data Modeling Myths: Truths and Best Practices for Data Modeling in the Lakehouse

Unlock the truth behind data modeling in Databricks. This session will tackle the top 10 myths surrounding relational and dimensional data modeling. Attendees will gain a clear understanding of what the Databricks Lakehouse truly supports today, including how to leverage primary and foreign keys, identity columns for surrogate keys, column-level data quality constraints and much more. The session views data modeling through the lens of the medallion architecture, explaining how to implement data models across bronze, silver and gold tables. Whether you’re migrating from a legacy warehouse or building new analytics solutions, you’ll leave equipped to fully leverage Databricks’ capabilities and design scalable, high-performance data models for enterprise analytics.
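
For context on the capabilities the session mentions, here is a minimal sketch (not from the talk) of a gold-layer dimension and fact table using an identity column as a surrogate key plus informational primary/foreign key constraints. The catalog, schema and column names are invented for illustration and assume a Unity Catalog environment.

```python
# Illustrative only: surrogate keys via identity columns and informational PK/FK
# constraints on Unity Catalog managed tables. Names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS main.gold.dim_customer (
        customer_sk   BIGINT GENERATED ALWAYS AS IDENTITY,  -- surrogate key
        customer_id   STRING NOT NULL,                       -- business/natural key
        customer_name STRING,
        CONSTRAINT pk_dim_customer PRIMARY KEY (customer_sk)
    )
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS main.gold.fct_sales (
        sale_id     BIGINT,
        customer_sk BIGINT,
        amount      DECIMAL(18, 2),
        CONSTRAINT fk_sales_customer FOREIGN KEY (customer_sk)
            REFERENCES main.gold.dim_customer (customer_sk)
    )
""")
```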

Data Intelligence on Unity Catalog Managed Tables Powered by Predictive Optimization

In this session, we’ll explore the data intelligence capabilities within Databricks, focusing on Predictive Optimization. This feature enhances the performance of Unity Catalog managed tables by automatically optimizing data layouts, resulting in improved query performance and reduced storage costs. You’ll learn how Predictive Optimization works and see real-world examples of customers using it to fully automate data layout management. We’ll also share a preview of the exciting features and enhancements coming down the road.
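
As a small, hedged example of what this looks like in practice, predictive optimization can be enabled at the catalog, schema or table level, or inherited from the parent object; the schema and table names below are placeholders.

```python
# Illustrative sketch: enabling predictive optimization on Unity Catalog managed tables.
# Assumes a workspace where the feature is available; `main.gold.fct_sales` is a placeholder.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Enable for every managed table in a schema...
spark.sql("ALTER SCHEMA main.gold ENABLE PREDICTIVE OPTIMIZATION")

# ...or explicitly for a single managed table.
spark.sql("ALTER TABLE main.gold.fct_sales ENABLE PREDICTIVE OPTIMIZATION")

# A table can also be set back to inherit the setting from its schema/catalog.
spark.sql("ALTER TABLE main.gold.fct_sales INHERIT PREDICTIVE OPTIMIZATION")
```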

GenAI for SQL & ETL: Build Multimodal AI Workflows at Scale

Enterprises generate massive amounts of unstructured data — from support tickets and PDFs to emails and product images. But extracting insight from that data typically requires brittle pipelines and complex tools. Databricks AI Functions make this simpler. In this session, you’ll learn how to apply powerful language and vision models directly within your SQL and ETL workflows — no endpoints, no infrastructure, no rewrites. We’ll explore practical use cases and best practices for analyzing complex documents, classifying issues, translating content and inspecting images — all in a way that’s scalable, declarative and secure. What you’ll learn:
- How to run state-of-the-art LLMs like GPT-4, Claude Sonnet 4 and Llama 4 on your data
- How to build scalable, multimodal ETL workflows for text and images
- Best practices for prompts, cost and error handling in production
- Real-world examples of GenAI use cases powered by AI Functions
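
To make the idea concrete, here is a hedged sketch of AI Functions applied inline in a query. The table, columns and serving endpoint name are placeholders, and the set of available functions and models varies by workspace.

```python
# Illustrative sketch: Databricks AI Functions used inline in SQL from PySpark.
# `main.support.support_tickets` and its columns are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.sql("""
    SELECT
        ticket_id,
        -- Classify each ticket into a fixed set of labels
        ai_classify(body, ARRAY('billing', 'bug', 'feature_request')) AS category,
        -- Translate non-English content to English
        ai_translate(body, 'en')                                      AS body_en,
        -- Free-form summarization via a served foundation model
        ai_query(
            'databricks-meta-llama-3-3-70b-instruct',   -- placeholder endpoint name
            CONCAT('Summarize this support ticket in one sentence: ', body)
        )                                                             AS summary
    FROM main.support.support_tickets
""")
df.show(truncate=False)
```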

How Arctic Wolf Modernizes Cloud Security and Enhances Threat Detection with Databricks

In this session, you’ll gain actionable insights to modernize your security operations and strengthen cyber resilience. Arctic Wolf will highlight how they used Databricks to eliminate data silos and enhance their MDR pipeline for investigating suspicious threat actors on behalf of customers.

How Blue Origin Accelerates Innovation With Databricks and AWS GovCloud

Blue Origin is revolutionizing space exploration with a mission-critical data strategy powered by Databricks on AWS GovCloud. Learn how they leverage Databricks to meet ITAR and FedRAMP High compliance, streamline manufacturing and accelerate their vision of a 24/7 factory. Key use cases include predictive maintenance, real-time IoT insights and AI-driven tools that transform CAD designs into factory instructions. Discover how Delta Lake, Structured Streaming and advanced Databricks functionalities like Unity Catalog enable real-time analytics and future-ready infrastructure, helping Blue Origin stay ahead in the race to adopt generative AI and serverless solutions.

How Feastables Partners With Engine to Leverage Advanced Data Models and AI for Smarter BI

Feastables, founded by YouTube sensation MrBeast, partnered with Engine to build a modern, AI-enabled BI ecosystem that transforms complex, disparate data into actionable insights, driving smarter decision-making across the organization. In this session, learn how Engine, a Built-On Databricks Partner, brought the expertise and strategic partnerships that enabled Feastables to rapidly stand up a secure, modern data estate, unifying complex internal and external data sources into a single, permissioned analytics platform. Feastables unlocked the power of cross-functional collaboration by democratizing data access throughout the enterprise and seamlessly integrating financial, retailer, supply chain, syndicated, merchandising and e-commerce data. Discover how a scalable analytics framework, combined with advanced AI models and tools, empowers teams across sales, marketing, supply chain, finance and executive leadership with smarter BI and real-time decision-making at scale.

How FedEx Achieved Self-Serve Analytics and Data Democratization on Databricks

FedEx, a global leader in transportation and logistics, faced a common challenge in the era of big data: how to democratize data and foster data-driven decision-making when thousands of data practitioners at FedEx want to build models, get real-time insights, explore enterprise data and build enterprise-grade solutions to run the business. This breakout session will highlight how FedEx overcame challenges in data governance and security using Unity Catalog, ensuring that sensitive information remains protected while still allowing appropriate access across the organization. We'll share their approach to building intuitive self-service interfaces, including the use of natural-language processing to enable non-technical users to query data effortlessly. The tangible outcomes of this initiative are numerous, but chiefly: increased data literacy across the company, faster time-to-insight for business decisions and significant cost savings through improved operational efficiency.

How to Migrate from Teradata to Databricks SQL

Storage and processing costs of your legacy Teradata data warehouses impact your ability to deliver. Migrating your legacy Teradata data warehouse to the Databricks Data Intelligence Platform can accelerate your data modernization journey. In this session, learn the top strategies for completing this data migration. We will cover data type conversion, basic-to-complex code conversion, validation and reconciliation best practices, and how to use Databricks natively hosted LLMs to assist with migration activities. See before-and-after architectures of customers who have migrated, and learn about the benefits they realized.
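
As a rough illustration of the data type conversion step, the sketch below maps a few common Teradata column types to Databricks SQL types during DDL conversion. It is a simplified example, not the session's actual tooling, and mappings for specialized types should be validated case by case.

```python
# Hypothetical sketch of a Teradata-to-Databricks SQL type mapping used when
# converting DDL. Simplified for illustration; real migrations need per-column review.
TERADATA_TO_DATABRICKS_TYPES = {
    "BYTEINT": "TINYINT",
    "SMALLINT": "SMALLINT",
    "INTEGER": "INT",
    "BIGINT": "BIGINT",
    "DECIMAL": "DECIMAL",        # precision/scale preserved below
    "NUMBER": "DECIMAL(38,18)",  # assumed conservative choice for flexible NUMBER
    "FLOAT": "DOUBLE",
    "CHAR": "STRING",
    "VARCHAR": "STRING",
    "CLOB": "STRING",
    "BYTE": "BINARY",
    "VARBYTE": "BINARY",
    "BLOB": "BINARY",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
}

def convert_column(name: str, teradata_type: str) -> str:
    """Return a Databricks SQL column definition for a simple Teradata column."""
    td = teradata_type.strip().upper()
    base, _, rest = td.partition("(")
    target = TERADATA_TO_DATABRICKS_TYPES.get(base.strip(), "STRING")  # conservative fallback
    if base.strip() == "DECIMAL" and rest:
        target = f"DECIMAL({rest}"   # keep precision/scale, e.g. DECIMAL(18,2)
    return f"{name} {target}"

print(convert_column("order_total", "DECIMAL(18,2)"))  # order_total DECIMAL(18,2)
print(convert_column("customer_nm", "VARCHAR(120)"))   # customer_nm STRING
```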

How We Turned 200+ Business Users Into Analysts With AI/BI Genie

AI/BI Genie has transformed self-service analytics for the Databricks Marketing team. This user-friendly conversational AI tool empowers marketers to perform advanced data analysis using natural language — no SQL required. By reducing reliance on data teams, Genie increases productivity and enables faster, data-driven decisions across the organization. But realizing Genie’s full potential takes more than just turning it on. In this session, we’ll share the end-to-end journey of implementing Genie for over 200 marketing users, including lessons learned, best practices and the real business impact of this Databricks-on-Databricks solution. Learn how Genie democratizes data access, enhances insight generation and streamlines decision-making at scale.

Intelligent Document Processing: Building AI, BI, and Analytics Systems on Unstructured Data

Most enterprise data is trapped in unstructured formats — documents, PDFs, scanned images and tables — making it difficult to access, analyze and use. This session shows how to unlock that hidden value by building intelligent document processing workflows on the Databricks Data Intelligence Platform. You’ll learn how to ingest unstructured content using Lakeflow Connect, extract structured data with AI Parse — even from complex tables and scanned documents — and apply analytics or AI to this newly structured data. What you’ll learn:
- How to build scalable pipelines that transform unstructured documents into structured tables
- Techniques for automating document workflows with Databricks tools
- Strategies for maintaining quality and governance with Unity Catalog
- Real-world examples of AI applications built with intelligent document processing
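
Below is a heavily hedged sketch of what such a pipeline step could look like: ingest PDFs as binary files and run a document-parsing AI function over them. The `ai_parse_document` name, its signature and the volume path are assumptions based on the session description and may differ from the actual product API.

```python
# Hypothetical sketch only: parse scanned PDFs into structured output with an AI
# document-parsing function. Function name/signature and paths are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE OR REPLACE TABLE main.bronze.parsed_invoices AS
    SELECT
        path,
        ai_parse_document(content) AS parsed   -- assumed AI parsing function
    FROM READ_FILES(
        '/Volumes/main/raw/invoices/',         -- placeholder Unity Catalog volume
        format => 'binaryFile'
    )
""")
```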

Introducing Lakeflow: The Future of Data Engineering on Databricks

Join us to explore Lakeflow, Databricks' end-to-end solution for simplifying and unifying the most complex data engineering workflows. This session builds on keynote announcements, offering an accessible introduction for newcomers while emphasizing the transformative value Lakeflow delivers. We’ll cover:
- What is Lakeflow? – A cohesive overview of its components: Lakeflow Connect, Lakeflow Declarative Pipelines and Lakeflow Jobs.
- Core Capabilities in Action – Live demos showcasing no-code data ingestion, code-optional declarative pipelines and unified, end-to-end orchestration.
- Vision for the Future – A look at the roadmap, introducing no-code and open-source initiatives.
Discover how Lakeflow equips data teams with a seamless experience for ingestion, transformation and orchestration, reducing complexity and driving productivity. By unifying these capabilities, Lakeflow lays the groundwork for scalable, reliable and efficient data pipelines in a governed, high-performing environment.
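
For readers new to declarative pipelines, here is a minimal sketch of what a pipeline definition can look like in Python, using the `dlt` module that Lakeflow Declarative Pipelines inherited from Delta Live Tables. The source path and table names are placeholders, and the code is meant to run inside a pipeline rather than a plain notebook.

```python
# Minimal illustrative pipeline: ingest raw JSON events, then keep only valid rows.
# Intended to run inside a Lakeflow Declarative Pipeline; path and names are placeholders.
import dlt
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

@dlt.table(comment="Raw events ingested incrementally with Auto Loader")
def raw_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/raw/events/")   # placeholder landing location
    )

@dlt.table(comment="Validated events for downstream consumers")
@dlt.expect_or_drop("valid_user", "user_id IS NOT NULL")   # drop rows failing the expectation
def clean_events():
    return dlt.read_stream("raw_events").withColumn("ingested_at", F.current_timestamp())
```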

Managing Databricks at Scale

T-Mobile’s leadership in 5G innovation and its rapid growth in the fixed wireless business have led to an exponential increase in data, reaching hundreds of terabytes daily. This session explores how T-Mobile uses Databricks to manage this data efficiently, focusing on scalable architecture with Delta Lake, auto-scaling clusters, performance optimization through data partitioning and caching, and comprehensive data governance with Unity Catalog. Additionally, it covers cost management, collaboration features and AI-driven productivity tools, highlighting how these strategies empower T-Mobile to innovate, streamline operations and maximize data impact across network optimization, community support, energy management and more.

Multi-Format, Multi-Table, Multi-Statement Transactions on Unity Catalog

Get a first look at multi-statement transactions in Databricks. In this session, we will dive into their capabilities, exploring how multi-statement transactions enable atomic updates across multiple tables in your data pipelines, ensuring data consistency and integrity for complex operations. We will also share how we are enabling unified transactions across Delta Lake and Iceberg with Unity Catalog — powering our vision for an open and interoperable lakehouse.
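
As a purely illustrative sketch of the idea (the feature is new and the exact SQL surface may differ from what is shown here), a multi-statement transaction lets related writes commit or roll back together:

```python
# Illustrative only: the BEGIN/COMMIT/ROLLBACK keywords below follow standard SQL
# and are an assumption; check current Databricks docs for the exact supported
# syntax. Table names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("BEGIN TRANSACTION")
try:
    spark.sql("""
        UPDATE main.gold.account_balances
        SET balance = balance - 100
        WHERE account_id = 'A-1'
    """)
    spark.sql("""
        INSERT INTO main.gold.transfers
        VALUES ('A-1', 'B-2', 100, current_timestamp())
    """)
    spark.sql("COMMIT")       # both changes become visible atomically
except Exception:
    spark.sql("ROLLBACK")     # neither change is applied
    raise
```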

Scaling Generative AI: Batch Inference Strategies for Foundation Models

Curious how to apply resource-intensive generative AI models across massive datasets without breaking the bank? This session reveals efficient batch inference strategies for foundation models on Databricks. Learn how to architect scalable pipelines that process large volumes of data through LLMs, text-to-image models and other generative AI systems while optimizing for throughput, cost and quality. Key takeaways:
- Implementing efficient batch processing patterns for foundation models using AI Functions
- Optimizing token usage and prompt engineering for high-volume inference
- Balancing compute resources between CPU preprocessing and GPU inference
- Techniques for parallel processing and chunking large datasets through generative models
- Managing model weights and memory requirements across distributed inference tasks
You'll discover how to process any scale of data through your generative AI models efficiently.
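
One common batch pattern along these lines can be sketched as follows: run `ai_query` against a served foundation model over an entire table and persist the results. The endpoint, table and column names below are placeholders, not the session's actual pipeline.

```python
# Illustrative batch inference sketch: summarize every review with ai_query against
# a foundation model serving endpoint. Endpoint and table names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE OR REPLACE TABLE main.gold.review_summaries AS
    SELECT
        review_id,
        ai_query(
            'databricks-meta-llama-3-3-70b-instruct',   -- placeholder endpoint name
            CONCAT('Summarize this product review in one sentence: ', review_text)
        ) AS summary
    FROM main.silver.product_reviews
""")
```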

Serverless as the New "Easy Button": How HP Inc. Used Serverless to Turbocharge Their Data Pipeline

How do you wrangle over 8TB of granular “hit-level” website analytics data with hundreds of columns, all while eliminating the overhead of cluster management, decreasing runtime and saving money? In this session, we’ll dive into how we helped HP Inc. use Databricks serverless compute and Lakeflow Declarative Pipelines to streamline Adobe Analytics data ingestion while making it faster, cheaper and easier to operate. We’ll walk you through our full migration story — from managing unwieldy custom-defined AWS-based Apache Spark™ clusters to spinning up Databricks serverless pipelines and workflows with on-demand scalability and near-zero overhead. If you want to simplify infrastructure, optimize performance and get more out of your Databricks workloads, this session is for you.

Smashing Silos, Shaping the Future: Data for All in the Next-Gen Ecosystem

A successful data strategy requires the right platform and the ability to empower the broader user community by creating simple, scalable and secure patterns that lower the barrier to entry while ensuring robust data practices. Guided by the belief that everyone is a data person, we focus on breaking down silos, democratizing access and enabling distributed teams to contribute through a federated "data-as-a-product" model. We’ll share the impact and lessons learned in creating a single source of truth on Unity Catalog, consolidated from diverse sources and cloud platforms. We’ll discuss how we streamlined governance with Databricks Apps, Workflows and native capabilities, ensuring compliance without hindering innovation. We’ll also cover how we maximize the value of that catalog by leveraging semantics to enable trustworthy, AI-driven self-service in AI/BI dashboards and downstream apps. Come learn how we built a next-gen data ecosystem that empowers everyone to be a data person.