talk-data.com

Topic

Data Lakehouse

data_architecture data_warehouse data_lake

489 tagged

Activity Trend: 118 peak/qtr, 2020-Q1 to 2026-Q1

Activities

489 activities · Newest first

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics, with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover: (1) how Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics; (2) how to manage and monitor compute resources, data access and users across your lakehouse infrastructure; (3) how to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations; (4) how to use AI to increase productivity when querying, completing code or building dashboards. Ask your questions during this hands-on lab, and the Databricks experts will guide you.

Data is the backbone of modern decision-making, but centralizing it is only the tip of the iceberg. Entitlements, secure sharing and just-in-time availability are critical challenges to any large-scale platform. Join Goldman Sachs as we reveal how our Legend Lakehouse, coupled with Databricks, overcomes these hurdles to deliver high-quality, governed data at scale. By leveraging an open table format (Apache Iceberg) and open catalog format (Unity Catalog), we ensure platform interoperability and vendor neutrality. Databricks Unity Catalog then provides a robust entitlement system that aligns with our data contracts, ensuring consistent access control across producer and consumer workspaces. Finally, Legend functions, integrating with Databricks User Defined Functions (UDF), offer real-time data enrichment and secure transformations without exposing raw datasets. Discover how these components unite to streamline analytics, bolster governance and power innovation.

Take it to the Limit: Art of the Possible in AI/BI

Think you know everything AI/BI can do? Think again. This session explores the art of the possible with Databricks AI/BI Dashboards and Genie, going beyond traditional analytics to unleash the full power of the lakehouse. From incorporating AI into dashboards to handling large-scale data with ease to delivering insights seamlessly to end users — we’ll showcase creative approaches that unlock insights and real business outcomes. Perfect for adventurous data professionals looking to push limits and think outside the box.

Sponsored by: Deloitte | Transforming Nestlé USA’s (NUSA) data platform to unlock new analytics and GenAI capabilities

Nestlé USA, a division of the world’s largest food and beverage company, Nestlé S.A., has embarked on a transformative journey to unlock GenAI capabilities on their data platform. Deloitte, Databricks, and Nestlé have collaborated on a data platform modernization program to address gaps associated with Nestlé’s existing data platform. This joint effort introduces new possibilities and capabilities, ranging from developing advanced machine learning models to implementing Unity Catalog and adopting Lakehouse Federation, all while adhering to confidentiality protocols. With help from Deloitte and Databricks, Nestlé USA is now able to meet its advanced enterprise analytics and AI needs with the Databricks Data Intelligence Platform.

Sponsored by: Sigma | Trading Spreadsheets for Speed: TradeStation’s Self-Service Revolution

To meet the growing internal demand for accessible, reliable data, TradeStation migrated from fragmented, spreadsheet-driven workflows to a scalable, self-service analytics framework powered by Sigma on Databricks. This transition enabled business and technical users alike to interact with governed data models directly on the lakehouse, eliminating data silos and manual reporting overhead. In brokerage trading operations, the integration supports robust risk management, automates key operational workflows, and centralizes collaboration across teams. By leveraging Sigma’s intuitive interface on top of Databricks’ scalable compute and unified data architecture, TradeStation has accelerated time-to-insight, improved reporting consistency, and empowered teams to operationalize data-driven decisions at scale.

Transforming Data at Rheem: From Silos to Scalable Data Lakehouse With Databricks and Unity Catalog

Rheem's journey from a fragmented data landscape to a robust, scalable data platform powered by Databricks showcases the power of data modernization. In just 1.5 years, Rheem evolved from siloed reporting to 30+ certified data products, integrated with 20+ source systems, including MDM. This transformation has unlocked significant business value across sales, procurement, service and operations, enhancing decision-making and operational efficiency. This session will delve into Rheem's implementation of Databricks, highlighting how it has become the cornerstone of rapid data product development and efficient data sharing across the organization. We will also explore the upcoming enhancements with Unity Catalog, including the full migration from HMS to UC. Attendees will gain insights into best practices for building a centralized data platform, enhancing developer experience, improving governance capabilities as well as tips and tricks for a successful UC migration and enablement.

Busting Data Modeling Myths: Truths and Best Practices for Data Modeling in the Lakehouse

Unlock the truth behind data modeling in Databricks. This session will tackle the top 10 myths surrounding relational and dimensional data modeling. Attendees will gain a clear understanding of what Databricks Lakehouse truly supports today, including how to leverage primary and foreign keys, identity columns for surrogate keys, column-level data quality constraints and much more. This session will talk through the lens of medallion architecture, explaining how to implement data models across bronze, silver, and gold tables. Whether you’re migrating from a legacy warehouse or building new analytics solutions, you’ll leave equipped to fully leverage Databricks’ capabilities, and design scalable, high-performance data models for enterprise analytics.

Lakehouse to Powerhouse: Reckitt's Enterprise AI Transformation Story

In this presentation, we showcase Reckitt’s journey to develop and implement a state-of-the-art Gen AI platform, designed to transform enterprise operations starting with the marketing function. We will explore the unique technical challenges encountered and the innovative architectural solutions employed to overcome them. Attendees will gain insights into how cutting-edge Gen AI technologies were integrated to meet Reckitt’s specific needs. This session will not only highlight the transformative impacts on Reckitt’s marketing operations but also serve as a blueprint for AI-driven innovation in the Consumer Goods sector, demonstrating a successful model of partnership in technology and business transformation.

Multi-Format, Multi-Table, Multi-Statement Transactions on Unity Catalog

Get a first look at multi-statement transactions in Databricks. In this session, we will dive into their capabilities, exploring how multi-statement transactions enable atomic updates across multiple tables in your data pipelines, ensuring data consistency and integrity for complex operations. We will also share how we are enabling unified transactions across Delta Lake and Iceberg with Unity Catalog — powering our vision for an open and interoperable lakehouse.
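As a rough illustration of what "atomic updates across multiple tables" means in general, the sketch below uses Python's built-in sqlite3 module as a stand-in. It is not Databricks syntax and says nothing about how the feature is implemented there; the table names (orders, inventory) and the functions place_order and demo are hypothetical.

```python
import sqlite3


def place_order(conn, item, qty):
    """Record an order and decrement stock as one atomic unit.

    Returns True if both statements committed, False if both were rolled back.
    """
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))
            conn.execute(
                "UPDATE inventory SET stock = stock - ? WHERE item = ?", (qty, item)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint on inventory failed, so the order insert was undone too.
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
    conn.execute(
        "CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER CHECK (stock >= 0))"
    )
    conn.execute("INSERT INTO inventory VALUES ('widget', 5)")
    conn.commit()

    ok = place_order(conn, "widget", 3)       # enough stock: both tables change
    failed = place_order(conn, "widget", 10)  # would go negative: neither table changes

    order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    stock = conn.execute("SELECT stock FROM inventory").fetchone()[0]
    conn.close()
    return ok, failed, order_count, stock
```

Calling demo() shows that the failed second order leaves no partial write behind: the orders table and the inventory table either change together or not at all.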

Tech Industry Session: Optimizing Costs and Controls to Democratize Data and AI

Join us for this session focused on how leading tech companies are enabling data intelligence across their organizations while maintaining cost efficiency and governance. Hear the successes and the challenges when Databricks empowers thousands of users, from engineers to business teams, by providing scalable tools for AI, BI and analytics. Topics include: (1) combining AI/BI and Lakehouse Apps to streamline workflows and accelerate insights; (2) implementing system tables, tagging and governance frameworks for granular control; (3) democratizing data access while optimizing costs for large-scale analytical workloads. Hear from customers and Databricks experts, followed by a customer panel featuring industry leaders. Gain insights into how Databricks helps tech innovators scale their platforms while maintaining operational excellence.

Accelerating Data Transformation: Best Practices for Governance, Agility and Innovation

In this session, we will share NCS’s approach to implementing a Databricks Lakehouse architecture, focusing on key lessons learned and best practices from our recent implementations. By integrating Databricks SQL Warehouse, the DBT Transform framework and our innovative test automation framework, we’ve optimized performance and scalability, while ensuring data quality. We’ll dive into how Unity Catalog enabled robust data governance, empowering business units with self-serve analytical workspaces to create insights while maintaining control. Through the use of solution accelerators, rapid environment deployment and pattern-driven ELT frameworks, we’ve fast-tracked time-to-value and fostered a culture of innovation. Attendees will gain valuable insights into accelerating data transformation, governance and scaling analytics with Databricks.

Defending Revenue With GenAI

Defending revenue is critical to any business strategy, and predicting customer churn is difficult. Until now. In this session, Blueprint will share how their clients use GenAI on Databricks to reduce customer churn, grow average revenue per user, and create overall revenue growth. This presentation will demonstrate how they helped a customer take a GenAI-powered personalization engine from proof-of-concept to production to improve customer churn propensity, personalized retention, and customer satisfaction. Learn how to turn your lakehouse from a cost center into a profit center.

Morgan Stanley, a highly regulated financial institution, needs to meet stringent security and regulatory requirements around data storage and processing. Traditionally, this has necessitated maintaining control over data and compute within their own accounts with the associated management overhead. In this session, we will cover how Morgan Stanley has partnered with Databricks on a fully-managed compute and storage solution that allows them to meet their regulatory obligations with significantly reduced effort. This innovative approach enables rapid onboarding of new projects onto the platform, improving operational efficiency while maintaining the highest levels of security and compliance.

How to Build an Open Lakehouse: Best Practices for Interoperability

Building an open data lakehouse? Start with the right blueprint. This session walks through common reference architectures for interoperable lakehouse deployments across AWS, Google Cloud, Azure and tools like Snowflake, BigQuery and Microsoft Fabric. Learn how to design for cross-platform data access, unify governance with Unity Catalog and ensure your stack is future-ready — no matter where your data lives.

Lakebase: Fully Managed Postgres for the Lakehouse

Lakebase is a new Postgres-compatible OLTP database designed to support intelligent applications. Lakebase eliminates custom ETL pipelines with built-in lakehouse table synchronization, supports sub-10ms latency for high-throughput workloads, and offers full Postgres compatibility, so you can build applications more quickly. In this session, you’ll learn how Lakebase enables faster development, production-level concurrency, and simpler operations for data engineers and application developers building modern, data-driven applications. We'll walk through key capabilities, example use cases, and how Lakebase simplifies infrastructure while unlocking new possibilities for AI and analytics.

Sponsored by: AWS | Ripple: Well-Architected Data & AI Platforms - AWS and Databricks in Harmony

Join us as we explore the well-architected framework for modern data lakehouse architecture, where AWS's comprehensive data, AI, and infrastructure capabilities align with Databricks' unified platform approach. Building upon core principles of Operational Excellence, Security, Reliability, Performance, and Cost Optimization, we'll demonstrate how Data and AI Governance alongside Interoperability and Usability enable organizations to build robust, scalable platforms. Learn how Ripple modernized its data infrastructure by migrating from a legacy Hadoop system to a scalable, real-time analytics platform using Databricks on AWS. This session covers the challenges of high operational costs, latency, and peak-time bottlenecks—and how Ripple achieved 80% cost savings and 55% performance improvements with Photon, Graviton, Delta Lake, and Structured Streaming.

Sponsored by: Firebolt | 10ms Queries on Iceberg: Turbocharging Your Lakehouse for Interactive Experiences with Firebolt

Open table formats such as Apache Iceberg or Delta Lake have transformed the data landscape. For the first time, we’re seeing a real open storage ecosystem emerging across database vendors. So far, open table formats have found little adoption powering low-latency, high-concurrency analytics use cases. Data stored in open formats often gets transformed and ingested into closed systems for serving. The reason for this is simple: most modern query engines don’t properly support these workloads. In this talk we take a look under the hood of Firebolt and dive into the work we’re doing to support low latency and high concurrency on Iceberg: caching of data and metadata, adaptive object storage reads, subresult reuse, and multi-dimensional scaling. After this session, you will know how you can build low-latency data applications on top of Iceberg. You’ll also have a deep understanding of what it takes for modern high-performance query engines to do well on these workloads.

Sponsored by: Fivetran | Scalable Data Ingestion: Building custom pipelines with the Fivetran Connector SDK and Databricks

Organizations have hundreds of data sources, some of which are very niche or difficult to access. Incorporating this data into your lakehouse requires significant time and resources, hindering your ability to work on more value-add projects. Enter the Fivetran Connector SDK, a powerful new tool that enables your team to create custom pipelines for niche systems, custom APIs, and sources with specific data filtering requirements, seamlessly integrating with Databricks. During this session, Fivetran will demonstrate how to (1) leverage the Connector SDK to build scalable connectors, enabling the ingestion of diverse data into Databricks; (2) gain flexibility and control over historical and incremental syncs, delete capture, state management, multithreaded data extraction, and custom schemas; and (3) utilize practical examples, code snippets, and architectural considerations to overcome data integration challenges and unlock the full potential of your Databricks environment.

The Missing Link Between the Lakehouse and Data Intelligence

What connects your lakehouse to real data intelligence? The answer: the catalog. But not just any catalog. In this session, we break down why Unity Catalog is purpose-built for the lakehouse, and how it goes beyond operational or business catalogs to deliver cross-platform interoperability and a shared understanding of your data. You’ll walk away with a clear view of how the right data foundation unlocks smarter decisions and trusted AI.