talk-data.com

Topic: Databricks
Tags: big_data · analytics · spark (1286 tagged)
Activity trend: 515 peak/qtr, 2020-Q1 to 2026-Q1
Activities: 1286 · Newest first

Bridging Ontologies & Lakehouses: Palantir AIP + Databricks for Secure Autonomous AI

AI is moving from pilots to production, but many organizations still struggle to connect boardroom ambitions with operational reality. Palantir’s Artificial Intelligence Platform (AIP) and the Databricks Data Intelligence Platform now form a single, open architecture that closes this gap by pairing Palantir’s decision-empowering operational Ontology with Databricks’ industry-leading scale, governance and Lakehouse economics. The result: real-time, AI-powered, autonomous workflows that are already powering mission-critical outcomes for the U.S. Department of Defense, bp and other joint customers across the public and private sectors. In this technically grounded but business-focused session, you will see the new reference architecture in action. We will walk through how Unity Catalog and Palantir Virtual Tables provide governed, zero-copy access to Lakehouse data and back mission-critical operational workflows on top of Palantir’s semantic ontology and agentic AI capabilities. We will also explore how Palantir’s no-code and pro-code tooling integrates with Databricks compute to orchestrate builds and write tables to Unity Catalog. Come hear from customers currently using this architecture to drive critical business outcomes seamlessly across Databricks and Palantir.

Databricks Apps: Turning Data and AI Into Practical, User-Friendly Applications

This session is repeated. In this session, we present an overview of the GA release of Databricks Apps, the new app hosting platform that integrates all the Databricks services necessary to build production-ready data and AI applications. With Apps, data and developer teams can build new interfaces into the data intelligence platform, further democratizing the transformative power of data and AI across the organization. We'll cover common use cases, including RAG chat apps, interactive visualizations and custom workflow builders, as well as look at several best practices and design patterns when building apps. Finally, we'll look ahead with the vision, strategy and roadmap for the year ahead.
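
As a hedged sketch of the kind of app the platform hosts (not material from the session itself), the snippet below shows a minimal Streamlit interface that queries a SQL warehouse through the databricks-sql-connector package; the table name, chart, and environment variables are hypothetical.

```python
# Minimal data-app sketch, assuming Streamlit and the
# databricks-sql-connector package are available to the app runtime.
import os

import streamlit as st
from databricks import sql

st.title("Daily Orders")  # hypothetical dashboard title

# Connection details are assumed to be injected as environment variables.
with sql.connect(
    server_hostname=os.environ["DATABRICKS_HOST"],
    http_path=os.environ["DATABRICKS_HTTP_PATH"],
    access_token=os.environ["DATABRICKS_TOKEN"],
) as conn:
    with conn.cursor() as cursor:
        # `main.sales.daily_orders` is a hypothetical Unity Catalog table.
        cursor.execute("SELECT order_date, total FROM main.sales.daily_orders")
        rows = cursor.fetchall()

st.line_chart({"total": [r.total for r in rows]})
```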

Enabling Sleep Science Research With Databricks and Delta Sharing

Leveraging Databricks as a platform, we facilitate the sharing of anonymized datasets across various Databricks workspaces and accounts, spanning multiple cloud environments such as AWS, Azure, and Google Cloud. This capability, powered by Delta Sharing, extends both within and outside Sleep Number, enabling accelerated insights while ensuring compliance with data security and privacy standards. In this session, we will showcase our architecture and implementation strategy for data sharing, highlighting the use of Databricks’ Unity Catalog and Delta Sharing, along with integration with platforms like Jira, Jenkins, and Terraform to streamline project management and system orchestration.
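
For readers unfamiliar with the mechanics, here is a minimal sketch of consuming a share with the open-source delta-sharing Python client; the profile path and share/schema/table names are hypothetical placeholders, not Sleep Number's actual assets.

```python
# Sketch of consuming a Delta Share with the open-source `delta-sharing`
# Python client; all names below are hypothetical placeholders.
import delta_sharing

profile = "/path/to/config.share"  # credential file issued by the provider

# List the tables this recipient has been granted access to.
client = delta_sharing.SharingClient(profile)
print(client.list_all_tables())

# Load a shared table directly into pandas, with no upfront data copy.
df = delta_sharing.load_as_pandas(f"{profile}#sleep_share.research.sessions")
print(df.head())
```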

From Datavault to Delta Lake: Streamlining Data Sync with Lakeflow Connect

In this session, we will explore the Australian Red Cross Lifeblood's approach to synchronizing an Azure SQL Datavault 2.0 (DV2.0) implementation with Unity Catalog (UC) using Lakeflow Connect. Lifeblood's DV2.0 data warehouse, which includes raw vault (RV) and business vault (BV) tables, as well as information marts defined as views, required a multi-step process to achieve data/business logic sync with UC. This involved using Lakeflow Connect to ingest RV and BV data, followed by a custom process utilizing JDBC to ingest view definitions, and the automated/manual conversion of T-SQL to Databricks SQL views, with Lakehouse Monitoring for validation. In this talk, we will share our journey, the design decisions we made, and how the resulting solution now supports analytics workloads, analysts, and data scientists at Lifeblood.
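
A hedged sketch of the custom JDBC step described above, pulling T-SQL view definitions out of Azure SQL so they can be converted to Databricks SQL; connection details are placeholders, and `spark` is assumed to be the predefined Databricks session.

```python
# Hedged sketch: read each view's name and T-SQL body over JDBC so the
# definitions can be translated to Databricks SQL. Connection details
# are hypothetical; `spark` is the predefined Databricks session.
view_defs = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net;database=dv")
    .option("user", "<user>")
    .option("password", "<password>")
    # sys.views / sys.sql_modules hold the name and T-SQL body of each view.
    .option(
        "query",
        """SELECT v.name, m.definition
           FROM sys.views v
           JOIN sys.sql_modules m ON v.object_id = m.object_id""",
    )
    .load()
)

for row in view_defs.collect():
    # Each definition would then be converted (automatically or manually)
    # to Databricks SQL and registered in Unity Catalog.
    print(row["name"])
```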

How United Airlines Transforms SWIM Data into Real-Time Operational Insight and Faster Decision Making

Discover how United Airlines, in collaboration with Databricks and Impetus Technologies, has built a next-generation data intelligence platform leveraging System Wide Information Management (SWIM) to deliver mission-critical, real-time insights for flight disruption prediction, situational analysis, and smarter, faster decision-making. In this session, United Airlines experts will share how their Databricks-based SWIM architecture enables near real-time operational awareness, enhances responsiveness during irregular operations (IRROPs), and drives proactive actions to minimize disruptions. They will also discuss how United efficiently processes and manages the large volume and variety of SWIM data, ensuring seamless integration and actionable intelligence across their operations.
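
As a hedged illustration of near real-time ingestion (the abstract does not specify United's exact transport), the sketch below assumes SWIM messages have been bridged onto a Kafka topic and lands them in a Delta table with Structured Streaming; the topic, broker, checkpoint path and table name are hypothetical.

```python
# Hedged ingestion sketch, assuming SWIM messages arrive on a Kafka topic;
# all names are hypothetical, and `spark` is the predefined session.
from pyspark.sql import functions as F

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "<broker>:9092")
    .option("subscribe", "swim-flight-data")
    .load()
)

# Keep the payload and arrival time; parsing of the SWIM XML payload
# would happen downstream of this step.
events = raw.select(
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp").alias("ingested_at"),
)

(
    events.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/swim")  # hypothetical path
    .toTable("ops.swim_raw")  # hypothetical Unity Catalog table
)
```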

Mastering Data Security and Compliance: CoorsTek's Journey With Databricks Unity Catalog

Ensuring data security and meeting compliance requirements are critical priorities for businesses operating in regulated industries, where the stakes are high and the standards are stringent. We will showcase how CoorsTek, a global leader in technical ceramics manufacturing, partnered with Databricks to leverage the power of Unity Catalog for addressing regulatory challenges while achieving significant operational efficiency gains. We'll dive into the migration journey, highlighting the adoption of key features such as role-based access control (RBAC), comprehensive data lineage tracking and robust auditing capabilities. Attendees will gain practical insights into the strategies and tools used to manage sensitive data, ensure compliance with industry standards and optimize cloud data architectures. Additionally, we’ll share real-world lessons learned, best practices for integrating compliance into a modern data ecosystem and actionable takeaways for leveraging Databricks as a catalyst for secure and compliant data innovation.
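
To make the access-control model concrete, here is a minimal sketch of Unity Catalog grants and audit inspection of the kind the session describes; the catalog, schema and group names are hypothetical.

```python
# Hedged Unity Catalog access-control sketch; names are hypothetical.
# Assumes a Databricks notebook where `spark` and `display` are predefined.

# Grant a governed group read access to a schema rather than to ad-hoc users.
spark.sql("GRANT USE CATALOG ON CATALOG prod TO `analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA prod.quality TO `analysts`")
spark.sql("GRANT SELECT ON TABLE prod.quality.batch_results TO `analysts`")

# Audit trails can then be inspected through system tables.
display(spark.sql("SELECT * FROM system.access.audit LIMIT 10"))
```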

Patients Are Waiting... Accelerating Healthcare Innovation with Data, AI and Agents

This session is repeated. In an era of exponential data growth, organizations across industries face common challenges in transforming raw data into actionable insights. This presentation showcases how Novo Nordisk is pioneering insight-generation approaches to clinical data management and AI. Using our clinical trials platform FounData, built on Databricks, we demonstrate how proper data architecture enables advanced AI applications. We'll introduce a multi-agent AI framework that revolutionizes data interaction, combining specialized AI agents to guide users through complex datasets. While our focus is on clinical data, these principles apply across sectors – from manufacturing to financial services. Learn how democratizing access to data and AI capabilities can transform organizational efficiency while maintaining governance. Through this real-world implementation, participants will gain insights on building scalable data architectures and leveraging multi-agent AI frameworks for responsible innovation.
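
The abstract does not disclose FounData's implementation; purely as a generic illustration of the multi-agent routing pattern it names, the sketch below dispatches a user question to the first matching specialist agent.

```python
# Generic multi-agent routing illustration (not Novo Nordisk's actual
# FounData implementation, which the abstract does not disclose).
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    name: str
    keywords: tuple
    answer: Callable[[str], str]


AGENTS = [
    Agent("trial-design", ("protocol", "cohort"), lambda q: f"[design agent] {q}"),
    Agent("data-quality", ("missing", "anomaly"), lambda q: f"[quality agent] {q}"),
]


def route(question: str) -> str:
    """Send the question to the first specialist whose domain matches,
    falling back to a general agent otherwise."""
    q = question.lower()
    for agent in AGENTS:
        if any(k in q for k in agent.keywords):
            return agent.answer(question)
    return f"[general agent] {question}"


print(route("Which cohort had missing lab values?"))
```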

As a global energy leader, Petrobras relies on machine learning to optimize operations, but manual model deployment and validation processes once created bottlenecks that delayed critical insights. In this session, we’ll reveal how we revolutionized our MLOps framework using MLflow, Databricks Asset Bundles (DABs) and Unity Catalog to:
- Replace error-prone manual validation with automated metric-driven workflows
- Reduce model deployment timelines from days to hours
- Establish granular governance and reproducibility across production models
Discover how we enabled data scientists to focus on innovation, not infrastructure, through standardized pipelines while ensuring compliance and scalability in one of the world’s most complex energy ecosystems.
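
As a hedged sketch of a metric-gated promotion flow in the spirit of the bullets above (the model name, metric and threshold are hypothetical), the snippet registers the best run to Unity Catalog via MLflow and moves an alias only when the gate passes.

```python
# Hedged sketch of metric-driven model promotion with MLflow and Unity
# Catalog; model name, metric, and threshold are hypothetical.
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_registry_uri("databricks-uc")  # register models in Unity Catalog
client = MlflowClient()

MODEL = "prod.ml.pump_anomaly"  # hypothetical three-level UC model name
run = mlflow.search_runs(order_by=["metrics.f1 DESC"], max_results=1).iloc[0]

# Automated, metric-driven gate instead of manual validation.
if run["metrics.f1"] >= 0.90:
    version = mlflow.register_model(f"runs:/{run.run_id}/model", MODEL)
    # An alias gives serving a governed, reproducible pointer.
    client.set_registered_model_alias(MODEL, "champion", version.version)
```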

Swimming at Our Own Lakehouse: How Databricks Uses Databricks

This session is repeated. Peek behind the curtain to learn how Databricks processes hundreds of petabytes of data across every region and cloud where we operate. Learn how Databricks leverages data and AI to scale and optimize every aspect of the company, from facilities and legal to sales and marketing and, of course, product research and development. This session is a high-level tour inside Databricks to see how data and AI enable us to be a better company. We will go into the architecture behind how Databricks is used for internal use cases like business analytics and SIEM, as well as customer-facing features like system tables and Assistant. We will cover how we run production of our data flows and how we maintain security and privacy while operating a large multi-cloud, multi-region environment.

As global data privacy regulations tighten, balancing user data protection with maximizing its business value is crucial. This presentation explores how integrating Databricks into our connected-vehicle data platform enhances both governance and business outcomes. We’ll highlight a case where migrating from EMR to Databricks improved deletion performance and cut costs by 99% with Delta Lake. This shift not only ensures compliance with data-privacy regulations but also maximizes the potential of connected-vehicle data. We are developing a platform that balances compliance with business value and sets a global standard for data usage, inviting partners to join us in building a secure, efficient mobility ecosystem.
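
A minimal sketch of the kind of right-to-erasure delete that Delta Lake makes efficient; the table, column and retention handling below are hypothetical, not the platform's actual code.

```python
# Hedged right-to-erasure sketch on a Delta table; names are hypothetical.
# Assumes a Databricks notebook where `spark` is predefined.

# Deletion vectors mark rows as deleted without rewriting whole files,
# one reason deletes can be far faster than on plain Parquet pipelines.
spark.sql("""
    ALTER TABLE telemetry.vehicle_events
    SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true')
""")

spark.sql(
    "DELETE FROM telemetry.vehicle_events WHERE vehicle_id = :vid",
    args={"vid": "VIN-12345"},
)

# VACUUM later removes the underlying files once the retention window passes.
spark.sql("VACUUM telemetry.vehicle_events")
```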

Trillions of Data Records, Zero Bottlenecks for Investor Decision-Making

In finance, every second counts. That’s why the data team at J. Goldman & Co. needed to transform trillions of real-time market data records into a single, actionable insight, instantly and without waiting on development resources. By modernizing their internal data platform with a scalable architecture, they built a streamlined, web-native alternative data interface that puts live market data directly in the hands of investment teams. With Databricks’ computational power and Unity Catalog’s secure governance, they eliminated bottlenecks and achieved the fastest possible time-to-market for critical investor decisions. Learn how J. Goldman & Co. innovates with Databricks and Sigma to:
- Ensure live, scalable data access across trillions of records in a flexible UI
- Empower non-technical teams with true self-service data exploration

Trust You Can Measure: Data Quality Standards in The Lakehouse

Do you trust your data? If you’ve ever struggled to figure out which datasets are reliable, well-governed, or safe to use, you’re not alone. At Databricks, our own internal lakehouse faced the same challenge—hundreds of thousands of tables, but no easy way to tell which data met quality standards. In this talk, the Databricks Data Platform team shares how we tackled this problem by building the Data Governance Score—a way to systematically measure and surface trust signals across the entire lakehouse. You’ll learn how we leverage Unity Catalog, governed tags, and enforcement to drive better data decisions at scale. Whether you're a data engineer, platform owner, or business leader, you’ll leave with practical ideas on how to raise the bar for data quality and trust in your own data ecosystem.
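
The scoring formula is the team's own, but governed tags can be surfaced through the Unity Catalog information schema; below is a hedged sketch with a toy stand-in score (a simple tag count), not the actual Data Governance Score.

```python
# Hedged sketch of surfacing governed tags via the Unity Catalog
# information schema; the tag-count "score" is a toy stand-in, not the
# actual Data Governance Score. Assumes `spark` is predefined.
tags = spark.sql("""
    SELECT catalog_name, schema_name, table_name, tag_name, tag_value
    FROM system.information_schema.table_tags
""")

# Toy trust signal: count how many governance tags each table carries.
score = (
    tags.groupBy("catalog_name", "schema_name", "table_name")
    .count()
    .withColumnRenamed("count", "governance_tag_count")
)
score.show()
```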

Analyst Roadmap to Databricks: From SQL to End-to-End BI

Analysts often begin their Databricks journey by running familiar SQL queries in the SQL Editor, but that’s just the start. In this session, I’ll share the roadmap I followed to expand beyond ad-hoc querying, through SQL Editor and notebook-driven development, to scheduled data pipelines producing interactive dashboards, all powered by Databricks SQL and Unity Catalog. You’ll learn how to organize tables with primary-key/foreign-key relationships and create table and column comments to form the semantic model, using DBSQL features like RELY constraints. I’ll also show how parameterized dashboards can be set up to empower self-service analytics and feed into Genie Spaces. Attendees will walk away with best practices for building a robust BI platform on Databricks from the ground up, including tips for table design and metadata enrichment. Whether you’re a data analyst or BI developer, this talk will help you unlock powerful, AI-enhanced analytics workflows.
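
As a hedged sketch of the semantic-model groundwork described above (table and column names are hypothetical), the snippet declares PK/FK constraints with RELY and adds the comments that dashboards and Genie Spaces draw on.

```python
# Hedged semantic-model sketch; all names are hypothetical.
# Assumes a Databricks notebook where `spark` is predefined.

# Declare informational constraints; RELY lets the optimizer and BI
# tools trust them even though they are not enforced.
spark.sql("""
    ALTER TABLE sales.dim_customer
    ADD CONSTRAINT pk_customer PRIMARY KEY (customer_id) RELY
""")
spark.sql("""
    ALTER TABLE sales.fct_orders
    ADD CONSTRAINT fk_orders_customer
    FOREIGN KEY (customer_id) REFERENCES sales.dim_customer (customer_id)
""")

# Comments enrich the metadata that dashboards and Genie Spaces draw on.
spark.sql("COMMENT ON TABLE sales.fct_orders IS 'One row per order line.'")
spark.sql("""
    ALTER TABLE sales.fct_orders
    ALTER COLUMN customer_id COMMENT 'FK to sales.dim_customer'
""")
```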

Building Reliable Agentic AI on Databricks

Agentic AI is the next evolution in artificial intelligence, with the potential to revolutionize the industry. However, its potential is matched only by its risk: without high-quality, trustworthy data, agentic AI can be exponentially dangerous. Join Barr Moses, CEO and Co-Founder of Monte Carlo, to explore how to leverage Databricks' powerful platform to ensure your agentic AI initiatives are underpinned by reliable, high-quality data. Barr will share:
- How data quality impacts agentic AI performance at every stage of the pipeline
- Strategies for implementing data observability to detect and resolve data issues in real time
- Best practices for building robust, error-resilient agentic AI models on Databricks
- Real-world examples of businesses harnessing Databricks' scalability and Monte Carlo’s observability to drive trustworthy AI outcomes
Learn how your organization can deliver more reliable agentic AI and turn the promise of autonomous intelligence into a strategic advantage. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
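
Monte Carlo's product API is not shown in the abstract; as a generic stand-in, here are two basic observability checks (freshness and volume) on a Delta table, with a hypothetical table name and thresholds.

```python
# Generic data-observability sketch: freshness and volume checks on a
# Delta table. Thresholds and the table name are hypothetical, and this
# is not Monte Carlo's API. Assumes `spark` is predefined.
from pyspark.sql import functions as F

df = spark.table("prod.features.user_events")

checks = df.agg(
    # Freshness: has any data landed in the last hour?
    F.expr("max(event_time) > current_timestamp() - INTERVAL 1 HOUR")
        .alias("fresh_last_hour"),
    # Volume: is the row count above a hypothetical floor?
    (F.count(F.lit(1)) > 1_000_000).alias("volume_ok"),
).first()

if not (checks["fresh_last_hour"] and checks["volume_ok"]):
    raise RuntimeError(f"Data quality gate failed: {checks.asDict()}")
```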

From Overwhelmed to Empowered: How SAP is Democratizing Data & AI with Databricks to Solve Problems

Scaling the adoption of Data & AI within enterprises is critical for driving transformative business outcomes. Learn how the SAP Experience Garage, SAP’s largest internal enablement and innovation driver, is turning all employees into data enthusiasts through the integration of Databricks technologies. The SAP Experience Garage platform brings together colleagues with various levels of data knowledge and skills in one seamless space. Here, they can explore and use tangible datasets and data science/AI tooling from Databricks, enablement capabilities, and collaborative features to tackle real business challenges and create prototypes that find their way into SAP’s ecosystem.

Sponsored by: Salesforce | Elevate Agentforce experience with Databricks Agents and Zero Copy

See how Agentforce connects with Databricks to create a seamless, intelligent workspace. With zero copy integration, users can access real-time Databricks data without moving or duplicating it. Explore how Agentforce automatically delegates tasks to a Databricks agent and enables end-to-end execution without leaving the flow of work, eliminating the swivel-chair effect. Together, these capabilities power a unified, cross-platform experience that drives faster decisions and smarter outcomes.

The JLL Training and Upskill Program for Our Warehouse Migration to Databricks

Databricks Odyssey is JLL’s bespoke training program designed to upskill and prepare data professionals for a new world of the data lakehouse. Based on the concepts of learn, practice and certify, participants earn points, moving through five levels by completing activities with business application of key Databricks features. Databricks Odyssey facilitates cloud data warehousing migration by providing best-practice frameworks, ensuring efficient use of pay-per-compute platforms. JLL/T Insights and Data fosters a data culture through learning programs that develop in-house talent and create career pathways. Databricks Odyssey offers:
- JLL-specific hands-on learning
- Gamified 'level up' approach
- Practical, applicable skills
Benefits include:
- Improved platform efficiency
- Enhanced data accuracy and client insights
- Ongoing professional development
- Potential cost savings through better utilization

Accelerating Model Development and Fine-Tuning on Databricks with TwelveLabs

Scaling large language models (LLMs) and multimodal architectures requires efficient data management and computational power. NVIDIA NeMo Framework Megatron-LM on Databricks is an open source solution that integrates GPU acceleration and advanced parallelism with Databricks Delta Lakehouse, streamlining workflows for pre-training and fine-tuning models at scale. This session highlights context parallelism, a unique NeMo capability for parallelizing over sequence lengths, making it ideal for video datasets with large embeddings. Through the case study of TwelveLabs’ Pegasus-1 model, learn how NeMo empowers scalable multimodal AI development, from text to video processing, setting a new standard for LLM workflows.
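
Context parallelism splits work along the sequence dimension; as a framework-agnostic toy (not NeMo's implementation), the sketch below shards one long token sequence across ranks so each device holds only its slice.

```python
# Toy illustration of context (sequence) parallelism: shard a long
# sequence across ranks so each device holds only its slice. This is a
# conceptual sketch, not NeMo's actual implementation.
from typing import List


def shard_sequence(tokens: List[int], world_size: int, rank: int) -> List[int]:
    """Return the contiguous slice of the sequence owned by `rank`."""
    chunk = (len(tokens) + world_size - 1) // world_size
    return tokens[rank * chunk : (rank + 1) * chunk]


sequence = list(range(16))  # stand-in for a 16-token (or 16-frame) input
for rank in range(4):       # pretend world_size=4, e.g., 4 GPUs
    print(rank, shard_sequence(sequence, world_size=4, rank=rank))
# Attention across shard boundaries is what the framework's communication
# collectives (e.g., all-gathers of keys/values) have to reconstruct.
```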

At Zillow, we have accelerated the volume and quality of our dashboards by leveraging a modern SDLC with version control and CI/CD. In the past three months, we have released 32 production-grade dashboards and shared them securely across the organization while cutting error rates in half over that span. In this session, we will provide an overview of how we utilize Databricks Asset Bundles and GitLab CI/CD to create performant dashboards that can be confidently used for mission-critical operations. As a concrete example, we'll then explore how Zillow's Data Platform team used this approach to automate our on-call support analysis, leveraging our dashboard development strategy alongside Databricks LLM offerings to create a comprehensive view that provides actionable performance metrics alongside AI-generated insights and action items from the hundreds of requests that make up our support workload.

Comprehensive Data Warehouse Migrations to Databricks SQL

This session is repeated. Databricks has a free, comprehensive solution for migrating legacy data warehouses from a wide range of source systems. See how we accelerate migrations from legacy data warehouses to Databricks SQL, achieving 50% faster migration than traditional methods. We'll cover the tool’s automated migration process:
- Discovery: Source system profiling
- Assessment: Legacy code analysis
- Conversion: Advanced code transpilation
- Reconciliation: Data validation
This comprehensive approach increases the predictability of migration projects, allowing businesses to plan and execute migrations with greater confidence.
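
As a hedged sketch of the reconciliation step (table names and the aggregated column are hypothetical, and this is not the tool's actual implementation), the snippet compares row counts and a simple aggregate between a source snapshot and the migrated target.

```python
# Generic reconciliation sketch: compare row counts and a simple checksum
# between a legacy-source snapshot and the migrated Databricks table.
# All names are hypothetical; assumes `spark` is predefined.
legacy = spark.table("legacy_mirror.sales.orders")  # snapshot of the source
migrated = spark.table("prod.sales.orders")         # Databricks SQL target

checks = {
    "row_count": (legacy.count(), migrated.count()),
    "amount_sum": (
        legacy.agg({"amount": "sum"}).first()[0],
        migrated.agg({"amount": "sum"}).first()[0],
    ),
}

for name, (src, dst) in checks.items():
    status = "OK" if src == dst else "MISMATCH"
    print(f"{name}: source={src} target={dst} -> {status}")
```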