talk-data.com

Topic: Databricks

Tags: big_data, analytics, spark

509 tagged activities

Activity Trend: 515 peak/qtr (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Data + AI Summit 2025

From Code to Insights: Leveraging Advanced Infrastructure and AI Capabilities.

In this talk, we will explore how AI and advanced infrastructure are transforming Insulet's development and operations. We'll highlight how our innovations have reduced scrap part costs through manufacturing analytics, showcasing efficiency and cost savings. On the productivity side, Databricks AI solutions not only identify errors but also fix code and assist in writing complex queries, going beyond suggestions to provide actual solutions. On the infrastructure side, integrating Spark with Databricks simplifies setup and reduces costs. Additionally, Databricks Lakeflow Connect enables real-time updates and simplifies our Salesforce integration with minimal coding. We'll also discuss real-time processing of patient data, demonstrating how Databricks drives efficiency and productivity. Join us to learn how these innovations enhance efficiency, cost savings and performance.

Geospatial Insights With Databricks SQL: Techniques and Applications

Spatial data is increasingly important, but working with it can be complex. In this session, we’ll explore how Databricks SQL supports spatial analysis and helps analysts and engineers get more value from location-based data. We’ll cover what’s coming in the Public Preview of Spatial SQL, when and how to use the new Geometry and Geography data types, and practical use cases for H3. You’ll also learn about common challenges with spatial data and how we're addressing them, along with a look at the near-term roadmap.
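As a rough illustration of the kind of query the session covers, here is a minimal PySpark sketch that buckets point data into H3 cells. The table and column names (`trips`, `pickup_lon`, `pickup_lat`) are hypothetical, and the sketch assumes the built-in H3 expressions such as `h3_longlatash3` available in Databricks SQL; the new Geometry and Geography types discussed in the session are in Public Preview and are not shown here.

```python
# Hypothetical sketch: aggregating point data into H3 cells with Databricks SQL.
# Assumes a `trips` table with `pickup_lon` / `pickup_lat` columns (illustrative names)
# and the built-in h3_longlatash3 expression (Databricks Runtime 11.2+).
hotspots = spark.sql("""
    SELECT
        h3_longlatash3(pickup_lon, pickup_lat, 9) AS h3_cell,   -- resolution 9: roughly block-level cells
        COUNT(*)                                  AS trip_count
    FROM trips
    GROUP BY h3_cell
    ORDER BY trip_count DESC
    LIMIT 20
""")
hotspots.show()
```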

How Anthropic Transforms Financial Services Teams With GenAI

Learn how GenAI is being applied to financial services teams using Claude, an acknowledged leader in large language models. We will share how Claude, integrated with the scale and security of the Databricks Data Intelligence Platform, is enabling financial services organizations to streamline operations, maximize productivity for investment and compliance teams, and in some cases turn traditional cost centers into revenue drivers.

Scaling Modern MDM With Databricks, Delta Sharing and Dun & Bradstreet

Master Data Management (MDM) is the foundation of a successful enterprise data strategy — delivering consistency, accuracy and trust across all systems that depend on reliable data. But how can organizations integrate trusted third-party data to enhance their MDM frameworks? How can they ensure that this master data is securely and efficiently shared across internal platforms and external ecosystems? This session explores how Dun & Bradstreet’s pre-mastered data serves as a single source of truth for customers, suppliers and vendors — reducing duplication and driving alignment across enterprise systems. With Delta Sharing, organizations can natively ingest Dun & Bradstreet data into their Databricks environment and establish a scalable, interoperable MDM framework. Delta Sharing also enables secure, real-time distribution of master data across the enterprise, ensuring that every system operates from a consistent and trusted foundation.
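For context, consuming a Delta Share inside Databricks looks roughly like reading any other Unity Catalog table once the share has been mounted as a catalog. The catalog, schema, table names, and the join key below are hypothetical placeholders, not actual Dun & Bradstreet share names.

```python
# Hypothetical sketch: reading a Delta Sharing table mounted as a Unity Catalog catalog.
# All object names here are illustrative, not real D&B shares.
dnb_companies = spark.read.table("dnb_share.master_data.company_profiles")

# Join the shared, pre-mastered reference data against an internal customer table
# to surface overlaps (assumes a hypothetical `crm.customers` table and join key).
matches = dnb_companies.join(
    spark.read.table("crm.customers"),
    on="duns_number",   # hypothetical join key
    how="inner",
)
matches.show(5)
```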

Sponsored by: Capital One Software | How Capital One Balances Lower Cost and Peak Performance in Databricks

Companies need a lot of data to build and deploy AI models—and they want it quickly. To meet this demand, platform teams are quickly scaling their Databricks usage, resulting in excess cost driven by inefficiencies and performance anomalies. Capital One has over 4,000 users leveraging Databricks to power advanced analytics and machine learning capabilities at scale. In this talk, we’ll share lessons learned from optimizing our own Databricks usage while balancing lower cost with peak performance. Attendees will learn how to identify top sources of waste, best practices for cluster management, tips for user governance and methods to keep costs in check.
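One concrete way to start identifying top sources of spend, assuming your account has Databricks system tables enabled, is to query `system.billing.usage`. The grouping below is a minimal sketch, not Capital One's method, and column availability can vary by release.

```python
# Minimal sketch: ranking Databricks spend by SKU and workspace over the last 30 days.
# Assumes the system.billing.usage system table is enabled in your account.
top_spend = spark.sql("""
    SELECT
        workspace_id,
        sku_name,
        SUM(usage_quantity) AS dbus_consumed
    FROM system.billing.usage
    WHERE usage_date >= DATE_SUB(CURRENT_DATE(), 30)
    GROUP BY workspace_id, sku_name
    ORDER BY dbus_consumed DESC
    LIMIT 20
""")
top_spend.show(truncate=False)
```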

Sponsored by: Domo | Orchestrating Fleet Intelligence with AI Agents and Real-Time IoT With Databricks + DOMO

In today’s logistics landscape, operational continuity depends on real-time awareness and proactive decision-making. This session presents an AI-agent-driven solution built on Databricks that transforms real-time fleet IoT data into autonomous workflows. Streaming telemetry such as bearing vibration data is ingested and analyzed using FFT to detect anomalies. When a critical pattern is found, an AI agent diagnoses root causes and simulates asset behavior as a digital twin, factoring in geolocation, routing, and context. The agent then generates a corrective strategy by identifying service sites, skilled personnel, and parts, estimating repair time, and orchestrating reroutes. It evaluates alternate delivery vehicles and creates transfer plans for critical shipments. The system features human-AI collaboration, enabling teams to review and execute plans. Learn how this architecture reduces downtime and drives resilient, adaptive fleet management.
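To make the FFT step concrete, here is a small, self-contained Python sketch (plain NumPy, outside any streaming pipeline) showing how spectral energy in a bearing fault band might be detected; the sampling rate, fault band, and threshold are invented for illustration, not values from the session.

```python
import numpy as np

# Hypothetical sketch: flag an anomalous bearing frequency in a vibration window.
SAMPLE_RATE_HZ = 10_000          # assumed sensor sampling rate
FAULT_BAND_HZ = (140.0, 160.0)   # assumed bearing fault frequency band
AMPLITUDE_THRESHOLD = 0.5        # assumed alert threshold

def detect_bearing_anomaly(window: np.ndarray) -> bool:
    """Return True if spectral energy in the fault band exceeds the threshold."""
    spectrum = np.abs(np.fft.rfft(window)) / len(window)
    freqs = np.fft.rfftfreq(len(window), d=1.0 / SAMPLE_RATE_HZ)
    band = (freqs >= FAULT_BAND_HZ[0]) & (freqs <= FAULT_BAND_HZ[1])
    return spectrum[band].max() > AMPLITUDE_THRESHOLD

# Example: a healthy 30 Hz signal plus an injected 150 Hz fault component.
t = np.arange(0, 1.0, 1.0 / SAMPLE_RATE_HZ)
signal = np.sin(2 * np.pi * 30 * t) + 1.2 * np.sin(2 * np.pi * 150 * t)
print(detect_bearing_anomaly(signal))   # True
```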

Sponsored by: Lovelytics | From SAP Silos to Supply Chain Superpower: How AI Is Reinventing Planning

Today’s supply chains demand more than historical insights; they need real-time intelligence. In this actionable session, discover how leading enterprises are unlocking the full potential of their SAP data by integrating it with Databricks and AI. See how CPG companies are transforming supply chain planning by combining SAP ERP data with external signals like weather and transportation data, enabling them to predict disruptions, optimize inventory, and make faster, smarter decisions. Powered by Databricks, this solution delivers true agility and resilience through a unified data architecture.

Join us to learn how:
- You can eliminate SAP data silos and make them ML- and AI-ready at scale
- External data sources amplify SAP use cases like forecasting and scenario planning
- AI-driven insights accelerate time-to-action across supply chain operations

Whether you're just starting your data modernization journey or seeking ROI from SAP analytics, this session will show you what’s possible.

AI-Driven Drug Discovery: Accelerating Molecular Insights With NVIDIA and Databricks

This session is repeated. In the race to revolutionize healthcare and drug discovery, biopharma companies are turning to AI to streamline workflows and unlock new scientific insights. In this session, we will explore how NVIDIA BioNeMo, combined with the Databricks Delta Lakehouse, can be used to advance drug discovery for critical applications like molecular structure modeling, protein folding and diagnostics. We’ll demonstrate how BioNeMo pre-trained models can run inference on data securely stored in Delta Lake, delivering actionable insights. By leveraging containerized solutions on Databricks’ ML Runtime with GPU acceleration, users can achieve significant performance gains compared to traditional CPU-based computation.
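As a rough sketch of the read-infer-write pattern described (not NVIDIA's actual BioNeMo API), the flow is: read sequences from a governed Delta table, send them to a GPU-backed inference endpoint, and write predictions back to Delta. The endpoint URL, secret scope, table names, and payload shape below are all placeholders.

```python
import requests

# Hypothetical sketch of the read-infer-write pattern; the endpoint, token handling
# and payload format are placeholders, not the real BioNeMo serving interface.
ENDPOINT_URL = "https://<your-inference-endpoint>/predict"        # placeholder
API_TOKEN = dbutils.secrets.get("biopharma", "inference-token")   # assumed secret scope

sequences = spark.read.table("research.proteins.sequences").limit(100).collect()

predictions = []
for row in sequences:
    resp = requests.post(
        ENDPOINT_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"sequence": row["sequence"]},
        timeout=60,
    )
    resp.raise_for_status()
    predictions.append((row["sequence_id"], resp.text))   # keep raw JSON response as a string

spark.createDataFrame(predictions, ["sequence_id", "prediction_json"]) \
     .write.mode("append").saveAsTable("research.proteins.structure_predictions")
```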

AI Powering Epsilon's Identity Strategy: Unified Marketing Platform on Databricks

Join us to hear about how Epsilon Data Management migrated Epsilon’s unique, AI-powered marketing identity solution from multi-petabyte on-prem Hadoop and data warehouse systems to a unified Databricks Lakehouse platform. This transition enabled Epsilon to further scale its Decision Sciences solution and enable new cloud-based AI research capabilities on time and within budget, without being bottlenecked by the resource constraints of on-prem systems. Learn how Delta Lake, Unity Catalog, MLflow and LLM endpoints powered massive data volume, reduced data duplication, improved lineage visibility, accelerated Data Science and AI, and enabled new data to be immediately available for consumption by the entire Epsilon platform in a privacy-safe way. Using the Databricks platform as the base for AI and Data Science at global internet scale, Epsilon deploys marketing solutions across multiple cloud providers and multiple regions for many customers.

As first-party data becomes increasingly invaluable to organizations, Walmart Data Ventures is dedicated to bringing to life new applications of Walmart’s first-party data to better serve its customers. Through Scintilla, its integrated insights ecosystem, Walmart Data Ventures continues to expand its offerings to deliver insights and analytics that drive collaboration between our merchants, suppliers, and operators. Scintilla users can now access Walmart data using Cloud Feeds, based on Databricks Delta Sharing technologies. In the past, Walmart used API-based data sharing models, which required users to possess certain skills and technical attributes that weren’t always available. Now, with Cloud Feeds, Scintilla users can more easily access data without a dedicated technical team behind the scenes making it happen. Attendees will gain valuable insights into how Walmart has built its robust data sharing architecture and strategies to design scalable and collaborative data sharing architectures in their own organizations.

Demystifying Upgrading to Unity Catalog — Challenges, Design and Execution

Databricks Unity Catalog (UC) is the industry’s only unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. UC provides a single source of truth for an organization’s data and AI, with open connectivity to any data source in any format, plus lineage, monitoring and support for open sharing and collaboration. In this session we will discuss the challenges of upgrading to UC from an existing non-UC Databricks setup. We will walk through a few customer use cases, how we overcame difficulties, and how we created a repeatable pattern and reusable assets to replicate the success of upgrading to UC across some of the largest Databricks customers. This session is co-presented with our partner Celebal Technologies.
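One of the common building blocks for this kind of upgrade is Databricks' SYNC command, which registers existing Hive metastore objects in Unity Catalog. The catalog and schema names below are placeholders, and a dry run is usually worth doing first; note that SYNC applies to external tables, while managed tables typically need a different path (for example CTAS or deep clone).

```python
# Hypothetical sketch: upgrading external Hive metastore tables into Unity Catalog
# with the SYNC command. Catalog/schema names are placeholders.

# Preview what would be upgraded without making changes.
spark.sql("SYNC SCHEMA main.sales FROM hive_metastore.sales DRY RUN").show(truncate=False)

# Perform the actual upgrade once the dry run looks clean.
result = spark.sql("SYNC SCHEMA main.sales FROM hive_metastore.sales")
result.show(truncate=False)
```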

Dusting off the Cobwebs — Moving off a 26-year-old Heritage Platform to Databricks [Teradata]

Join us to hear about how National Australia Bank (NAB) successfully completed a significant milestone in its data strategy by decommissioning its 26-year-old Teradata environment and migrating to a new strategic data platform called 'Ada'. This transition marks a pivotal shift from legacy systems to a modern, cloud-based data and AI platform powered by Databricks. The migration process, which spanned two years, involved ingesting 16 data sources, transferring 456 use cases, and collaborating with hundreds of users across 12 business units. This strategic move positions NAB to leverage the full potential of cloud-native data analytics, enabling more agile and data-driven decision-making across the organization. The successful migration to Ada represents a significant step forward in NAB's ongoing efforts to modernize its data infrastructure and capitalize on emerging technologies in the rapidly evolving financial services landscape.

Empowering Progress: Building a Personalized Training Goal Ecosystem with Databricks

Tonal is the ultimate strength training system, giving you the expertise of a personal trainer and a full gym in your home. Through user interviews and social media feedback, we identified a consistent challenge: members found it difficult to measure their progress in their fitness journey. To address this, we developed the Training Goal (TG) ecosystem, a four-part solution that introduced new preference options to capture users' fitness aspirations, implemented weekly metrics that accumulate as members complete workouts, defined personalized weekly targets to guide progress, and enhanced workout details to show how each session contributes toward individual goals. We present how we leveraged Spark, MLflow, and Workflows within the Databricks ecosystem to compute TG metrics, manage model development, and orchestrate data pipelines. These tools allowed us to launch the TG system on schedule, supporting scalability, reliability, and a more meaningful, personalized way for members to track their progress.
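While the session covers the production implementation, the core of a weekly-accumulating metric is conceptually simple. A hedged PySpark sketch, with invented table and column names rather than Tonal's actual schema, might look like this:

```python
# Hypothetical sketch: weekly training-goal metrics that accumulate as workouts complete.
# Table and column names are illustrative, not Tonal's actual schema.
from pyspark.sql import functions as F

workouts = spark.read.table("fitness.completed_workouts")

weekly_progress = (
    workouts
    .withColumn("week_start", F.date_trunc("week", F.col("completed_at")))
    .groupBy("member_id", "week_start")
    .agg(
        F.sum("volume_lbs").alias("weekly_volume"),
        F.count("*").alias("workouts_completed"),
    )
)

# Compare accumulated volume against each member's personalized weekly target.
targets = spark.read.table("fitness.weekly_targets")   # member_id, target_volume
progress_vs_target = weekly_progress.join(targets, "member_id") \
    .withColumn("pct_of_target", F.col("weekly_volume") / F.col("target_volume"))

progress_vs_target.write.mode("overwrite").saveAsTable("fitness.training_goal_progress")
```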

From Days to Seconds — Reducing Query Times on Large Geospatial Datasets by 99%

The Global Water Security Center translates environmental science into actionable insights for the U.S. Department of Defense. Prior to incorporating Databricks, responding to these requests required querying approximately five hundred thousand raster files representing over five hundred billion points. By leveraging lakehouse architecture, Databricks Auto Loader, Spark Streaming, Databricks Spatial SQL, H3 geospatial indexing and Databricks Liquid Clustering, we were able to drastically reduce our “time to analysis” from multiple business days to a matter of seconds. Now, our data scientists execute queries on pre-computed tables in Databricks, resulting in a “time to analysis” that is 99% faster, giving our teams more time for deeper analysis of the data. Additionally, we’ve incorporated Databricks Workflows, Databricks Asset Bundles, Git and GitHub Actions to support CI/CD across workspaces. We completed this work in close partnership with Databricks.
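The core pattern here — pre-aggregating raster points into H3 cells and letting Liquid Clustering keep the summary table organized by cell — can be sketched as follows. The table names, metric columns, and H3 resolution are assumptions for illustration, not the Center's actual schema, and the real pipeline uses Auto Loader and streaming rather than a one-off INSERT.

```python
# Hypothetical sketch: pre-computing an H3-indexed summary table with Liquid Clustering.
spark.sql("""
    CREATE TABLE IF NOT EXISTS hydrology.h3_summaries (
        h3_cell      BIGINT,
        obs_date     DATE,
        mean_value   DOUBLE,
        obs_count    BIGINT
    )
    CLUSTER BY (h3_cell, obs_date)          -- Liquid Clustering keys
""")

spark.sql("""
    INSERT INTO hydrology.h3_summaries
    SELECT
        h3_longlatash3(lon, lat, 7) AS h3_cell,   -- resolution 7 chosen for illustration
        obs_date,
        AVG(value)                  AS mean_value,
        COUNT(*)                    AS obs_count
    FROM hydrology.raw_points
    GROUP BY h3_cell, obs_date
""")

# Downstream queries now hit the small, clustered summary table instead of raw rasters.
spark.sql("""
    SELECT * FROM hydrology.h3_summaries
    WHERE h3_cell = h3_longlatash3(-87.6, 41.9, 7)
""").show()
```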

GenAI Observability in Customer Care

Customer support is going through the GenAI revolution, but how can we use AI to foster deeper empathy with our end users? To enable this, Earnin has built its GenAI observability platform on Databricks, leveraging Lakeflow Declarative Pipelines, Kafka and Databricks AI/BI. This session covers how we use Lakeflow Declarative Pipelines to monitor our customer care chatbot in near real-time and how we leverage Databricks to better anticipate our customers' needs.
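As a rough sketch of what a declarative monitoring pipeline can look like (using the `dlt` Python module that backs Lakeflow Declarative Pipelines), a flow might ingest chatbot events from Kafka and roll them up into per-minute health metrics. The broker address, topic, and JSON fields below are placeholders, not Earnin's actual pipeline.

```python
import dlt
from pyspark.sql import functions as F

# Hypothetical sketch of a Lakeflow Declarative Pipelines (DLT) flow monitoring
# chatbot conversations from Kafka. Broker, topic and field names are placeholders.

@dlt.table(comment="Raw chatbot events streamed from Kafka")
def chatbot_events_raw():
    return (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "<broker:9092>")   # placeholder
        .option("subscribe", "customer-care-chat")            # placeholder topic
        .load()
        .select(F.col("value").cast("string").alias("payload"), "timestamp")
    )

@dlt.table(comment="Per-minute chatbot health metrics for near-real-time dashboards")
def chatbot_metrics():
    parsed = dlt.read_stream("chatbot_events_raw").select(
        F.get_json_object("payload", "$.conversation_id").alias("conversation_id"),
        F.get_json_object("payload", "$.escalated").cast("boolean").alias("escalated"),
        "timestamp",
    )
    return (
        parsed.withWatermark("timestamp", "10 minutes")
        .groupBy(F.window("timestamp", "1 minute"))
        .agg(
            F.count("*").alias("messages"),
            F.sum(F.col("escalated").cast("int")).alias("escalations"),
        )
    )
```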

In this session, we will explore how Genie, an AI-driven platform, transformed HVAC operational insights by leveraging Databricks offerings like Apache Spark, Delta Lake and the Databricks Data Intelligence Platform.

Key contributions:
- Real-time data processing: Lakeflow Declarative Pipelines and Apache Spark™ for efficient data ingestion and real-time analysis.
- Workflow orchestration: Databricks Data Intelligence Platform to orchestrate complex workflows and integrate various data sources and analytical tools.
- Field data integration: Incorporating real-time field data into design and algorithm development, enabling engineers to make informed adjustments and optimize performance.

By analyzing real-time data from HVAC installations, Genie identified discrepancies between design specs and field performance, allowing engineers to optimize algorithms, reduce inefficiencies and improve customer satisfaction. Discover how Genie revolutionized HVAC management and how to apply these techniques to your own projects.

Geo-Powering Insights: The Art of Spatial Data Integration and Visualization

In this presentation, we will explore how to leverage Databricks' SQL engine to efficiently ingest and transform geospatial data. We'll demonstrate the seamless process of connecting to external systems such as ArcGIS to retrieve datasets, showcasing the platform's versatility in handling diverse data sources. We'll then delve into the power of Databricks Apps, illustrating how you can create custom geospatial dashboards using various frameworks like Streamlit and Flask, or any framework of your choice. This flexibility allows you to tailor your visualizations to your specific needs and preferences. Furthermore, we'll highlight the Databricks Lakehouse's integration capabilities with popular dashboarding tools such as Tableau and Power BI. This integration enables you to combine the robust data processing power of Databricks with the advanced visualization features of these specialized tools.
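As one hedged example of the Databricks Apps pattern described above, a small Streamlit app can query a geospatial summary table through the Databricks SQL connector and plot it. The connection environment variables, table, and column names below are placeholders; in a real Databricks App, authentication may be handled for you.

```python
# Hypothetical sketch of a Streamlit dashboard backed by Databricks SQL.
# Connection details and the table/column names are placeholders.
import os
import streamlit as st
from databricks import sql

st.title("Asset locations")

with sql.connect(
    server_hostname=os.environ["DATABRICKS_HOST"],
    http_path=os.environ["DATABRICKS_HTTP_PATH"],
    access_token=os.environ["DATABRICKS_TOKEN"],
) as conn:
    with conn.cursor() as cursor:
        cursor.execute(
            "SELECT latitude AS lat, longitude AS lon, asset_name FROM geo.assets LIMIT 1000"
        )
        df = cursor.fetchall_arrow().to_pandas()

st.map(df[["lat", "lon"]])   # quick point map
st.dataframe(df)             # tabular view alongside it
```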

High-Throughput ML: Mastering Efficient Model Serving at Enterprise Scale

Ever wondered how industry leaders handle thousands of ML predictions per second? This session reveals the architecture behind high-performance model serving systems on Databricks. We'll explore how to build inference pipelines that efficiently scale to handle massive request volumes while maintaining low latency. You'll learn how to leverage Feature Store for consistent, low-latency feature lookups and implement auto-scaling strategies that optimize both performance and cost.

Key takeaways:
- Determining optimal compute capacity using the QPS × model execution time formula (a worked sketch follows this abstract)
- Configuring Feature Store for high-throughput, low-latency feature retrieval
- Managing cold starts and scaling strategies for latency-sensitive applications
- Implementing monitoring systems that provide visibility into inference performance

Whether you're serving recommender systems or real-time fraud detection models, you'll gain practical strategies for building enterprise-grade ML serving systems.
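The capacity rule of thumb in the first takeaway is easy to sanity-check by hand: at steady state the number of requests in flight is roughly QPS × per-request execution time (Little's law), so that product sets the minimum concurrency to provision. A tiny illustrative calculation with invented numbers:

```python
# Worked example of the QPS × execution-time sizing rule (illustrative numbers only).
qps = 2_000                 # expected peak queries per second
exec_time_s = 0.015         # average model execution time per request (15 ms)
headroom = 1.5              # safety factor for bursts and cold starts

concurrent_requests = qps * exec_time_s          # ~30 requests in flight at steady state
provisioned_concurrency = int(concurrent_requests * headroom)

print(concurrent_requests, provisioned_concurrency)   # 30.0 45
```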

How HP Is Optimizing the 3D Printing Supply Chain Using Delta Sharing

HP’s 3D Print division empowers manufacturers with telemetry data to optimize operations and streamline maintenance. Using Delta Sharing, Unity Catalog and AI/BI dashboards, HP provides a secure, scalable solution for data sharing and analytics. Delta Sharing D2O enables seamless data access, even for customers not on Databricks. Apigee masks private URLs, and Unity Catalog enhances security by managing data assets. Predictive maintenance with Mosaic AI boosts uptime by identifying issues early and alerting support teams. Custom dashboards and sample code let customers run analytics using any supported client, while Apigee simplifies access by abstracting complexity. Insights from AI/BI dashboards help HP refine its data strategy, aligning solutions with customer needs despite the complexity of diverse technologies, fragmented systems and customer-specific requirements. This fosters trust, drives innovation, and strengthens HP as a trusted partner for scalable, secure data solutions.
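The "D2O" (Databricks-to-open) piece means recipients who are not on Databricks can still pull shared tables with the open-source delta-sharing client. As a hedged sketch, the profile file path and the share/schema/table names below are placeholders, not HP's actual share:

```python
# Hypothetical sketch: a recipient outside Databricks reading a shared table
# with the open-source delta-sharing client. Profile and table names are placeholders.
import delta_sharing

# The .share profile file is issued by the data provider.
profile = "/path/to/config.share"
table_url = f"{profile}#printer_telemetry.operations.print_jobs"

# Small tables can be pulled straight into pandas for local analysis.
jobs = delta_sharing.load_as_pandas(table_url)
print(jobs.head())

# Larger tables can instead be read with Spark via the delta-sharing Spark connector:
# jobs_df = delta_sharing.load_as_spark(table_url)
```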

IQVIA's Analytics for Patient Support Services: Transforming Scalability, Performance and Governance

This presentation will explore the transformation of IQVIA's decade-old patient support platform through the implementation of Databricks Data Intelligence Platform. Facing scalability challenges, performance bottlenecks and rising costs, the existing platform required significant redesign to handle growing data volumes and complex analytics. Key issues included static metrics limiting workflow optimization, fragmented data governance and heightened compliance and security demands. By partnering with Customertimes (a Databricks Partner) and adopting Databricks' centralized, scalable analytics solution with enhanced self-service capabilities, IQVIA achieved improved query performance, cost efficiency and robust governance, ensuring operational effectiveness and regulatory compliance in an increasingly complex environment.