talk-data.com

Topic: Databricks
Tags: big_data, analytics, spark
509 activities tagged

Activity trend: peak of 515 activities/quarter (2020-Q1 to 2026-Q1)

Activities

Showing filtered results. Filtering by: Data + AI Summit 2025

Analyst Roadmap to Databricks: From SQL to End-to-End BI

Analysts often begin their Databricks journey by running familiar SQL queries in the SQL Editor, but that’s just the start. In this session, I’ll share the roadmap I followed to expand beyond ad-hoc querying into SQL Editor- and notebook-driven development, scheduled data pipelines, and interactive dashboards, all powered by Databricks SQL and Unity Catalog. You’ll learn how to organize tables with primary-key/foreign-key relationships and enrich them with table and column comments to form a semantic model, using DBSQL features such as RELY constraints. I’ll also show how parameterized dashboards can empower self-service analytics and feed into Genie Spaces. Attendees will walk away with best practices for building a robust BI platform on Databricks, including tips for table design and metadata enrichment. Whether you’re a data analyst or a BI developer, this talk will help you unlock powerful, AI-enhanced analytics workflows.
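
The semantic-model groundwork the session describes can be sketched in a few DDL statements. A minimal sketch, run from a Databricks notebook where `spark` is predefined; the schema and table names are hypothetical, but PRIMARY KEY ... RELY, FOREIGN KEY and COMMENT are the DBSQL features the abstract refers to:

```python
# Hypothetical star-schema tables: informational PK/FK constraints with RELY
# plus table/column comments, which DBSQL and Genie can use as a semantic model.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales.dim_customer (
        customer_id BIGINT NOT NULL COMMENT 'Surrogate key for a customer',
        region      STRING COMMENT 'Sales region, e.g. EMEA',
        CONSTRAINT pk_customer PRIMARY KEY (customer_id) RELY
    ) COMMENT 'One row per customer; loaded nightly from CRM'
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS sales.fct_orders (
        order_id    BIGINT NOT NULL,
        customer_id BIGINT NOT NULL COMMENT 'References sales.dim_customer',
        amount      DECIMAL(18,2) COMMENT 'Order total in USD',
        CONSTRAINT pk_orders PRIMARY KEY (order_id) RELY,
        CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id)
            REFERENCES sales.dim_customer (customer_id)
    ) COMMENT 'One row per order'
""")
```

RELY tells the optimizer it may trust the (unenforced) constraint, enabling rewrites such as join elimination, while the comments double as the metadata enrichment that parameterized dashboards and Genie Spaces build on.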

Building Reliable Agentic AI on Databricks

Agentic AI is the next evolution in artificial intelligence, with the potential to revolutionize the industry. However, its potential is matched only by its risk: without high-quality, trustworthy data, agentic AI can be exponentially dangerous. Join Barr Moses, CEO and Co-Founder of Monte Carlo, to explore how to leverage Databricks' powerful platform to ensure your agentic AI initiatives are underpinned by reliable, high-quality data. Barr will share:

- How data quality impacts agentic AI performance at every stage of the pipeline
- Strategies for implementing data observability to detect and resolve data issues in real time
- Best practices for building robust, error-resilient agentic AI models on Databricks
- Real-world examples of businesses harnessing Databricks' scalability and Monte Carlo’s observability to drive trustworthy AI outcomes

Learn how your organization can deliver more reliable agentic AI and turn the promise of autonomous intelligence into a strategic advantage. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.

From Overwhelmed to Empowered: How SAP is Democratizing Data & AI with Databricks to Solve Problems

Scaling the adoption of data and AI within enterprises is critical for driving transformative business outcomes. Learn how the SAP Experience Garage, SAP’s largest internal enablement and innovation driver, is turning employees into data enthusiasts through the integration of Databricks technologies. The SAP Experience Garage platform brings together colleagues with varying levels of data knowledge and skill in one seamless space. Here, they can explore and use tangible datasets and data science/AI tooling from Databricks, enablement capabilities, and collaborative features to tackle real business challenges and create prototypes that find their way into SAP’s ecosystem.

Sponsored by: Salesforce | Elevate Agentforce experience with Databricks Agents and Zero Copy

See how Agentforce connects with Databricks to create a seamless, intelligent workspace. With zero copy integration, users can access real-time Databricks data without moving or duplicating it. Explore how Agentforce automatically delegates tasks to a Databricks agent and enables end-to-end execution without leaving the flow of work, eliminating the swivel-chair effect. Together, these capabilities power a unified, cross-platform experience that drives faster decisions and smarter outcomes.

The JLL Training and Upskill Program for Our Warehouse Migration to Databricks

Databricks Odyssey is JLL’s bespoke training program designed to upskill data professionals and prepare them for the new world of the data lakehouse. Built on the concepts of learn, practice and certify, participants earn points and move through five levels by completing activities that apply key Databricks features to business problems. Databricks Odyssey facilitates cloud data warehousing migration by providing best-practice frameworks, ensuring efficient use of pay-per-compute platforms. JLL/T Insights and Data fosters a data culture through learning programs that develop in-house talent and create career pathways.

Databricks Odyssey offers:
- JLL-specific hands-on learning
- A gamified 'level up' approach
- Practical, applicable skills

Benefits include:
- Improved platform efficiency
- Enhanced data accuracy and client insights
- Ongoing professional development
- Potential cost savings through better utilization

Accelerating Model Development and Fine-Tuning on Databricks with TwelveLabs

Scaling large language models (LLMs) and multimodal architectures requires efficient data management and computational power. The NVIDIA NeMo Framework, built on Megatron-LM and running on Databricks, is an open-source solution that integrates GPU acceleration and advanced parallelism with the Databricks Delta lakehouse, streamlining workflows for pre-training and fine-tuning models at scale. This session highlights context parallelism, a NeMo capability for parallelizing over the sequence length, which makes it ideal for video datasets with large embeddings. Through the case study of TwelveLabs’ Pegasus-1 model, learn how NeMo empowers scalable multimodal AI development, from text to video processing, setting a new standard for LLM workflows.
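
A rough intuition for context parallelism, independent of any particular framework: the sequence (token) dimension is sharded across devices, so each rank holds and attends over only a fraction of a very long sequence. The toy sketch below illustrates only the memory-partitioning idea; it is not NeMo's API, and a real implementation also coordinates attention across shards with collective communication.

```python
import numpy as np

# Toy illustration: shard a long sequence of video-token embeddings across
# context-parallel ranks along the sequence axis. Sizes are arbitrary.
cp_size = 4                   # number of context-parallel ranks (assumed)
seq_len, hidden = 8192, 512   # a long multimodal token sequence

sequence = np.random.randn(seq_len, hidden).astype(np.float32)

# Each rank keeps a contiguous slice of the sequence dimension, so per-rank
# activation memory for sequence-proportional tensors drops by cp_size.
for rank, shard in enumerate(np.split(sequence, cp_size, axis=0)):
    print(f"rank {rank}: {shard.shape[0]} tokens, {shard.nbytes:,} bytes")
```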

At Zillow, we have increased both the volume and the quality of our dashboards by adopting a modern SDLC with version control and CI/CD. In the past three months, we have released 32 production-grade dashboards and shared them securely across the organization while cutting error rates in half over that span. In this session, we will provide an overview of how we use Databricks Asset Bundles and GitLab CI/CD to create performant dashboards that can be confidently used for mission-critical operations. As a concrete example, we'll explore how Zillow's Data Platform team used this approach to automate our on-call support analysis, pairing our dashboard development strategy with Databricks LLM offerings to create a comprehensive view of actionable performance metrics alongside AI-generated insights and action items drawn from the hundreds of requests that make up our support workload.
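
The deployment step this implies can be approximated with the Databricks CLI's bundle commands. A hedged sketch of what a GitLab CI job might execute; the "prod" target name is an assumption, not Zillow's actual configuration:

```python
import subprocess

# Validate and deploy a Databricks Asset Bundle (dashboards, jobs, etc.
# declared in databricks.yml) to a named target. Assumes the databricks CLI
# is installed and authenticated via environment variables in the CI runner.
def deploy_bundle(target: str = "prod") -> None:
    subprocess.run(["databricks", "bundle", "validate", "--target", target],
                   check=True)
    subprocess.run(["databricks", "bundle", "deploy", "--target", target],
                   check=True)

if __name__ == "__main__":
    deploy_bundle()
```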

Comprehensive Data Warehouse Migrations to Databricks SQL

This session is repeated. Databricks has a free, comprehensive solution for migrating legacy data warehouses from a wide range of source systems. See how we accelerate migrations from legacy data warehouses to Databricks SQL, achieving 50% faster migration than traditional methods. We'll cover the tool’s automated migration process:

- Discovery: source system profiling
- Assessment: legacy code analysis
- Conversion: advanced code transpilation
- Reconciliation: data validation

This comprehensive approach increases the predictability of migration projects, allowing businesses to plan and execute migrations with greater confidence.

Democratizing Data in a Regulated Industry: Best Practices and Outcomes With J.P. Morgan Payments

Join our 2024 Databricks Disruptor award winners for a session on how they leveraged the Databricks and AWS platforms to build an internal technology marketplace in the highly regulated banking industry, empowering end-users to innovate and own their data sets while maintaining strict compliance. In this talk, leaders from the J.P. Morgan Payments Data team share how they’ve done it, from keeping customer needs at the center of all decision-making to promoting a culture of experimentation. They’ll also expand on how the J.P. Morgan Payments product team now leverages the data platform they’ve built to create customer products, including Cash Flow Intelligence.

FinOps at Scale: Best Practices for Cost-Efficient Growth on Databricks

This session is repeated. You’ve seen your usage grow on Databricks across departments, use cases, product lines and users. What can you do to ensure the platform’s end-users (data practitioners) remain cost-efficient and productive while staying accountable to your budget? We’ll discuss spend monitoring, chargeback models and developing a culture of cost efficiency using Databricks tools.
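
Spend monitoring and chargeback of this kind typically start from the billing system tables. A minimal sketch, run in a Databricks notebook where `spark` is predefined; the `cost_center` tag key is an assumption about how workloads are tagged:

```python
# Estimate monthly list-price spend per cost center from system tables.
spend = spark.sql("""
    SELECT u.custom_tags['cost_center']              AS cost_center,
           DATE_TRUNC('month', u.usage_date)         AS month,
           SUM(u.usage_quantity * p.pricing.default) AS est_list_dollars
    FROM system.billing.usage u
    JOIN system.billing.list_prices p
      ON u.sku_name = p.sku_name
     AND u.usage_unit = p.usage_unit
     AND u.usage_end_time >= p.price_start_time
     AND (p.price_end_time IS NULL OR u.usage_end_time < p.price_end_time)
    GROUP BY ALL
""")
spend.display()  # feed this into dashboards, budgets or alerts
```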

How Databricks Powers Real-Time Threat Detection at Barracuda XDR

As cybersecurity threats grow in volume and complexity, organizations must efficiently process security telemetry for best-in-class detection and mitigation. Barracuda’s XDR platform is redefining security operations by layering advanced detection methodologies over a broad range of supported technologies. Our vision is to deliver unparalleled protection through automation, machine learning and scalable detection frameworks, ensuring threats are identified and mitigated quickly. To achieve this, we have adopted Databricks as the foundation of our security analytics platform, providing greater control and flexibility while decoupling from traditional SIEM tools. By leveraging Lakeflow Declarative Pipelines, Spark Structured Streaming and detection-as-code CI/CD pipelines, we have built a real-time detection engine that enhances scalability, accuracy and cost efficiency. This session explores how Databricks is shaping the future of XDR through real-time analytics and cloud-native security.
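
One way to picture detection-as-code on this stack: each rule is a small Structured Streaming job that reads telemetry from Delta and appends alerts downstream. A hedged sketch; the table names, columns and the brute-force rule are illustrative assumptions, not Barracuda's production logic:

```python
from pyspark.sql import functions as F

# Flag source IPs with >= 20 failed auth events in a 5-minute window.
telemetry = spark.readStream.table("security.bronze_auth_events")

alerts = (
    telemetry
    .withWatermark("event_time", "10 minutes")
    .where(F.col("outcome") == "failure")
    .groupBy(F.window("event_time", "5 minutes"), "source_ip")
    .agg(F.count("*").alias("failures"))
    .where(F.col("failures") >= 20)
)

(alerts.writeStream
    .option("checkpointLocation", "/Volumes/security/detections/chk/brute_force")
    .trigger(processingTime="1 minute")
    .toTable("security.gold_alerts"))
```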

Let's Save Tons of Money With Cloud-Native Data Ingestion!

Delta Lake is a fantastic technology for quickly querying massive data sets, but first you need those massive data sets! In this session we will dive into the cloud-native architecture Scribd has adopted to ingest data from AWS Aurora, SQS, Kinesis Data Firehose and more. By using off-the-shelf open-source tools like kafka-delta-ingest, oxbow and Airbyte, Scribd has redefined its ingestion architecture to be more event-driven, more reliable and, most importantly, cheaper. No jobs needed! Attendees will learn how to use third-party tools in concert with a Databricks and Unity Catalog environment to provide a highly efficient and available data platform. The architecture will be presented in the context of AWS but can be adapted for Azure, Google Cloud Platform or even on-premises environments.
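
To make the "no jobs needed" pattern concrete: the heavy lifting is done by small daemons rather than Spark clusters. A simplified Python stand-in for that event-driven loop (the queue URL, table path and message shape are hypothetical; kafka-delta-ingest and oxbow implement the same idea natively):

```python
import json
import boto3
import pandas as pd
from deltalake import write_deltalake

# Drain an SQS queue and append its JSON records to a Delta table on S3.
# AWS credentials come from the environment, as usual for boto3.
sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ingest-events"

resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10,
                           WaitTimeSeconds=20)
messages = resp.get("Messages", [])

if messages:
    records = [json.loads(m["Body"]) for m in messages]
    write_deltalake("s3://example-bucket/lake/events",
                    pd.DataFrame(records), mode="append")
    for m in messages:  # only ack after a successful commit
        sqs.delete_message(QueueUrl=QUEUE_URL,
                           ReceiptHandle=m["ReceiptHandle"])
```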

Power BI and Databricks: Practical Best Practices

This session is repeated. Power BI has long been the dominant BI tool in the market. In this session, we'll discuss how to get the most out of Power BI and Databricks, beginning with high-level architecture and moving down into detailed how-to guides for troubleshooting common failure points. At the end, you'll receive a cheat sheet that summarizes those best practices in an easy-to-reference format.

Scaling Demand Forecasting at Nikon: Automating Camera Accessories Sales Planning with Databricks

At Nikon, camera accessories are essential to meeting the diverse needs of professional photographers worldwide, making their timely availability a priority. Forecasting accessories, however, presents unique challenges, including dependencies on parent products, sparse demand patterns and managing predictions for thousands of items across global subsidiaries. To address this, we leveraged Databricks' unified data and AI platform to develop and deploy an automated, scalable solution for accessory sales planning. Our solution employs a hybrid approach that auto-selects the best algorithm from a suite of ML and time-series models, incorporating anomaly detection and methods to handle sparse and low-demand scenarios. MLflow automates model logging and versioning, enabling efficient management and scalable deployment. The framework includes data preparation, model selection and training, performance tracking, prediction generation, and output processing for downstream systems.
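
The auto-selection step can be pictured as a backtesting loop tracked in MLflow. A hedged sketch with trivial baseline forecasters standing in for the real ML and time-series models; the metric and candidate set are assumptions:

```python
import mlflow
import numpy as np

def mape(y_true, y_pred):
    # Mean absolute percentage error over the holdout window.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) /
                                np.maximum(np.abs(y_true), 1e-9))))

def select_best(y_train, y_holdout, candidates):
    """candidates: name -> callable(history, horizon) returning a forecast."""
    best_name, best_err = None, float("inf")
    for name, forecast in candidates.items():
        with mlflow.start_run(run_name=name):
            preds = forecast(y_train, len(y_holdout))
            err = mape(y_holdout, preds)
            mlflow.log_param("algorithm", name)
            mlflow.log_metric("holdout_mape", err)
            if err < best_err:
                best_name, best_err = name, err
    return best_name, best_err

# Stand-in candidates; real ones would be per-item ML/time-series models.
candidates = {
    "naive_last": lambda y, h: [y[-1]] * h,
    "moving_avg": lambda y, h: [float(np.mean(y[-4:]))] * h,
}
print(select_best([10, 12, 9, 11, 13, 12], [12, 12], candidates))
```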

Unlocking the Future of Dairy Farming: Leveraging Data Marketplaces at Lely

Lely, a Dutch company specializing in dairy farming robotics, helps farmers with advanced solutions for milking, feeding and cleaning. This session explores Lely’s implementation of an Internal Data Marketplace, built around Databricks' Private Exchange Marketplace. The marketplace serves as a central hub for data teams and business users, offering seamless access to data, analytics and dashboards. Powered by Delta Sharing, it enables secure, private listing of data products across business domains, including notebooks, views, models and functions. This session covers the pros and cons of this approach, best practices for setting up a data marketplace and its impact on Lely’s operations. Real-world examples and insights will showcase the potential of integrating data-driven solutions into dairy farming. Join us to discover how data innovation drives the future of dairy farming through Lely’s experience.

AI Agents in Action: Structuring Unstructured Data on Demand With Databricks and Unstructured

LLM agents aren’t just answering questions — they’re running entire workflows. In this talk, we’ll show how agents can autonomously ingest, process and structure unstructured data using Unstructured, with outputs flowing directly into Databricks. Powered by the Model Context Protocol (MCP), agents can interface with Unstructured’s full suite of capabilities — discovering documents across sources, building ephemeral workflows and exporting structured insights into Delta tables. We’ll walk through a demo where an agent responds to a natural language request, dynamically pulls relevant documents, transforms them into usable data and surfaces insights — fast. Join us for a sneak peek into the future of AI-native data workflows, where LLMs don’t just assist — they operate.

Breaking Silos: Cigna’s Journey to Seamless Data Sharing with Delta Sharing

As data ecosystems grow increasingly complex, the ability to share data securely, seamlessly, and in real time has become a strategic differentiator. In this session, Cigna will showcase how Delta Sharing on Databricks has enabled them to modernize data delivery, reduce operational overhead, and unlock new market opportunities. Learn how Cigna achieved significant savings by streamlining operations, compute, and platform overhead for just one use case. Explore how decentralizing data ownership—transitioning from hyper-centralized teams to empowered product owners—has simplified delivery and accelerated innovation. Most importantly, see how this modern open data-sharing framework has positioned Cigna to win contracts they previously couldn’t, by enabling real-time, cross-organizational data collaboration with external partners. Join us to hear how Cigna is using Delta Sharing not just as a technical enabler, but as a business catalyst.
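
From a recipient's point of view, consuming a Delta Sharing table needs only the open-source client and a credential file from the provider. A minimal sketch; the profile path and share/schema/table names are hypothetical, not Cigna's actual shares:

```python
import delta_sharing

# The provider issues a .share profile file containing endpoint and token.
profile = "config.share"
table_url = f"{profile}#claims_share.curated.member_claims"

df = delta_sharing.load_as_pandas(table_url)  # no copy pipelines required
print(df.head())
```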

Cracking Complex Documents with Databricks Mosaic AI

In this session, we will share how we are transforming the way organizations process unstructured and non-standard documents using Mosaic AI and agentic patterns within the Databricks ecosystem. We have developed a scalable pipeline that turns complex legal and regulatory content into structured, tabular data. We will walk through the full architecture, which includes Unity Catalog for secure and governed data access, Databricks Vector Search for intelligent indexing and retrieval, and Databricks Apps to deliver clear insights to business users. The solution supports multiple languages and formats, making it suitable for teams working across different regions. We will also discuss some of the key technical challenges we addressed, including handling parsing inconsistencies, grounding model responses and ensuring traceability across the entire process. If you are exploring how to apply GenAI and large language models, this session is for you. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
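
The retrieval piece of such a pipeline might look like the sketch below, using the Databricks Vector Search Python client; the endpoint, index and column names are hypothetical, and the surrounding parsing/agent orchestration is omitted:

```python
from databricks.vector_search.client import VectorSearchClient

# Query an index built over parsed document chunks governed in Unity Catalog.
client = VectorSearchClient()
index = client.get_index(
    endpoint_name="doc-intel-endpoint",
    index_name="legal.docs.chunks_index",
)
hits = index.similarity_search(
    query_text="termination clauses with notice periods",
    columns=["doc_id", "chunk_text"],
    num_results=5,
)
print(hits)
```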

Sponsored by: Deloitte | Accelerating Biopharmaceutical Breakthroughs with an Innovative Enterprise Data Strategy

In the rapidly evolving life sciences and healthcare industry, leveraging data-as-a-product is crucial for driving innovation and achieving business objectives. Join us to explore how Deloitte is revolutionizing data strategy solutions by overcoming challenges such as data silos, poor data quality, and lack of real-time insights with the Databricks Data Intelligence Platform. Learn how effective data governance, seamless data integration, and scalable architectures support personalized medicine, regulatory compliance, and operational efficiency. This session will highlight how these strategies enable biopharma companies to transform data into actionable insights, accelerate breakthroughs and enhance life sciences outcomes.

Sponsored by: SAP | SAP and Databricks Open a Bold New Era of Data and AI​

SAP and Databricks have formed a landmark partnership that brings together SAP's deep expertise in mission-critical business processes and semantically rich data with Databricks' industry-leading capabilities in AI, machine learning, and advanced data engineering. From curated, SAP-managed data products to zero-copy Delta Sharing integration, discover how SAP Business Data Cloud empowers data and AI professionals to build AI solutions that unlock unparalleled business insights using trusted business data.