
Topic: Databricks

Tags: big_data, analytics, spark

Activity Trend: 515 peak/qtr (2020-Q1 to 2026-Q1)

Activities (1286, newest first)

Beyond Chatbots: Building Autonomous Insurance Applications With Agentic AI Framework

The insurance industry is at the crossroads of digital transformation, facing challenges from market competition and customer expectations. While conventional ML applications have historically provided capabilities in this domain, the emergence of Agentic AI frameworks presents a revolutionary opportunity to build truly autonomous insurance applications. We will address issues related to data governance and quality, and discuss how to monitor and evaluate fine-tuned models. We'll demonstrate the application of the agentic framework in the insurance context and how these autonomous agents can work collaboratively to handle complex insurance workflows — from submission intake and risk evaluation to expedited quote generation. This session demonstrates how to architect intelligent insurance solutions using Databricks Mosaic AI agentic core components including Unity Catalog, Playground, model evaluation/guardrails, privacy filters, AI functions and AI/BI Genie.
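
To make the AI functions piece concrete, here is a minimal, hedged sketch of calling the ai_query SQL function from PySpark to draft a risk summary per submission; the endpoint name, table and column names are illustrative assumptions rather than details from the session.

```python
# Minimal sketch; assumes a model serving endpoint "claims-llm" and a Unity
# Catalog table insurance.submissions with a free-text description column
# (all hypothetical names).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

risk_summaries = spark.sql("""
    SELECT
      submission_id,
      ai_query(
        'claims-llm',
        CONCAT('Summarize the underwriting risk in this submission: ', description)
      ) AS risk_summary
    FROM insurance.submissions
""")
risk_summaries.show(truncate=False)
```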

Breaking Up With Spark Versions: Client APIs, AI-Powered Automatic Updates, and Dependency Management for Databricks Serverless

This session explains how we've made our Apache Spark™ versionless for end users by introducing a stable client API, environment versioning and automatic remediation. These capabilities have enabled auto-upgrade of hundreds of millions of workloads with minimal disruption for Serverless Notebooks and Jobs. We'll also introduce a new approach to dependency management using environments. Admins will learn how to speed up package installation with Default Base Environments, and users will see how to manage custom environments for their own workloads.

Databricks + Apache Iceberg™: Managed and Foreign Tables in Unity Catalog

Unity Catalog support for Apache Iceberg™ brings open, interoperable table formats to the heart of the Databricks Lakehouse. In this session, we’ll introduce new capabilities that allow you to write Iceberg tables from any REST-compatible engine, apply fine-grained governance across all data, and unify access to external Iceberg catalogs like AWS Glue, Hive Metastore, and Snowflake Horizon. Learn how Databricks is eliminating data silos, simplifying performance with Predictive Optimization, and advancing a truly open lakehouse architecture with Delta and Iceberg side by side.
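
As a rough illustration of the REST-catalog interoperability described above, the hedged sketch below reads a Unity Catalog Iceberg table from an external engine with PyIceberg; the workspace host, endpoint path, catalog and table names are assumptions for illustration.

```python
# Hedged sketch; assumes Unity Catalog's Iceberg REST endpoint and a table
# sales.orders inside a UC catalog named "main" (all names illustrative).
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "unity",
    **{
        "type": "rest",
        "uri": "https://<workspace-host>/api/2.1/unity-catalog/iceberg",  # assumed endpoint path
        "token": "<personal-access-token>",
        "warehouse": "main",  # UC catalog exposed as the Iceberg warehouse
    },
)

table = catalog.load_table("sales.orders")  # schema.table within that catalog
print(table.scan().to_arrow())              # read the table via the REST catalog
```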

Got Metrics? Build a Metric Store — A Tour of Developing Metrics Through UC Metric Views

I have metrics, you have metrics — we all have metrics. But the real problem isn’t having metrics, it’s that the numbers never line up, leading to endless cycles of reconciliation and confusion. Join us as we share how our Data Team at Databricks tackled this fundamental challenge in Business Intelligence by building an internal Metric Store — creating a single source of truth for all business metrics using the newly-launched UC Metric Views. Imagine a world where numbers always align, metric definitions are consistently applied across the organization and every metric comes with built-in ML-based forecasting, AI-powered anomaly detection and automatic explainability. That’s the future we’ve built — and we’ll show you how you can get started today.
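
As a loose sketch of what a metric-store workflow could look like with UC Metric Views, the example below defines one YAML-based metric view and queries it with MEASURE(); the DDL shape, table and column names are assumptions, so treat it as illustrative rather than exact syntax.

```python
# Loose sketch only; the exact metric view DDL may differ. Assumes a table
# main.finance.orders with region and amount columns (hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
CREATE VIEW main.finance.revenue_metrics
WITH METRICS
LANGUAGE YAML
AS $$
version: 0.1
source: main.finance.orders
dimensions:
  - name: region
    expr: region
measures:
  - name: total_revenue
    expr: SUM(amount)
$$
""")

# Every consumer computes the measure from the same definition, so the numbers line up.
spark.sql("""
SELECT region, MEASURE(total_revenue) AS total_revenue
FROM main.finance.revenue_metrics
GROUP BY region
""").show()
```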

Lakeflow Observability: From UI Monitoring to Deep Analytics

Monitoring data pipelines is key to reliability at scale. In this session, we’ll dive into the observability experience in Lakeflow, Databricks’ unified data engineering solution — from intuitive UI monitoring to advanced event analysis, cost observability and custom dashboards. We’ll walk through the revamped UX for Lakeflow observability, showing how to:

- Monitor runs, task states, dependencies and retry behavior in the UI
- Set up alerts for job and pipeline outcomes and failures
- Use pipeline and job system tables for historical insights (a minimal sketch follows below)
- Explore run events and event logs for root cause analysis
- Analyze metadata to understand and optimize pipeline spend
- Build custom dashboards using system tables to track performance, data quality, freshness, SLAs and failure trends, and drive automated alerting based on real-time signals

This session will help you unlock full visibility into your data workflows.
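
The sketch referenced in the list above queries the Lakeflow job system table for recent non-successful runs; the table and column names reflect our understanding of the documented system schema, and the window is an arbitrary assumption.

```python
# Hedged sketch; assumes read access to system.lakeflow.job_run_timeline and
# its job_id, run_id, result_state and period_end_time columns.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

failed_runs = spark.sql("""
    SELECT job_id, run_id, result_state, period_end_time
    FROM system.lakeflow.job_run_timeline
    WHERE result_state IS NOT NULL
      AND result_state <> 'SUCCEEDED'
      AND period_end_time >= date_sub(current_date(), 7)
    ORDER BY period_end_time DESC
""")
failed_runs.show(truncate=False)
```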

Latest Innovations in AI/BI Dashboards and Genie

Discover how the latest innovations in Databricks AI/BI Dashboards and Genie are transforming self-service analytics. This session offers a high-level tour of new capabilities that empower business users to ask questions in natural language, generate insights faster and make smarter decisions. Whether you're a long-time Databricks user or just exploring what's possible with AI/BI, you'll walk away with a clear understanding of how these tools are evolving — and how to leverage them for greater business impact.

Performance Best Practices for Fast Queries, High Concurrency, and Scaling on Databricks SQL

Data warehousing in enterprise and mission-critical environments needs special consideration for price/performance. This session will explain how Databricks SQL addresses the most challenging requirements for high-concurrency, low-latency performance at scale. We will also cover the latest advancements in resource-based scheduling, autoscaling and caching enhancements that allow for seamless performance and workload management.

Securely Deploying AI/BI to All Users in Your Enterprise

Bringing AI/BI to every business user starts with getting security, access and governance right. In this session, we’ll walk through the latest best practices for configuring Databricks accounts, setting up workspaces, and managing authentication protocols to enable secure and scalable onboarding. Whether you're supporting a small team or an entire enterprise, you'll gain practical insights to protect your data while ensuring seamless and governed access to AI/BI tools.

Supercharging Sales Intelligence: Processing Billions of Events via Structured Streaming

DigiCert is a digital security company that provides digital certificates, encryption and authentication services and serves 88% of the Fortune 500, securing over 28 billion web connections daily. Our project aggregates and analyzes certificate transparency logs via public APIs to provide comprehensive market and competitive intelligence. Instead of relying on third-party providers with limited data, our project gives full control, deeper insights and automation. Databricks has helped us reliably poll public APIs at scale, fetching millions of events daily, and deduplicate and store those events in our Delta tables. We specifically use Spark for parallel processing, Structured Streaming for real-time ingestion and deduplication, Delta tables for data reliability, and pools and jobs to ensure our costs are optimized. These technologies help us keep our data fresh, accurate and cost effective. This data has helped our sales team with real-time intelligence, ensuring DigiCert's success.
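
A minimal, hedged sketch of the ingest-and-deduplicate pattern described above is shown below; the landing path, identifiers and table names are hypothetical, and the real pipeline is certainly more involved.

```python
# Hedged sketch; assumes JSON event files with event_id and event_time
# (timestamp) fields landing in cloud storage, and Spark 3.5+ / a recent DBR
# for dropDuplicatesWithinWatermark (all paths and names hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

events = (
    spark.readStream.format("cloudFiles")              # Auto Loader: incremental file ingestion
    .option("cloudFiles.format", "json")
    .load("s3://example-bucket/ct-log-events/")         # hypothetical landing path
    .withWatermark("event_time", "1 hour")               # bound the state kept for dedup
    .dropDuplicatesWithinWatermark(["event_id"])         # drop replayed or duplicate events
)

query = (
    events.writeStream
    .option("checkpointLocation", "s3://example-bucket/checkpoints/ct_events/")
    .trigger(availableNow=True)                           # process available data, then stop
    .toTable("main.sales_intel.ct_events")                # Delta table governed by Unity Catalog
)
```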

Tracking Data and AI Lineage: Ensuring Transparency and Compliance

As AI becomes more deeply integrated into data platforms, understanding where data comes from — and where it goes — is essential for ensuring transparency, compliance and trust. In this session, we’ll explore the newest advancements in data and AI lineage across the Databricks Platform, including during model training, evaluation and inference. You’ll also learn how lineage system tables can be used for impact analysis and to gain usage insights across your data estate. We’ll cover newly released capabilities — such as Bring Your Own Lineage — that enable an end-to-end view of your data and AI assets in Unity Catalog. Plus, get a sneak peek at what’s coming next on the lineage roadmap!
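
To ground the system-table point, here is a hedged example of a downstream impact-analysis query over the lineage system table; the source table is a made-up name and the column list reflects our understanding of the documented schema.

```python
# Hedged sketch; assumes read access to system.access.table_lineage and a
# source table named main.sales.orders (hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

downstream = spark.sql("""
    SELECT DISTINCT target_table_full_name, entity_type, event_time
    FROM system.access.table_lineage
    WHERE source_table_full_name = 'main.sales.orders'
      AND target_table_full_name IS NOT NULL
    ORDER BY event_time DESC
""")
downstream.show(truncate=False)
```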

Unlocking the Databricks Marketplace: A Hands-On Guide for Data Consumers and Providers

Curious about how to get real value from the Databricks Marketplace—whether you're consuming data or sharing it? This demo-heavy session answers the top 10 questions we hear from both data consumers and providers, with real examples you can put into practice right away. We’ll show consumers how to find the right product listing (whether that's tables, files, AI models, solution accelerators or Partner Connect integrations), try it out using sample notebooks, and access it with ease. You’ll also see how the Private Marketplace helps teams work more efficiently with a curated catalog of approved data. For providers, learn how to list your product in a way that stands out, use notebooks and documentation to help users get started, reach new audiences, and securely share data across your company or with trusted partners using the Private Marketplace. If you’ve ever asked, “How do I get started?” or “How do I make my data available internally or externally?”—this session has the answers, with demos to match.

What’s New in Databricks SQL: Latest Features and Live Demos

Databricks SQL has added significant features at a fast pace over the last year. This session will share the most impactful features and the customer use cases that inspired them. We will highlight the new SQL editor, SQL coding features, streaming tables and materialized views, BI integrations, cost management features, system tables and observability features, and more. We will also share AI-powered performance optimizations.
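
As a hedged taste of two of the features mentioned, the sketch below creates a streaming table for incremental ingestion and a materialized view on top of it; these statements are normally issued against a SQL warehouse, and the path, names and columns are illustrative assumptions.

```python
# Hedged sketch; assumes a JSON landing path in a UC volume and order_date /
# amount columns in the ingested data (all names hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Streaming table: incremental ingestion of newly arriving files.
spark.sql("""
    CREATE OR REFRESH STREAMING TABLE raw_orders
    AS SELECT * FROM STREAM read_files('/Volumes/main/sales/landing/orders/', format => 'json')
""")

# Materialized view: a precomputed aggregate kept fresh by Databricks.
spark.sql("""
    CREATE OR REPLACE MATERIALIZED VIEW daily_order_totals
    AS SELECT order_date, SUM(amount) AS total_amount
       FROM raw_orders
       GROUP BY order_date
""")
```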

Kill Bill-ing? Revenge is a Dish Best Served Optimized with GenAI

In an era where cloud costs can spiral out of control, Sportsbet achieved a remarkable 49% reduction in Total Cost of Ownership (TCO) through an innovative AI-powered solution called 'Kill Bill.' This presentation reveals how we transformed Databricks' consumption-based pricing model from a challenge into a strategic advantage through intelligent automation and optimization. You will:

- Understand how to use GenAI to reduce Databricks TCO
- See how leveraging generative AI within Databricks solutions enables automated analysis of cluster logs, resource consumption, configurations and codebases to provide Spark optimization suggestions (a minimal sketch follows below)
- Create AI agentic workflows by integrating Databricks' AI tools and Databricks Data Engineering tools
- Review a case study demonstrating how Total Cost of Ownership was reduced in practice

Attendees will leave with a clear understanding of how to implement AI within Databricks solutions to address similar cost challenges in their environments.
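
The sketch referenced in the list above is a hedged, simplified version of that idea: rank recent consumption from the billing system table and ask a model serving endpoint for tuning suggestions via ai_query. The endpoint name is hypothetical and the prompt is deliberately naive.

```python
# Hedged sketch; assumes access to system.billing.usage and a serving endpoint
# named "cost-advisor-llm" (both illustrative, not details from the session).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

suggestions = spark.sql("""
    WITH top_spend AS (
        SELECT sku_name, SUM(usage_quantity) AS dbus_30d
        FROM system.billing.usage
        WHERE usage_date >= date_sub(current_date(), 30)
        GROUP BY sku_name
        ORDER BY dbus_30d DESC
        LIMIT 10
    )
    SELECT sku_name,
           dbus_30d,
           ai_query(
               'cost-advisor-llm',
               CONCAT('Suggest Spark/Databricks cost optimizations for workloads billed under SKU ', sku_name)
           ) AS suggestion
    FROM top_spend
""")
suggestions.show(truncate=False)
```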

Solving Exclusive Data Access With Role-Based Access Control

Do you have users who wear multiple hats over the course of a day? For example, working with data from various customers and hoping they don’t inadvertently aggregate it? Or working on sensitive datasets, such as clinical trials, that should not be combined or that are subject to regulations? We have a solution! In this session, we will present a new capability that allows users wearing multiple hats to switch roles in the Databricks workspace to work exclusively on a dedicated project, the data of a particular client, or a clinical trial. When switching to a particular role, the workspace adapts so that only workspace objects and UC data of that particular role are accessible. We will also showcase the administrative experience of setting up exclusive access using groups and UC permissions.
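
The role-switching capability itself is new, but the group-and-permission setup the session mentions can be sketched with standard Unity Catalog grants; the catalog, schema and group names below are hypothetical.

```python
# Hedged sketch; assumes account-level groups trial_a_analysts / trial_b_analysts
# and one schema per project already exist (all names hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Each group is only ever granted access to its own project's schema.
spark.sql("GRANT USE CATALOG ON CATALOG clinical TO `trial_a_analysts`")
spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA clinical.trial_a TO `trial_a_analysts`")

spark.sql("GRANT USE CATALOG ON CATALOG clinical TO `trial_b_analysts`")
spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA clinical.trial_b TO `trial_b_analysts`")
```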

Solving Health AI’s Data Problem

AI in healthcare has a data problem. Fragmented data remains one of the biggest challenges and bottlenecks the development and deployment of AI solutions across life sciences, payers, and providers. Legacy paper-driven workflows and fragmented technology perpetuate silos, making it difficult to create a comprehensive, real-time picture of patient health. Datavant is leveraging Databricks and AWS technology to solve this problem at scale. Through our partnership with Databricks, we are centralizing storage of clinical data from what is arguably the largest health data network so that we can transform it into structured, AI-ready data – and shave off 80 percent of the work of deploying a new AI use case. Learn how we are handling the complexity of this effort while preserving the integrity of source data. We’ll also share early use cases now available to our healthcare customers.

Sponsored by: DataHub | Beyond the Lakehouse: Supercharging Databricks with Contextual Intelligence

While Databricks powers your data lakehouse, DataHub delivers the critical context layer connecting your entire ecosystem. We'll demonstrate how DataHub extends Unity Catalog to provide comprehensive metadata intelligence across platforms. With DataHub's real-time platform you can:

- Cut AI model time-to-market with our unified REST and GraphQL APIs that ensure models train on reliable and compliant data from across platforms, with complete lineage tracking
- Decrease data incidents by 60% using our event-driven architecture that instantly propagates changes across systems
- Transform data discovery from days to minutes with AI-powered search and natural language interfaces

Leaders use DataHub to transform Databricks data into integrated insights that drive business value. See our demo of syncback technology—detecting sensitive data and enforcing Databricks access controls automatically—plus our AI assistant that enhances LLMs with cross-platform metadata.

Sponsored by: definity | How You Could Be Saving 50% of Your Spark Costs

Enterprise lakehouse platforms are rapidly scaling – and so are complexity and cost. After monitoring over 1B vCore-hours across Databricks and other Apache Spark™ environments, we consistently saw resource waste, preventable data incidents, and painful troubleshooting. Join this session to discover how definity’s unique full-stack observability provides job-level visibility in-motion, unifying infrastructure performance, pipeline execution, and data behavior, and see how enterprise teams use definity to easily optimize jobs and save millions – while proactively ensuring SLAs, preventing issues, and simplifying RCA.

Sponsored by: Retool | Retooling Intelligence: Build Scalable, Secure AI Agents for the Enterprise with Databricks + Retool

Enterprises need AI agents that are both powerful and production-ready while being scalable and secure. In this lightning session, you’ll learn how to leverage Retool’s platform and Databricks to design, deploy, and manage intelligent agents that automate complex workflows. We’ll cover best practices for integrating real-time Databricks data, enforcing governance, and ensuring scalability, all while avoiding common pitfalls. Whether you’re automating internal ops or customer-facing tasks, walk away with a blueprint for shipping AI agents that actually work in the real world.

Summit Live: AI/BI Genie & Dashboards - Talk With Your Data With GenAI Powered Business Intelligence

AI/BI Genie lets anyone simply talk with their own data using natural language, fully secured through UC, to provide accurate answers within the context of your organization. AI/BI Dashboards go beyond traditional BI tools, empowering everyone to self-serve immediate, interactive visuals on their own secured data. Hear from a customer and Databricks experts on the latest developments.