talk-data.com

Topic: AI/ML (Artificial Intelligence/Machine Learning)

Tags: data_science, algorithms, predictive_analytics

9014 activities tagged

Activity Trend

Peak 1532 activities per quarter, 2020-Q1 to 2026-Q1

Activities

9014 activities · Newest first

Tracking Data and AI Lineage: Ensuring Transparency and Compliance

As AI becomes more deeply integrated into data platforms, understanding where data comes from — and where it goes — is essential for ensuring transparency, compliance and trust. In this session, we’ll explore the newest advancements in data and AI lineage across the Databricks Platform, including during model training, evaluation and inference. You’ll also learn how lineage system tables can be used for impact analysis and to gain usage insights across your data estate. We’ll cover newly released capabilities — such as Bring Your Own Lineage — that enable an end-to-end view of your data and AI assets in Unity Catalog. Plus, get a sneak peek at what’s coming next on the lineage roadmap!
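As one concrete illustration of the impact-analysis use case, here is a minimal sketch that queries the Unity Catalog lineage system table for everything downstream of a given table. It assumes the documented system.access.table_lineage schema, and the table name 'main.sales.orders' is hypothetical; verify both against your workspace.

```python
# Minimal sketch: downstream impact analysis over the Unity Catalog
# lineage system table. The source table name is hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # predefined as `spark` in a Databricks notebook

downstream = spark.sql("""
    SELECT DISTINCT target_table_full_name, entity_type
    FROM system.access.table_lineage
    WHERE source_table_full_name = 'main.sales.orders'
      AND event_time >= current_date() - INTERVAL 90 DAYS
""")
downstream.show(truncate=False)
```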

Unlocking the Databricks Marketplace: A Hands-On Guide for Data Consumers and Providers

Curious about how to get real value from the Databricks Marketplace, whether you're consuming data or sharing it? This demo-heavy session answers the top 10 questions we hear from both data consumers and providers, with real examples you can put into practice right away. We’ll show consumers how to find the right product listing, whether that’s tables, files, AI models, solution accelerators, or Partner Connect integrations; try them out using sample notebooks; and access them with ease. You’ll also see how the Private Marketplace helps teams work more efficiently with a curated catalog of approved data. For providers: learn how to list your product in a way that stands out, use notebooks and documentation to help users get started, reach new audiences, and securely share data across your company or with trusted partners using the Private Marketplace. If you’ve ever asked, “How do I get started?” or “How do I make my data available internally or externally?”, this session has the answers, with demos to match.

What’s New in Databricks SQL: Latest Features and Live Demos

Databricks SQL has added significant features at a fast pace over the last year. This session will share the most impactful features and the customer use cases that inspired them. We will highlight the new SQL editor, SQL coding features, streaming tables and materialized views, BI integrations, cost management features, system tables and observability features, and more. We will also share AI-powered performance optimizations.
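As a hedged taste of two of the features mentioned, streaming tables and materialized views, the sketch below uses documented Databricks SQL DDL submitted via PySpark. The object names, the volume path, and the grouping column are hypothetical, and the statements assume an environment (such as a SQL warehouse or serverless compute) that supports these objects.

```python
# Sketch only: streaming table + materialized view DDL; names/paths hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Streaming table: incrementally ingests files as they land.
spark.sql("""
    CREATE OR REFRESH STREAMING TABLE raw_events
    AS SELECT * FROM STREAM read_files('/Volumes/main/default/landing/events/')
""")

# Materialized view: a precomputed aggregate that can refresh incrementally.
spark.sql("""
    CREATE OR REPLACE MATERIALIZED VIEW daily_event_counts AS
    SELECT event_date, count(*) AS events  -- event_date is a hypothetical column
    FROM raw_events
    GROUP BY ALL
""")
```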

Kill Bill-ing? Revenge is a Dish Best Served Optimized with GenAI

In an era where cloud costs can spiral out of control, Sportsbet achieved a remarkable 49% reduction in Total Cost of Ownership (TCO) through an innovative AI-powered solution called 'Kill Bill.' This presentation reveals how we transformed Databricks' consumption-based pricing model from a challenge into a strategic advantage through intelligent automation and optimization. You will: understand how to use GenAI to reduce Databricks TCO; see how leveraging generative AI within Databricks solutions enables automated analysis of cluster logs, resource consumption, configurations and codebases to produce Spark optimization suggestions; create AI agentic workflows by integrating Databricks' AI tools with its data engineering tools; and review a case study demonstrating how TCO was reduced in practice. Attendees will leave with a clear understanding of how to implement AI within Databricks solutions to address similar cost challenges in their environments.
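A rough sketch of that kind of loop, under stated assumptions and not Sportsbet's actual implementation: summarize recent consumption from the system.billing.usage table, then ask a Databricks-served LLM for tuning ideas. The endpoint name is an example, and the response indexing assumes the OpenAI-compatible chat format.

```python
# Hedged sketch of a GenAI cost-review loop; not the 'Kill Bill' codebase.
from pyspark.sql import SparkSession
from mlflow.deployments import get_deploy_client

spark = SparkSession.builder.getOrCreate()

# Top SKUs by DBU consumption over the last 30 days.
usage = spark.sql("""
    SELECT sku_name, sum(usage_quantity) AS dbus
    FROM system.billing.usage
    WHERE usage_date >= current_date() - INTERVAL 30 DAYS
    GROUP BY sku_name
    ORDER BY dbus DESC
    LIMIT 20
""").toPandas()

client = get_deploy_client("databricks")
resp = client.predict(
    endpoint="databricks-meta-llama-3-3-70b-instruct",  # example endpoint name
    inputs={"messages": [
        {"role": "system", "content": "You are a Databricks cost-optimization assistant."},
        {"role": "user", "content": "Suggest optimizations for this 30-day DBU usage:\n"
                                    + usage.to_string(index=False)},
    ]},
)
print(resp["choices"][0]["message"]["content"])  # assumes OpenAI-style chat response
```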

Solving Health AI’s Data Problem

AI in healthcare has a data problem. Fragmented data remains one of the biggest challenges, and bottlenecks the development and deployment of AI solutions across life sciences, payers, and providers. Legacy paper-driven workflows and fragmented technology perpetuate silos, making it difficult to create a comprehensive, real-time picture of patient health. Datavant is leveraging Databricks and AWS technology to solve this problem at scale. Through our partnership with Databricks, we are centralizing storage of clinical data from what is arguably the largest health data network so that we can transform it into structured, AI-ready data – and shave off 80 percent of the work of deploying a new AI use case. Learn how we are handling the complexity of this effort while preserving the integrity of source data. We’ll also share early use cases now available to our healthcare customers.

Sponsored by: Dagster Labs | The Age of AI is Changing Data Engineering for Good

The last major shift in data engineering came during the rise of the cloud, transforming how we store, manage, and analyze data. Today, we stand at the cusp of the next revolution: AI-driven data engineering. This shift promises not just faster pipelines, but a fundamental change in the way data systems are designed and maintained. AI will redefine who builds data infrastructure, automating routine tasks, enabling more teams to contribute to data platforms, and (if done right) freeing up engineers to focus on higher-value work. However, this transformation also brings heightened pressure around governance, risk, and data security, requiring new approaches to control and oversight. For those prepared, this is a moment of immense opportunity – a chance to embrace a future of smarter, faster, and more responsive data systems.

Sponsored by: DataHub | Beyond the Lakehouse: Supercharging Databricks with Contextual Intelligence

While Databricks powers your data lakehouse, DataHub delivers the critical context layer connecting your entire ecosystem. We'll demonstrate how DataHub extends Unity Catalog to provide comprehensive metadata intelligence across platforms. DataHub's real-time platform lets you cut AI model time-to-market with unified REST and GraphQL APIs that ensure models train on reliable and compliant data from across platforms, with complete lineage tracking; decrease data incidents by 60% using an event-driven architecture that instantly propagates changes across systems; and transform data discovery from days to minutes with AI-powered search and natural language interfaces. Leaders use DataHub to transform Databricks data into integrated insights that drive business value. See our demo of syncback technology, which detects sensitive data and enforces Databricks access controls automatically, plus our AI assistant that enhances LLMs with cross-platform metadata.

Sponsored by: e6data, Inc. | Hybrid Lakehouses with Unity Governance, Local Execution and Egress Control

Data residency laws and legal mandates are driving the need for lakehouses across public and private clouds. This sprawl threatens centralized governance and compliance, while impacting cost, performance, and analytics/AI functionality. This session shows how e6data extends Unity Catalog across hybrid environments for consistent policy enforcement and query execution—regardless of data location—with guarantees around network egress, entitlements, performance, scalability, and cost. Learn how e6data’s “zero-data movement” philosophy powers a cost- and latency-optimized, location-aware architecture. We’ll cover onboarding strategies for hybrid fleets that enforce data movement restrictions and stay close to the data for better performance and lower cost. Discover how a location-aware compute strategy enables hybrid lakehouses with four key value metrics: cross-platform functionality, governed access, low latency, and total cost of ownership.

Sponsored by: Retool | Retooling Intelligence: Build Scalable, Secure AI Agents for the Enterprise with Databricks + Retool

Enterprises need AI agents that are both powerful and production-ready while being scalable and secure. In this lightning session, you’ll learn how to leverage Retool’s platform and Databricks to design, deploy, and manage intelligent agents that automate complex workflows. We’ll cover best practices for integrating real-time Databricks data, enforcing governance, and ensuring scalability, all while avoiding common pitfalls. Whether you’re automating internal ops or customer-facing tasks, walk away with a blueprint for shipping AI agents that actually work in the real world.

Summit Live: AI/BI Genie & Dashboards - Talk With Your Data With GenAI-Powered Business Intelligence

AI/BI Genie lets anyone simply talk with their own data using natural language, fully secured through Unity Catalog to provide accurate answers within the context of your organization. AI/BI Dashboards go beyond traditional BI tools, enabling everyone to self-serve immediate, interactive visuals on their own secured data. Hear from a customer and Databricks experts on the latest developments.

Accelerating Growth in Capital Markets: Data-Driven Strategies for Success

Growth in capital markets thrives on innovation, agility and real-time insights. This session highlights how leading firms use Databricks’ Data Intelligence Platform to uncover opportunities, optimize trading strategies and deliver personalized client experiences. Learn how advanced analytics and AI help organizations expand their reach, improve decision-making and unlock new revenue streams. Industry leaders share how unified data platforms break down silos, deepen insights and drive success in a fast-changing market. Key takeaways: predictive analytics and machine learning strategies for growth; real-world examples of optimized trading and enhanced client engagement; and tools to innovate while ensuring operational efficiency. Discover how data intelligence empowers capital markets firms to thrive in today’s competitive landscape!

Agentic Architectures to Create Realistic Conversations: Using GenAI to Teach Empathy in Healthcare

Medical providers often receive less than 15 minutes of instruction on how to interact with patients during emotionally charged end-of-life conversations. Continuing education for clinicians is critical to hone these skills, but traditional approaches that require professional patients and instructors are difficult to scale. Here, we describe a custom chatbot that plays the roles of patient and coach to provide a scalable learning experience. A critical challenge was how to mitigate the persistently cheerful and helpful tone that standard pretraining produces in the Patient Persona AI. We accomplished this by implementing a multi-agent architecture based upon a graphical model of the conversation: system prompts reflecting the patient’s cognitive state are dynamically updated as the conversation progresses. Future extensions of the work will focus on additional custom model fine-tuning in the Mosaic AI platform to further improve the realism of the conversation.
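The description suggests a pattern like the following minimal sketch (our illustration, not the speakers' code): conversation states arranged in a graph, each carrying a system prompt for the patient's current cognitive state, advanced by the clinician's detected conversational move. All state names, prompts and transitions here are invented.

```python
# Sketch of a conversation-graph that swaps system prompts as the dialogue
# progresses; states and transitions are hypothetical examples.
from dataclasses import dataclass, field

@dataclass
class PatientState:
    name: str
    system_prompt: str
    transitions: dict = field(default_factory=dict)  # clinician move -> next state

STATES = {
    "denial": PatientState(
        name="denial",
        system_prompt=("You are a patient who has just received a terminal diagnosis. "
                       "You deflect and minimize. You are not cheerful or eager to help."),
        transitions={"empathic_reflection": "anger"},
    ),
    "anger": PatientState(
        name="anger",
        system_prompt=("You are the same patient, now voicing frustration and fear. "
                       "Respond tersely and do not reassure the clinician."),
        transitions={"open_question": "bargaining"},
    ),
    "bargaining": PatientState(
        name="bargaining",
        system_prompt="You are the same patient, searching for ways to regain control.",
    ),
}

def advance(current: str, detected_move: str) -> str:
    """Return the system prompt for the next state, given the clinician's move."""
    nxt = STATES[current].transitions.get(detected_move, current)
    return STATES[nxt].system_prompt
```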

AI-Powered Data Discovery and Curation With Unity Catalog

This session is repeated. In today’s data landscape, the challenge isn’t just storing or processing data — it’s enabling every user, from data stewards to analysts, to find and trust the right data, fast. This session explores how Databricks is reimagining data discovery with the new Discover Page Experience — an intuitive, curated interface showcasing key data and workspace assets. We’ll dive into AI-assisted governance and AI-powered discovery features like AI-generated metadata, AI-assisted lineage and natural language data exploration in Unity Catalog. Plus, see how new certifications and deprecations bring clarity to complex data environments. Whether you’re a data steward highlighting trusted assets or an analyst navigating data without deep schema knowledge, this session will show how Databricks is making data discovery seamless for everyone.

Cutting Costs, Not Performance: Optimizing Databricks at Scale

As Databricks transforms data processing, analytics and machine learning, managing platform costs has become crucial for organizations aiming to maximize value while staying within budget. While Databricks offers unmatched scalability and performance, inefficient usage can lead to unexpected cost overruns. This presentation will explore common challenges organizations face in controlling Databricks costs and provide actionable best practices for optimizing resource allocation, preventing over-provisioning and eliminating underutilization. Drawing from NTT DATA’s experience, I'll share how we reduced Databricks costs by up to 50% through strategies like choosing the right compute resources, leveraging managed tables and using Unity Catalog features, such as system tables, to monitor consumption. Join this session to gain practical insights and tools that will empower your team to optimize Databricks without overspending.
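As one example of the system-table consumption monitoring mentioned above, here is a sketch that attributes 30-day DBU usage to individual jobs. It assumes the documented system.billing.usage schema, including the usage_metadata.job_id field; verify against your account.

```python
# Sketch: job-level consumption monitoring via the system.billing.usage table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    SELECT usage_metadata.job_id AS job_id,
           sum(usage_quantity)   AS dbus_last_30d
    FROM system.billing.usage
    WHERE usage_date >= current_date() - INTERVAL 30 DAYS
      AND usage_metadata.job_id IS NOT NULL
    GROUP BY usage_metadata.job_id
    ORDER BY dbus_last_30d DESC
""").show(20, truncate=False)
```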

Databricks AI Factory Transforming Seven West Media

The implementation of the Databricks AI Factory enabled Seven West Media to transform its business by accelerating the launch of AI-driven use cases, fostering innovation and reducing time to market. By leveraging a unified data and AI platform, the company achieved better ROI through optimized workflows, improved operational efficiency and scalable machine learning models. The AI Factory empowered data teams to experiment faster, unlocking deeper audience insights that enhanced engagement and content personalization. This transformation positioned Seven West Media as a leader in AI-driven media, driving measurable business impact and future-proofing its data strategy.

Databricks Lakeflow: The Foundation of Data + AI Innovation for Your Industry

Every analytics, BI and AI project relies on high-quality data. This is why data engineering, the practice of building reliable data pipelines that ingest and transform data, is consequential to the success of these projects. In this session, we'll show how you can use Lakeflow to accelerate innovation in multiple parts of the organization. We'll review real-world examples of Databricks customers using Lakeflow in different industries such as automotive, healthcare and retail. We'll touch on how the foundational data engineering capabilities Lakeflow provides help power initiatives that improve customer experiences, make real-time decisions and drive business results.

In this session, we’ll introduce Zerobus Direct Write API, part of Lakeflow Connect, which enables you to push data directly to your lakehouse and simplify ingestion for IoT, clickstreams, telemetry, and more. We’ll start with an overview of the ingestion landscape to date. Then, we'll cover how you can “shift left” with Zerobus, embedding data ingestion into your operational systems to make analytics and AI a core component of the business, rather than an afterthought. The result is a significantly simpler architecture that scales your operations, using this new paradigm to skip unnecessary hops. We'll also highlight one of our early customers, Joby Aviation, and how they use Zerobus. Finally, we’ll provide a framework to help you understand when to use Zerobus versus other ingestion offerings, and we’ll wrap up with a live Q&A so that you can hit the ground running with your own use cases.

Evolving Agent Complexity: Building Multi-Agent Systems With Mosaic AI
Talk by Shanduojiao Jiang (Greenlight Financial Technology), Tim Mullins (Greenlight Financial Technology)

This session dives into building multi-agent systems on the Mosaic AI Platform, exploring the techniques, architectures and lessons learned from experiences building Greenlight’s real-world agent applications. This presentation is well suited for executives, product managers and engineers alike, breaking down AI Agents into easy-to-understand concepts, while presenting an architecture for building complex systems. We’ll examine the core components of generative AI Agents and different ways to assemble them into agents, including different prompting and reasoning techniques. We’ll cover how the Mosaic AI Platform has enabled our small team to build, deploy and monitor our AI Agents, touching on vector search, feature and model serving endpoints, and the evaluation framework. Finally, we’ll discuss the pros and cons of building a multi-agent system consisting of specialized agents vs. a single large agent for Greenlight’s AI Assistant, and the challenges we encountered.
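To make the specialized-agents-versus-single-large-agent trade-off concrete, here is an illustrative sketch (not Greenlight's code) of the multi-agent side: a small router step picks a specialist, each defined only by its own system prompt. The endpoint name and agent domains are hypothetical, and the response indexing assumes the OpenAI-compatible chat format.

```python
# Illustrative router-plus-specialists sketch; names are hypothetical.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
ENDPOINT = "databricks-meta-llama-3-3-70b-instruct"  # example endpoint name

AGENT_PROMPTS = {
    "budgeting": "You answer questions about budgets and spending habits.",
    "investing": "You answer questions about investing basics for families.",
}

def chat(system: str, user: str) -> str:
    resp = client.predict(endpoint=ENDPOINT, inputs={"messages": [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]})
    return resp["choices"][0]["message"]["content"]  # assumes OpenAI-style response

def answer(question: str) -> str:
    # Router step: ask the model which specialist should handle the question.
    choice = chat(f"Reply with exactly one word from {list(AGENT_PROMPTS)}.", question)
    agent = choice.strip().lower()
    if agent not in AGENT_PROMPTS:
        agent = "budgeting"  # fallback specialist
    # Specialist step: re-ask with the chosen agent's system prompt.
    return chat(AGENT_PROMPTS[agent], question)
```

A single large agent would collapse the two steps into one prompt; the routed design trades an extra model call for smaller, more testable prompts per domain.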

From Days to Minutes - AI Transforms Audit at KPMG

Imagine performing complex regulatory checks in minutes instead of days. We made this a reality using GenAI on the Databricks Data Intelligence Platform. Join us for a deep dive into our journey from POC to a production-ready AI audit tool. Discover how we automated thousands of legal requirement checks in annual reports with remarkable speed and accuracy. Learn our blueprint for: high-performance AI, building a scalable, >90% accurate AI system with an optimized RAG pipeline that auditors praise; and robust productionization, achieving secure, governed deployment using Unity Catalog, MLflow, LLM-based evaluation and MLOps best practices. This session provides actionable insights for deploying impactful, compliant GenAI in the enterprise.
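For the LLM-based evaluation piece, a minimal sketch using MLflow's static-dataset evaluation mode (assuming MLflow 2.x's mlflow.evaluate and its evaluation extras are installed; the example rows are invented):

```python
# Sketch: scoring pre-computed RAG answers against ground truth with
# mlflow.evaluate in static-dataset mode. Example rows are invented.
import mlflow
import pandas as pd

eval_df = pd.DataFrame({
    "ground_truth": ["Auditor fees are disclosed in the notes to the accounts."],
    "answers":      ["Yes, the notes to the accounts disclose auditor fees."],
})

results = mlflow.evaluate(
    data=eval_df,
    predictions="answers",      # column holding the model's answers
    targets="ground_truth",     # column holding the reference answers
    model_type="question-answering",
)
print(results.metrics)
```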

This session is repeated. This introductory workshop caters to data engineers seeking hands-on experience and data architects looking to deepen their knowledge. The workshop is structured to provide a solid understanding of the following data engineering and streaming concepts: an introduction to Lakeflow and the Data Intelligence Platform; getting started with Lakeflow Declarative Pipelines for declarative data pipelines in SQL using streaming tables and materialized views; mastering Databricks Workflows with advanced control flow and triggers; understanding serverless compute; data governance and lineage with Unity Catalog; and generative AI for data engineers with Genie and Databricks Assistant. We believe you can only become an expert if you work on real problems and gain hands-on experience, so we will equip you with your own lab environment and guide you through practical exercises like using GitHub, ingesting data from various sources, creating batch and streaming data pipelines, and more.
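As a taste of those pipeline exercises, a sketch in the DLT Python API (the landing path and column name are hypothetical; in SQL the same tables would be declared with CREATE OR REFRESH STREAMING TABLE and materialized views, as sketched earlier in this list):

```python
# Sketch of a declarative pipeline in Python; runs inside a Lakeflow/DLT
# pipeline where `spark` and the `dlt` module are provided by the runtime.
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Raw events ingested incrementally with Auto Loader")
def raw_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/default/landing/events/")  # hypothetical path
    )

@dlt.table(comment="Events with a valid event_type, ready for analytics")
def clean_events():
    return dlt.read_stream("raw_events").where(col("event_type").isNotNull())
```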