talk-data.com

Topic: Databricks

Tags: big_data, analytics, spark

509 tagged activities

Activity Trend: 515 peak/qtr (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Data + AI Summit 2025
Managing Data and AI Security Risks With DASF 2.0 — and a Customer Story

Since its first release, the Databricks Security team has led a broad working group that evolved the Databricks AI Security Framework (DASF) to version 2.0, collaborating closely with top cybersecurity researchers at industry organizations such as OWASP, Gartner, NIST, HITRUST, and the FAIR Institute, as well as several Fortune 100 companies, to address the evolving risks of enterprise AI systems and their associated controls. Join us to learn how the CLEVER GenAI pipeline, an AI-driven innovation in healthcare, processes over 1.5 million clinical notes daily to classify social determinants impacting veteran care while adhering to robust security measures such as NIST 800-53 controls and leveraging the Databricks AI Security Framework. We will discuss practical AI security guidelines that help data and AI teams understand how to deploy their AI applications securely. This session provides a security framework for security teams, AI practitioners, data engineers, and governance teams.

Real-Time Market Insights — Powering Optiver’s Live Trading Dashboard with Databricks Apps and Dash

In the fast-paced world of trading, real-time insights are critical for making informed decisions. This presentation explores how Optiver, a leading high-frequency trading firm, harnesses Databricks Apps to power its live trading dashboards. The technology enables traders to analyze market data, detect patterns and respond instantly. In this talk, we will showcase how our system leverages Databricks’ scalable infrastructure, such as Structured Streaming, to efficiently handle vast streams of financial data while ensuring low-latency performance. In addition, we will show how the integration of Databricks Apps with Dash has empowered traders to rapidly develop and deploy custom dashboards, minimizing dependency on developers. Attendees will gain insights into our architecture, data processing techniques and lessons learned in integrating Databricks Apps with Dash to drive rapid, data-driven trading decisions.
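As a rough sketch of the pattern described above (not Optiver’s actual code; the broker, topic, fields, and table names are hypothetical), a Structured Streaming job might aggregate a trade feed into a Delta table that the Dash dashboard polls:

```python
# Illustrative Structured Streaming sketch; broker, topic, fields, and
# table names are hypothetical, not Optiver's production setup.
from pyspark.sql import functions as F

trades = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "market-trades")              # placeholder topic
    .load()
)

# Extract symbol and price from the JSON payload.
parsed = trades.select(
    F.get_json_object(F.col("value").cast("string"), "$.symbol").alias("symbol"),
    F.get_json_object(F.col("value").cast("string"), "$.price")
        .cast("double").alias("price"),
    F.col("timestamp"),
)

# Windowed per-symbol aggregates; the watermark bounds state for low latency.
agg = (
    parsed
    .withWatermark("timestamp", "10 seconds")
    .groupBy(F.window("timestamp", "5 seconds"), "symbol")
    .agg(F.avg("price").alias("avg_price"), F.count("*").alias("trade_count"))
)

# Land results in a Delta table the Dash app reads via a SQL warehouse.
(agg.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/tmp/checkpoints/market_agg")
    .toTable("demo.live.market_agg"))
```

Because the aggregation is windowed and watermarked, the serving table stays compact and the dashboard’s polling queries remain cheap.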

ServiceNow ‘Walks the Talk’ With Databricks: Revolutionizing Go-To-Market With AI

At ServiceNow, we’re not just talking about AI innovation — we’re delivering it. By harnessing the power of Databricks, we’re reimagining Go-To-Market (GTM) strategies, seamlessly integrating AI at every stage of the deal journey — from identifying high-value leads to generating hyper-personalized outreach and pitch materials. In this session, learn how we’ve slashed data processing times by over 90%, reducing workflows from an entire day to just 30 minutes with Databricks. This unprecedented speed enables us to deploy AI-driven GTM initiatives faster, empowering our sellers with real-time insights that accelerate deal velocity and drive business growth. As Agentic AI becomes a game-changer in enterprise GTM, ServiceNow and Databricks are leading the charge — paving the way for a smarter, more efficient future in AI-powered sales.

Sponsored by: Deloitte | Advancing AI in Cybersecurity with Databricks & Deloitte: Data Management & Analytics

Deloitte is observing a growing trend among cybersecurity organizations to develop big data management and analytics solutions beyond traditional Security Information and Event Management (SIEM) systems. Leveraging Databricks to extend these SIEM capabilities, Deloitte can help clients lower the cost of cyber data management while enabling scalable, cloud-native architectures. Deloitte helps clients design and implement cybersecurity data meshes, using Databricks as a foundational data lake platform to unify and govern security data at scale. Additionally, Deloitte extends clients’ cybersecurity capabilities by integrating advanced AI and machine learning solutions on Databricks, driving more proactive and automated cybersecurity solutions. Attendees will gain insight into how Deloitte is utilizing Databricks to manage enterprise cyber risks and deliver performant and innovative analytics and AI insights that traditional security tools and data platforms aren’t able to deliver.

SQL-First ETL: Building Easy, Efficient Data Pipelines With Lakeflow Declarative Pipelines

This session explores how SQL-based ETL can accelerate development, simplify maintenance and make data transformation more accessible to both engineers and analysts. We'll walk through how Databricks Lakeflow Declarative Pipelines and Databricks SQL warehouses support building production-grade pipelines using familiar SQL constructs. Topics include: using streaming tables for real-time ingestion and processing; leveraging materialized views to deliver fast, pre-computed datasets; and integrating with tools like dbt to manage batch and streaming workflows at scale. By the end of the session, you’ll understand how SQL-first approaches can streamline ETL development and support both operational and analytical use cases.
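As a hedged illustration of these constructs (the session itself is SQL-first; this sketch uses the equivalent Lakeflow Declarative Pipelines Python API, with a hypothetical landing path and column names), a streaming table feeding a materialized view might look like:

```python
# Hedged sketch using the Lakeflow Declarative Pipelines Python API;
# the landing path and column names are hypothetical.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Streaming table: incremental ingestion via Auto Loader")
def raw_orders():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/demo/landing/orders")  # placeholder landing path
    )

@dlt.table(comment="Materialized view: pre-computed daily revenue")
def daily_revenue():
    return (
        dlt.read("raw_orders")
        .groupBy(F.to_date("order_ts").alias("order_date"))
        .agg(F.sum("amount").alias("revenue"))
    )
```

The SQL counterparts are roughly CREATE OR REFRESH STREAMING TABLE and CREATE OR REFRESH MATERIALIZED VIEW; either way, the declarative engine handles orchestration and incremental recomputation.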

Unifying Data Delivery: Using Databricks as Your Enterprise Serving Layer

This session will take you on our journey of integrating Databricks as the core serving layer in a large enterprise, demonstrating how you can build a unified data platform that meets diverse business needs. We will walk through the steps for constructing a central serving layer by leveraging Databricks’ SQL Warehouse to efficiently deliver data to analytics tools and downstream applications. To tackle low latency requirements, we’ll show you how to incorporate an interim scalable relational database layer that delivers sub-second performance for hot data scenarios. Additionally, we’ll explore how Delta Sharing enables secure and cost-effective data distribution beyond your organization, eliminating silos and unnecessary duplication for a truly end-to-end centralized solution. This session is perfect for data architects, engineers and decision-makers looking to unlock the full potential of Databricks as a centralized serving hub.
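As a minimal sketch of the downstream-consumption side of such a serving layer (assuming the open-source databricks-sql-connector package; the hostname, HTTP path, token, and table are placeholders):

```python
# Hedged sketch: a downstream app querying the central serving layer through
# a Databricks SQL warehouse. Connection details and table are placeholders.
from databricks import sql  # pip install databricks-sql-connector

with sql.connect(
    server_hostname="<workspace-host>",          # placeholder
    http_path="/sql/1.0/warehouses/<id>",        # placeholder warehouse path
    access_token="<personal-access-token>",      # placeholder credential
) as conn:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT customer_id, balance "
            "FROM serving.gold.accounts WHERE region = %(region)s",
            {"region": "EMEA"},
        )
        for row in cur.fetchall():
            print(row.customer_id, row.balance)
```

For the sub-second hot-data tier the abstract mentions, the same application code would point at the interim relational store instead, while the SQL warehouse remains the system of record.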

Red Stapler is a streaming-native system on Databricks that merges file-based ingestion and real-time user edits into a single Lakeflow Declarative Pipeline for near real-time feedback. Protobuf definitions, managed in the Buf Schema Registry (BSR), govern schema and data-quality rules, ensuring backward compatibility. All records — valid or not — are stored in an SCD Type 2 table, capturing every version for full history and immediate quarantine views of invalid data. This unified approach boosts data governance, simplifies auditing and streamlines error fixes. Running on Lakeflow Declarative Pipelines Serverless and the Kafka-compatible Bufstream keeps costs low by scaling down to zero when idle. Red Stapler’s configuration-driven Protobuf logic adapts easily to evolving survey definitions without risking production. The result is consistent validation, quick updates and a complete audit trail — all critical for trustworthy, flexible data pipelines.
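A generic sketch of this keep-everything pattern (not Red Stapler’s actual code; it uses the Lakeflow Declarative Pipelines Python API with hypothetical source, key, and rule names):

```python
# Generic sketch of the keep-everything pattern (not Red Stapler's code).
# Source table, keys, and quality rules are hypothetical.
import dlt
from pyspark.sql import functions as F

RULES = {"valid_email": "email LIKE '%@%'", "positive_score": "score >= 0"}

@dlt.table(comment="Incoming records tagged valid/invalid instead of dropped")
def responses_tagged():
    checks = " AND ".join(f"({rule})" for rule in RULES.values())
    return (
        spark.readStream.table("demo.raw.responses")  # placeholder source
        .withColumn("is_valid", F.expr(checks))
    )

# SCD Type 2 history: every version of every record, valid or not.
dlt.create_streaming_table("responses_history")

dlt.apply_changes(
    target="responses_history",
    source="responses_tagged",
    keys=["id"],
    sequence_by=F.col("event_ts"),
    stored_as_scd_type=2,
)

@dlt.view(comment="Immediate quarantine view of invalid records")
def quarantine():
    return dlt.read("responses_history").where("is_valid = false")
```

Tagging instead of dropping is the key design choice: invalid rows still flow into the SCD Type 2 history, so corrections can be audited end to end.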

Unity Catalog Upgrades Made Easy: A Step-by-Step Guide for Databricks Labs UCX

The Databricks Labs project UCX aims to optimize the Unity Catalog (UC) upgrade process, ensuring a seamless transition for businesses. This session will delve into various aspects of the UCX project, including the installation and configuration of UCX, the use of the UCX Assessment Dashboard to reduce upgrade risks and prepare effectively for a UC upgrade, and the automation of key components such as group, table and code migration. Attendees will gain comprehensive insights into leveraging UCX and Lakehouse Federation for a streamlined and efficient upgrade process. This session is aimed at customers new to UCX as well as veterans.

Using Catalogs for a Well-Governed and Efficient Data Ecosystem

The ability to enforce data management controls at scale and reduce the effort required to manage data pipelines is critical to operating efficiently. Capital One has scaled its data management capabilities and invested in platforms to help address this need. In the past couple of years, the role of “the catalog” in a data platform architecture has transitioned from just providing SQL to providing a full suite of capabilities that can help solve this problem at scale. This talk will give insight into how Capital One is thinking about leveraging Databricks Unity Catalog to help tackle these challenges.

Break the Ice: Your Guide to the AccuWeather Data Suite in Databricks

AccuWeather harnesses cutting-edge technology, industry-leading weather data, and expert insights to empower businesses and individuals worldwide. In this session, we will explore how AccuWeather’s comprehensive datasets—ranging from historical and current conditions to forecasts and climate normals—can drive real-world impact across diverse industries. By showcasing scenario-based examples, we’ll demonstrate how AccuWeather’s hourly and daily weather data can address the unique needs of your organization, whether for operational planning, risk management, or strategic decision-making. This session is ideal for both newcomers to AccuWeather’s offerings and experienced users seeking to unlock the full potential of our weather data to optimize performance, improve efficiency, and boost overall success.

Dealing With Sensitive Data on Databricks at Natura

Ensuring the protection of sensitive data within a Databricks environment requires robust mechanisms to prevent unauthorized access, even by high-privileged roles such as Databricks administrators: Account Console Admins, Workspace Admins, and Unity Catalog Admins. To address this, a comprehensive data governance and access control strategy can be implemented, leveraging encryption, secret scopes, column masks, fine-grained access controls on tables, and auditing capabilities.
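As one concrete, hedged example of the controls listed above, a Unity Catalog column mask can redact a sensitive column for any querying user outside an approved group (the function, table, column, and group names below are illustrative):

```python
# Hedged sketch: a Unity Catalog column mask. Function, table, column, and
# group names are illustrative.
spark.sql("""
CREATE OR REPLACE FUNCTION main.security.mask_document(document STRING)
RETURNS STRING
RETURN CASE
  WHEN is_account_group_member('pii_readers') THEN document
  ELSE '***REDACTED***'
END
""")

spark.sql("""
ALTER TABLE main.sales.customers
  ALTER COLUMN document SET MASK main.security.mask_document
""")
```

Masking alone does not stop a privileged user from altering the policy, which is why the strategy above pairs it with encryption and auditing.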

Gaining Insight From Image Data in Databricks Using Multi-Modal Foundation Model API

Unlock the hidden potential in your image data without specialized computer vision expertise! This session explores how to leverage Databricks' multi-modal Foundation Model APIs to analyze, classify and extract insights from visual content. Learn how Databricks provides a unified API to understand images using powerful foundation models within your data workflows. Key takeaways: implementing efficient workflows for image data processing within your Databricks lakehouse; understanding multi-modal foundation models for image understanding; integrating image analysis with other data types for business insights; using OpenAI-compatible APIs to query multi-modal models; and building end-to-end pipelines from image ingestion to model deployment. Whether analyzing product images, processing visual documents or building content moderation systems, you'll discover how to extract valuable insights from your image data within the Databricks ecosystem.
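A minimal sketch of the OpenAI-compatible pattern from the takeaways above (the workspace host, token, and serving endpoint name are placeholders):

```python
# Minimal sketch of the OpenAI-compatible pattern; workspace host, token,
# and serving endpoint name are placeholders.
import base64
from openai import OpenAI

client = OpenAI(
    api_key="<databricks-token>",                           # placeholder
    base_url="https://<workspace-host>/serving-endpoints",  # placeholder
)

with open("product.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="<multi-modal-endpoint>",  # placeholder endpoint name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Classify the product shown in this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```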

Improving User Experience and Efficiency Using DBSQL

To scale Databricks SQL to 2,000 users efficiently and cost-effectively, we adopted serverless, ensuring dynamic scalability and resource optimization. During peak times, resources scale up automatically; during low demand, they scale down, preventing waste. Additionally, we implemented a strong content governance model. We created continuous monitoring to assess query and dashboard performance, notifying users about adjustments and ensuring only relevant content remains active. If a query exceeds time or impact limits, access is reviewed and, if necessary, deactivated. This approach brought greater efficiency, cost reduction and an improved user experience, keeping the platform well-organized and high-performing.
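One hedged way to implement this kind of monitoring is a scheduled job that scans the built-in query history system table for statements exceeding a runtime budget (the 60-second threshold is illustrative; column names follow the documented system.query.history schema):

```python
# Hedged sketch: flag statements that exceeded a runtime budget last week.
# The 60-second threshold is illustrative; adjust to your governance policy.
slow_queries = spark.sql("""
    SELECT executed_by,
           statement_text,
           total_duration_ms / 1000 AS duration_seconds
    FROM system.query.history
    WHERE start_time >= current_date() - INTERVAL 7 DAYS
      AND total_duration_ms > 60 * 1000
    ORDER BY total_duration_ms DESC
    LIMIT 50
""")
slow_queries.show(truncate=False)
```

Offenders surfaced this way can then feed the review-and-deactivate workflow described above.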

Powering Personalization at Scale with Data: How T-Mobile and Deep Sync Help Brands Connect with Consumers

Discover how T-Mobile and Deep Sync are redefining personalized marketing through the power of Databricks. Deep Sync, a leader in deterministic identity solutions, has brought its identity spine, which covers over 97% of U.S. households with the most current and accurate attribute data available, to the Databricks Lakehouse. T-Mobile is bringing to market for the first time a new data services business that introduces privacy-compliant, consent-based consumer data. Together, T-Mobile and Deep Sync are transforming how brands engage with consumers—enabling bespoke, hyper-personalized workflows, identity-driven insights, and closed-loop measurement through Databricks’ Multi-Party Clean Rooms. Join this session to learn how data and identity are converging to solve today’s modern marketing challenges so consumers can rediscover what it feels like to be seen, not targeted.

Sponsored by: AWS | Deploying a GenAI Agent using Databricks Mosaic AI, Anthropic, LangGraph, and Amazon Bedrock

In this session, you’ll see how to build and deploy a GenAI agent and Model Context Protocol (MCP) integration with Databricks, Anthropic, the Mosaic AI Gateway, and Amazon Bedrock. You will learn the architecture and best practices of using Databricks Mosaic AI, Anthropic’s Claude Sonnet 3.7 first-party frontier model, and LangGraph for custom workflow orchestration in the Databricks Data Intelligence Platform. You’ll also see how to use Databricks Mosaic AI to provide agent evaluation and monitoring. In addition, you will see how an inline agent uses MCP to provide tools and other resources, using Amazon Nova models with the Amazon Bedrock inline agent for deep research. This approach gives you the flexibility of LangGraph, the powerful managed agents offered by Amazon Bedrock, and Databricks Mosaic AI’s operational support for evaluation and monitoring.
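A minimal sketch of the LangGraph portion (assuming the databricks-langchain package; the serving endpoint name is a placeholder and the tool is a stub standing in for a real MCP or REST integration):

```python
# Hedged sketch of the LangGraph piece; the serving endpoint name is a
# placeholder and the tool is a stub for a real MCP/REST integration.
from databricks_langchain import ChatDatabricks  # pip install databricks-langchain
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

@tool
def lookup_account(account_id: str) -> str:
    """Return summary details for an account (stubbed data)."""
    return f"Account {account_id}: status=active, tier=gold"

llm = ChatDatabricks(endpoint="<claude-serving-endpoint>")  # placeholder
agent = create_react_agent(llm, tools=[lookup_account])

result = agent.invoke(
    {"messages": [("user", "What is the status of account 42?")]}
)
print(result["messages"][-1].content)
```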

Energy and Utilities Industry Forum | Sponsored by: Deloitte and AWS

Join us for a compelling forum exploring how energy leaders are harnessing data and AI to build a more sustainable future. As the industry navigates the complex balance between rising global energy demands and ambitious decarbonization goals, innovative companies are discovering that intelligence-driven operations are the key to success. From optimizing renewable energy integration to revolutionizing grid management, learn how energy pioneers are using AI to transform traditional operations while accelerating the path to net zero. This session reveals how Databricks is empowering energy companies to turn their sustainability aspirations into reality, proving that the future of energy is both clean and intelligent.

Tech Industry Forum: Tip of the Spear With Data and AI | Sponsored by: Aimpoint Digital and AWS

Join us for the Tech Industry Forum, formerly known as the Tech Innovators Summit, now part of Databricks Industry Experience. This session will feature keynotes, panels and expert talks led by top customer speakers and Databricks experts. Tech companies are pushing the boundaries of data and AI to accelerate innovation, optimize operations and build collaborative ecosystems. In this session, we’ll explore how unified data platforms empower organizations to scale their impact, democratize analytics across teams and foster openness for building tomorrow’s products. Key topics include: scaling data platforms to support real-time analytics and AI-driven decision-making; democratizing access to data while maintaining robust governance and security; and harnessing openness and portability to enable seamless collaboration with partners and customers. After the session, connect with your peers during the exclusive Industry Forum Happy Hour. Reserve your seat today!

This course provides a comprehensive review of DevOps principles and their application to Databricks projects. It begins with an overview of core DevOps, DataOps, continuous integration (CI), continuous deployment (CD), and testing, and explores how these principles can be applied to data engineering pipelines. The course then focuses on continuous deployment within the CI/CD process, examining tools like the Databricks REST API, SDK, and CLI for project deployment.

You will learn about Databricks Asset Bundles (DABs) and how they fit into the CI/CD process. You’ll dive into their key components, folder structure, and how they streamline deployment across various target environments in Databricks. You will also learn how to add variables, modify, validate, deploy, and execute Databricks Asset Bundles for multiple environments with different configurations using the Databricks CLI.

Finally, the course introduces Visual Studio Code as an Integrated Development Environment (IDE) for building, testing, and deploying Databricks Asset Bundles locally, optimizing your development process. The course concludes with an introduction to automating deployment pipelines using GitHub Actions to enhance the CI/CD workflow with Databricks Asset Bundles. By the end of this course, you will be equipped to automate Databricks project deployments with Databricks Asset Bundles, improving efficiency through DevOps practices.

Prerequisites: Strong knowledge of the Databricks platform, including experience with Databricks Workspaces, Apache Spark, Delta Lake, the Medallion Architecture, Unity Catalog, Delta Live Tables, and Workflows. In particular, knowledge of leveraging Expectations with Lakeflow Declarative Pipelines.

Labs: Yes

Certification Path: Databricks Certified Data Engineer Professional
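For orientation, here is a hypothetical, minimal databricks.yml showing the bundle components the course covers: the bundle name, a variable, two targets, and a job resource (all hosts, names, and paths are placeholders):

```yaml
# Hypothetical minimal bundle; all names, hosts, and paths are placeholders.
bundle:
  name: demo_project

variables:
  catalog:
    description: Target catalog for the pipeline
    default: dev_catalog

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://<your-workspace>.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://<your-workspace>.cloud.databricks.com

resources:
  jobs:
    nightly_etl:
      name: nightly-etl-${bundle.target}
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ./src/etl_notebook.py
```

A typical inner loop is then databricks bundle validate, databricks bundle deploy -t dev, and databricks bundle run nightly_etl -t dev, with GitHub Actions invoking the same commands against the prod target.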

De-Risking Investment Decisions: QCG's Smarter Deal Evaluation Process Leveraging Databricks

Quantum Capital Group (QCG) screens hundreds of deals across the global Sustainable Energy Ecosystem, requiring deep technical due diligence. With over 1.5 billion records sourced from public, premium and proprietary datasets, their challenge was how to efficiently curate, analyze and share this data to drive smarter investment decisions. QCG partnered with Databricks & Tiger Analytics to modernize its data landscape. Using Delta tables, Spark SQL, and Unity Catalog, the team built a golden dataset that powers proprietary evaluation models and automates complex workflows. Data is now seamlessly curated, enriched and distributed — both internally and to external stakeholders — in a secure, governed and scalable way. This session explores how QCG’s investment in data intelligence has turned an overwhelming volume of information into a competitive advantage, transforming deal evaluation into a faster, more strategic process.