talk-data.com

Topic: Databricks

Tags: big_data, analytics, spark · 1286 tagged activities

Activity Trend: 2020-Q1 to 2026-Q1, peaking at 515 activities per quarter

Activities

1286 activities · Newest first

Using Catalogs for a Well-Governed and Efficient Data Ecosystem

The ability to enforce data management controls at scale and reduce the effort required to manage data pipelines is critical to operating efficiently. Capital One has scaled its data management capabilities and invested in platforms to help address this need. In the past couple of years, the role of “the catalog” in a data platform architecture has expanded from simply providing SQL access to offering a full suite of capabilities that can help solve this problem at scale. This talk will give insight into how Capital One is thinking about leveraging Databricks Unity Catalog to help tackle these challenges.

Break the Ice: Your Guide to the AccuWeather Data Suite in Databricks

AccuWeather harnesses cutting-edge technology, industry-leading weather data, and expert insights to empower businesses and individuals worldwide. In this session, we will explore how AccuWeather’s comprehensive datasets—ranging from historical and current conditions to forecasts and climate normals—can drive real-world impact across diverse industries. By showcasing scenario-based examples, we’ll demonstrate how AccuWeather’s hourly and daily weather data can address the unique needs of your organization, whether for operational planning, risk management, or strategic decision-making. This session is ideal for both newcomers to AccuWeather’s offerings and experienced users seeking to unlock the full potential of our weather data to optimize performance, improve efficiency, and boost overall success.

Dealing With Sensitive Data on Databricks at Natura

Ensuring the protection of sensitive data within a Databricks environment requires robust mechanisms to prevent unauthorized access, even by high-privileged roles such as Databricks Administrators: Account Console Admins, Workspace Admins, and Unity Catalog Admins. To address this, a comprehensive data governance and access control strategy can be implemented, leveraging encryption, secret scopes, column masks, fine-grained access controls on tables, and auditing capabilities.
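
As an illustration of the column-masking technique mentioned above, here is a minimal sketch using Unity Catalog column masks via Spark SQL. The catalog, schema, table, and group names are placeholders, and `spark` is the session object available in a Databricks notebook.

```python
# Minimal sketch (assumed names) of a Unity Catalog column mask:
# a SQL UDF decides per query whether the caller may see the raw value.
spark.sql("""
    CREATE OR REPLACE FUNCTION main.governance.mask_ssn(ssn STRING)
    RETURNS STRING
    RETURN CASE
        WHEN is_account_group_member('pii_readers') THEN ssn
        ELSE '***-**-****'
    END
""")

# Attach the mask to a column; callers outside pii_readers now see the
# redacted value regardless of their table-level privileges.
spark.sql("""
    ALTER TABLE main.sales.customers
    ALTER COLUMN ssn SET MASK main.governance.mask_ssn
""")
```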

Gaining Insight From Image Data in Databricks Using Multi-Modal Foundation Model API

Unlock the hidden potential in your image data without specialized computer vision expertise! This session explores how to leverage Databricks' multi-modal Foundation Model APIs to analyze, classify and extract insights from visual content. Learn how Databricks provides a unified API to understand images using powerful foundation models within your data workflows. Key takeaways:

- Implementing efficient workflows for image data processing within your Databricks lakehouse
- Understanding multi-modal foundation models for image understanding
- Integrating image analysis with other data types for business insights
- Using OpenAI-compatible APIs to query multi-modal models
- Building end-to-end pipelines from image ingestion to model deployment

Whether analyzing product images, processing visual documents or building content moderation systems, you'll discover how to extract valuable insights from your image data within the Databricks ecosystem.
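
To give a flavor of the OpenAI-compatible API named in the takeaways, here is a hypothetical sketch of sending an image to a multi-modal serving endpoint. The workspace host, token, and endpoint name are placeholders; which multi-modal models are available depends on your workspace.

```python
# Hypothetical sketch: querying a multi-modal model through a Databricks
# workspace's OpenAI-compatible serving endpoints (all names are placeholders).
import base64
from openai import OpenAI

client = OpenAI(
    api_key="<databricks-personal-access-token>",
    base_url="https://<workspace-host>/serving-endpoints",
)

# Images are passed inline as base64-encoded data URLs.
with open("product.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="<multi-modal-endpoint-name>",  # placeholder endpoint name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe any product defects in this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)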

Improving User Experience and Efficiency Using DBSQL

To scale Databricks SQL to 2,000 users efficiently and cost-effectively, we adopted serverless, ensuring dynamic scalability and resource optimization. During peak times, resources scale up automatically; during low demand, they scale down, preventing waste. Additionally, we implemented a strong content governance model. We created continuous monitoring to assess query and dashboard performance, notifying users about adjustments and ensuring only relevant content remains active. If a query exceeds time or impact limits, access is reviewed and, if necessary, deactivated. This approach brought greater efficiency, cost reduction and an improved user experience, keeping the platform well-organized and high-performing.
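
A monitoring loop like the one described can be driven from query history. As a rough sketch, assuming access to the system.query.history system table (exact column names may differ by release), flagging long-running queries for review might look like this in a notebook:

```python
# Rough sketch (assumed schema): surface queries that ran longer than
# five minutes over the last week, as input to a governance review.
long_queries = spark.sql("""
    SELECT executed_by,
           statement_id,
           total_duration_ms / 1000 AS duration_seconds
    FROM system.query.history
    WHERE end_time >= current_date() - INTERVAL 7 DAYS
      AND total_duration_ms > 300000
    ORDER BY total_duration_ms DESC
""")
long_queries.show(truncate=False)
```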

Powering Personalization at Scale with Data: How T-Mobile and Deep Sync Help Brands Connect with Consumers

Discover how T-Mobile and Deep Sync are redefining personalized marketing through the power of Databricks. Deep Sync, a leader in deterministic identity solutions, has brought its identity spine, which covers over 97% of U.S. households with the most current and accurate attribute data available, to the Databricks Lakehouse. T-Mobile is bringing to market for the first time a new data services business that introduces privacy-compliant, consent-based consumer data. Together, T-Mobile and Deep Sync are transforming how brands engage with consumers—enabling bespoke, hyper-personalized workflows, identity-driven insights, and closed-loop measurement through Databricks’ Multi-Party Cleanrooms. Join this session to learn how data and identity are converging to solve today’s modern marketing challenges so consumers can rediscover what it feels like to be seen, not targeted.

Sponsored by: AWS | Deploying a GenAI Agent using Databricks Mosaic AI, Anthropic, LangGraph, and Amazon Bedrock

In this session, you’ll see how to build and deploy a GenAI agent that uses the Model Context Protocol (MCP) with Databricks, Anthropic, Mosaic External AI Gateway, and Amazon Bedrock. You will learn the architecture and best practices for using Databricks Mosaic AI, Anthropic’s Claude 3.7 Sonnet first-party frontier model, and LangGraph for custom workflow orchestration on the Databricks Data Intelligence Platform. You’ll also see how to use Databricks Mosaic AI for agent evaluation and monitoring. In addition, you will see how an Amazon Bedrock inline agent can use MCP to provide tools and other resources, using Amazon Nova models for deep research. This approach gives you the flexibility of LangGraph, the powerful managed agents offered by Amazon Bedrock, and Databricks Mosaic AI’s operational support for evaluation and monitoring.
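
To make the LangGraph piece concrete, here is a minimal, hypothetical sketch of the orchestration pattern; the state shape and node logic are invented for illustration, and a real agent node would call a Model Serving endpoint (for example, a Claude model behind the gateway) rather than return a stub.

```python
# Hypothetical minimal LangGraph workflow: one model node wired
# between START and END (all names here are illustrative).
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class AgentState(TypedDict):
    question: str
    answer: str

def call_model(state: AgentState) -> dict:
    # In a real agent this would call a Databricks Model Serving endpoint.
    return {"answer": f"stub answer to: {state['question']}"}

graph = StateGraph(AgentState)
graph.add_node("model", call_model)
graph.add_edge(START, "model")
graph.add_edge("model", END)
app = graph.compile()

print(app.invoke({"question": "What is MCP?"}))
```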

Energy and Utilities Industry Forum | Sponsored by: Deloitte and AWS

Join us for a compelling forum exploring how energy leaders are harnessing data and AI to build a more sustainable future. As the industry navigates the complex balance between rising global energy demands and ambitious decarbonization goals, innovative companies are discovering that intelligence-driven operations are the key to success. From optimizing renewable energy integration to revolutionizing grid management, learn how energy pioneers are using AI to transform traditional operations while accelerating the path to net zero. This session reveals how Databricks is empowering energy companies to turn their sustainability aspirations into reality, proving that the future of energy is both clean and intelligent.

Tech Industry Forum: Tip of the Spear With Data and AI | Sponsored by: Aimpoint Digital and AWS

Join us for the Tech Industry Forum, formerly known as the Tech Innovators Summit, now part of Databricks Industry Experience. This session will feature keynotes, panels and expert talks led by top customer speakers and Databricks experts. Tech companies are pushing the boundaries of data and AI to accelerate innovation, optimize operations and build collaborative ecosystems. In this session, we’ll explore how unified data platforms empower organizations to scale their impact, democratize analytics across teams and foster openness for building tomorrow’s products. Key topics include:

- Scaling data platforms to support real-time analytics and AI-driven decision-making
- Democratizing access to data while maintaining robust governance and security
- Harnessing openness and portability to enable seamless collaboration with partners and customers

After the session, connect with your peers during the exclusive Industry Forum Happy Hour. Reserve your seat today!

This course provides a comprehensive review of DevOps principles and their application to Databricks projects. It begins with an overview of core DevOps, DataOps, continuous integration (CI), continuous deployment (CD), and testing, and explores how these principles can be applied to data engineering pipelines. The course then focuses on continuous deployment within the CI/CD process, examining tools like the Databricks REST API, SDK, and CLI for project deployment.

You will learn about Databricks Asset Bundles (DABs) and how they fit into the CI/CD process. You’ll dive into their key components, folder structure, and how they streamline deployment across various target environments in Databricks. You will also learn how to add variables and how to modify, validate, deploy, and execute Databricks Asset Bundles for multiple environments with different configurations using the Databricks CLI.

Finally, the course introduces Visual Studio Code as an Integrated Development Environment (IDE) for building, testing, and deploying Databricks Asset Bundles locally, optimizing your development process. The course concludes with an introduction to automating deployment pipelines using GitHub Actions to enhance the CI/CD workflow with Databricks Asset Bundles. By the end of this course, you will be equipped to automate Databricks project deployments with Databricks Asset Bundles, improving efficiency through DevOps practices.

Prerequisites: Strong knowledge of the Databricks platform, including experience with Databricks Workspaces, Apache Spark, Delta Lake, the Medallion Architecture, Unity Catalog, Delta Live Tables, and Workflows. In particular, knowledge of leveraging Expectations with Lakeflow Declarative Pipelines. Labs: Yes. Certification Path: Databricks Certified Data Engineer Professional
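
For readers new to bundles, here is a hypothetical minimal databricks.yml illustrating the pieces the course covers (bundle metadata, variables, per-target overrides, and a job resource); every name, host, and path below is a placeholder.

```yaml
# Hypothetical minimal Databricks Asset Bundle configuration (placeholders only).
bundle:
  name: my_etl_project

variables:
  catalog:
    description: Target Unity Catalog name
    default: dev_catalog

targets:
  dev:
    mode: development
    workspace:
      host: https://<dev-workspace-host>
  prod:
    mode: production
    workspace:
      host: https://<prod-workspace-host>
    variables:
      catalog: prod_catalog   # per-target variable override

resources:
  jobs:
    nightly_etl:
      name: nightly-etl
      tasks:
        - task_key: run_pipeline
          notebook_task:
            notebook_path: ./src/etl_notebook.py
```

A bundle like this is typically checked with `databricks bundle validate`, then deployed to a chosen target with `databricks bundle deploy -t prod` and executed with `databricks bundle run`.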

De-Risking Investment Decisions: QCG's Smarter Deal Evaluation Process Leveraging Databricks

Quantum Capital Group (QCG) screens hundreds of deals across the global Sustainable Energy Ecosystem, requiring deep technical due diligence. With over 1.5 billion records sourced from public, premium and proprietary datasets, their challenge was how to efficiently curate, analyze and share this data to drive smarter investment decisions. QCG partnered with Databricks & Tiger Analytics to modernize its data landscape. Using Delta tables, Spark SQL, and Unity Catalog, the team built a golden dataset that powers proprietary evaluation models and automates complex workflows. Data is now seamlessly curated, enriched and distributed — both internally and to external stakeholders — in a secure, governed and scalable way. This session explores how QCG’s investment in data intelligence has turned an overwhelming volume of information into a competitive advantage, transforming deal evaluation into a faster, more strategic process.

This course introduces learners to deploying, operationalizing, and monitoring generative artificial intelligence (AI) applications. First, learners will develop knowledge and skills in deploying generative AI applications using tools like Model Serving. Next, the course will discuss operationalizing generative AI applications following modern LLMOps best practices and recommended architectures. Finally, learners will be introduced to monitoring generative AI applications and their components using Lakehouse Monitoring.

Prerequisites: Familiarity with prompt engineering and retrieval-augmented generation (RAG) techniques, including data preparation, embeddings, vectors, and vector databases; and foundational knowledge of Databricks Data Intelligence Platform tools for evaluation and governance (particularly Unity Catalog). Labs: Yes. Certification Path: Databricks Certified Generative AI Engineer Associate
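
As a small taste of the deployment topic, here is a minimal sketch of querying an already-deployed Model Serving endpoint with the MLflow deployments client; the endpoint name and payload shape are assumptions that depend on how the application was logged.

```python
# Minimal sketch (assumed endpoint name and payload): query a deployed
# generative AI application through Databricks Model Serving.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
response = client.predict(
    endpoint="my-rag-app",  # placeholder endpoint name
    inputs={"messages": [{"role": "user", "content": "Summarize our returns policy."}]},
)
print(response)
```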

This course will guide participants through a comprehensive exploration of machine learning model operations, focusing on MLOps and model lifecycle management. The initial segment covers essential MLOps components and best practices, providing participants with a strong foundation for effectively operationalizing machine learning models. In the latter part of the course, we will delve into the basics of the model lifecycle, demonstrating how to navigate it seamlessly using the Model Registry in conjunction with Unity Catalog for efficient model management. By the course’s conclusion, participants will have gained practical insights and a well-rounded understanding of MLOps principles, equipped with the skills needed to navigate the intricate landscape of machine learning model operations.

Prerequisites: Familiarity with the Databricks workspace and notebooks; familiarity with Delta Lake and the Lakehouse; intermediate-level knowledge of Python; and an understanding of basic MLOps concepts and practices, including the supporting infrastructure and the importance of monitoring MLOps solutions. Labs: Yes. Certification Path: Databricks Certified Machine Learning Associate
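
To illustrate the Model Registry and Unity Catalog pairing the course ends with, here is a minimal sketch; the three-level model name is a placeholder for a catalog and schema you control.

```python
# Minimal sketch: log a model with MLflow and register it in Unity Catalog.
import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

mlflow.set_registry_uri("databricks-uc")  # use Unity Catalog as the registry

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

with mlflow.start_run():
    # An input example lets MLflow infer the signature Unity Catalog requires.
    info = mlflow.sklearn.log_model(model, artifact_path="model", input_example=X[:5])

# Placeholder three-level name: <catalog>.<schema>.<model>
mlflow.register_model(info.model_uri, "main.ml_models.iris_classifier")
```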

MLOps With Databricks

Adopting MLOps is increasingly important with the rise of AI, and doing it in large organizations requires many different capabilities. In the past, you had to implement these capabilities yourself; fortunately, the MLOps space has matured, and end-to-end platforms like Databricks now provide most of them. In this talk, I will walk through the MLOps components and show how you can simplify your processes using Databricks.

Pushing the Limits of What Your Warehouse Can Do Using Python and Databricks

SQL warehouses in Databricks can run more than just SQL. Join this session to learn how to get more out of your SQL warehouses, and any tools built on top of them, by leveraging Python. After attending this session, you will be familiar with Python user-defined functions and know how to bring in custom dependencies from PyPI or a custom wheel, and even how to securely invoke cloud services with performance at scale.
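
As a hedged sketch of what this can look like, a Python UDF can be created and called over a SQL warehouse connection using the Databricks SQL connector; the connection details, function name, and your warehouse's support for Python UDFs are all assumptions here.

```python
# Hypothetical sketch: create a Python UDF on a SQL warehouse via the
# databricks-sql-connector (pip install databricks-sql-connector).
from databricks import sql

conn = sql.connect(
    server_hostname="<workspace-host>",      # placeholder connection details
    http_path="<warehouse-http-path>",
    access_token="<personal-access-token>",
)

with conn.cursor() as cur:
    # The function body between $$ ... $$ is ordinary Python.
    cur.execute("""
        CREATE OR REPLACE FUNCTION main.default.redact_digits(s STRING)
        RETURNS STRING
        LANGUAGE PYTHON
        AS $$
        import re
        return re.sub(r"\\d", "*", s)
        $$
    """)
    cur.execute("SELECT main.default.redact_digits('call 555-0100')")
    print(cur.fetchone())

conn.close()
```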

ReguBIM AI – Transforming BIM, Engineering, and Code Compliance with Generative AI

At Exyte, we design, engineer, and deliver ultra-clean and sustainable facilities for high-tech industries. One of the most complex tasks our engineers and designers face is ensuring that their building designs comply with constantly evolving codes and regulations – often a manual, error-prone process. To address this, we developed ReguBIM AI, a generative AI-powered assistant that helps our teams verify code compliance more efficiently and accurately by linking 3D Building Information Modeling (BIM) data with regulatory documents. Built on the Databricks Data Intelligence Platform, ReguBIM AI is part of our broader vision to apply AI meaningfully across engineering and design processes. We are proud to share that ReguBIM AI won the Grand Prize and EMEA Winner titles at the Databricks GenAI World Cup 2024 — a global hackathon that challenged over 1,500 data scientists and AI engineers from 18 countries to create innovative generative AI solutions for real-world problems.

Scaling GenAI Inference From Prototype to Production: Real-World Lessons in Speed & Cost

This lightning talk dives into real-world GenAI projects that scaled from prototype to production using Databricks’ fully managed tools. Facing cost and time constraints, we leveraged four key Databricks features—Workflows, Model Serving, Serverless Compute, and Notebooks—to build an AI inference pipeline processing millions of documents (text and audiobooks). This approach enables rapid experimentation, easy tuning of GenAI prompts and compute settings, seamless data iteration and efficient quality testing—allowing Data Scientists and Engineers to collaborate effectively. Learn how to design modular, parameterized notebooks that run concurrently, manage dependencies and accelerate AI-driven insights. Whether you're optimizing AI inference, automating complex data workflows or architecting next-gen serverless AI systems, this session delivers actionable strategies to maximize performance while keeping costs low.
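
The "modular, parameterized notebooks" pattern typically rests on notebook widgets, which Workflows tasks can set per run. Here is a minimal sketch, with paths and parameter names invented for illustration; `dbutils` and `spark` are provided by the Databricks notebook runtime.

```python
# Minimal sketch: a notebook exposes parameters via widgets so a Workflows
# task can launch many copies concurrently with different inputs.
dbutils.widgets.text("input_path", "/Volumes/main/default/docs")  # placeholder path
dbutils.widgets.text("prompt_version", "v2")

input_path = dbutils.widgets.get("input_path")
prompt_version = dbutils.widgets.get("prompt_version")

docs = spark.read.format("text").load(input_path)
print(f"Scoring {docs.count()} documents with prompt version {prompt_version}")
```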

Sponsored by: EY | Xoople: Fueling enterprise AI with Earth data intelligence products

Xoople aims to provide its users with trusted AI-Ready Earth data and accelerators that unlock new insights for enterprise AI. With access to scientific-grade Earth data that provides spatial intelligence on real-world changes, data scientists and BI analysts can increase forecast accuracy for their enterprise processes and models. These improvements drive smarter, data-driven business decisions across various business functions, including supply chain, finance, and risk, and across industries. Xoople recently introduced its product, Enterprise AI-Ready Earth Data™, on the Databricks Marketplace; its CEO, Fabrizio Pirondini, will discuss the importance of the Databricks Data Intelligence Platform in making Xoople’s product a reality for use in the enterprise.

Sponsored by: Impetus | Supercharge AI with automated migration to Databricks with Impetus

Migrating legacy workloads to a modern, scalable platform like Databricks can be complex and resource-intensive. Impetus, an Elite Databricks Partner and the Databricks Migration Partner of the Year 2024, simplifies this journey with LeapLogic, an automated solution for data platform modernization and migration services. LeapLogic intelligently discovers, transforms, and optimizes workloads for Databricks, ensuring minimal risk and faster time-to-value. In this session, we’ll showcase real-world success stories of enterprises that have leveraged Impetus’ LeapLogic to modernize their data ecosystems efficiently. Join us to explore how you can accelerate your migration journey, unlock actionable insights, and future-proof your analytics with a seamless transition to Databricks.