Azure

How to Build an Open Lakehouse: Best Practices for Interoperability

2025-06-11 · Data + AI Summit 2025 Watch

talk

by James Malone (Databricks) , Aniruth Narayanan (Databricks)

AWS BigQuery Cloud Computing Data Lakehouse GCP Microsoft Fabric Snowflake

Building an open data lakehouse? Start with the right blueprint. This session walks through common reference architectures for interoperable lakehouse deployments across AWS, Google Cloud, Azure and tools like Snowflake, BigQuery and Microsoft Fabric. Learn how to design for cross-platform data access, unify governance with Unity Catalog and ensure your stack is future-ready — no matter where your data lives.

What’s New in Security and Compliance on the Databricks Data Intelligence Platform

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Filippo Seracini (Databricks) , Suresh Thiru (Databricks)

AI/ML AWS Cloud Computing Databricks GCP Cyber Security SQL

In this session, we’ll walk through the latest advancements in platform security and compliance on Databricks — from networking updates to encryption, serverless security and new compliance certifications across AWS, Azure and Google Cloud. We’ll also share our roadmap and best practices for how to securely configure workloads on Databricks SQL Serverless, Unity Catalog, Mosaic AI and more — at scale. If you're building on Databricks and want to stay ahead of evolving risk and regulatory demands, this session is your guide.

Unlocking Access: Simplifying Identity Management at Scale With Databricks

2025-06-11 · Data + AI Summit 2025 Watch

talk

by Keegan Dubbs (Databricks) , Hari Selvarajan (Databricks)

Cloud Computing Databricks GCP

Effective Identity and Access Management (IAM) is essential for securing enterprise environments while enabling innovation and collaboration. As companies scale, ensuring users have the right access without adding administrative overhead is critical. In this session, we’ll explore how Databricks is simplifying identity management by integrating with customers’ Identity Providers (IDPs). Learn about Automatic Identity Management in Azure Databricks, which eliminates SCIM for Entra ID users and ensures scalable identity provisioning for other IDPs. We'll also cover externally managed groups, PIM integration and upcoming enhancements like a bring-your-own-IDP model for Google Cloud. Through a customer success story and live demo, see how Databricks is making IAM more scalable, secure and user-friendly.

Unified Advanced Analytics: Integrating Power BI and Databricks Genie for Real-time Insights

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Justin Ward (TurnPoint Services) , Edelweiss Kammermann (IT Convergence)

Analytics API BI Dashboard Databricks Power BI

In today’s data-driven landscape, business users expect seamless, interactive analytics without having to switch between different environments. This presentation explores our web application that unifies a Power BI dashboard with Databricks Genie, allowing users to query and visualize insights from the same dataset within a single, cohesive interface. We will compare two integration strategies: one that leverages a traditional webpage enhanced by an Azure bot to incorporate Genie’s capabilities, and another that utilizes Databricks Apps to deliver a smoother, native experience. We use the Genie API to build this solution. Attendees will learn the architecture behind these solutions, key design considerations and challenges encountered during implementation. Join us to see live demos of both approaches, and discover best practices for delivering an all-in-one, interactive analytics experience.

Enabling Sleep Science Research With Databricks and Delta Sharing

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Alexandr Rivlin (Sleep Number Labs) , Sajeev Mayandi (Sleep Number)

AWS Cloud Computing Databricks Delta GCP Jenkins Jira Cyber Security Terraform

Leveraging Databricks as a platform, we facilitate the sharing of anonymized datasets across various Databricks workspaces and accounts, spanning multiple cloud environments such as AWS, Azure, and Google Cloud. This capability, powered by Delta Sharing, extends both within and outside Sleep Number, enabling accelerated insights while ensuring compliance with data security and privacy standards. In this session, we will showcase our architecture and implementation strategy for data sharing, highlighting the use of Databricks’ Unity Catalog and Delta Sharing, along with integration with platforms like Jira, Jenkins, and Terraform to streamline project management and system orchestration.

From Datavault to Delta Lake: Streamlining Data Sync with Lakeflow Connect

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Olivia Ren (Databricks) , Andrew Clarke (Australian Red Cross Lifeblood)

Analytics Data Lakehouse Data Vault Databricks Delta DWH SQL

In this session, we will explore the Australian Red Cross Lifeblood's approach to synchronizing an Azure SQL Datavault 2.0 (DV2.0) implementation with Unity Catalog (UC) using Lakeflow Connect. Lifeblood's DV2.0 data warehouse, which includes raw vault (RV) and business vault (BV) tables, as well as information marts defined as views, required a multi-step process to achieve data/business logic sync with UC. This involved using Lakeflow Connect to ingest RV and BV data, followed by a custom process utilizing JDBC to ingest view definitions, and the automated/manual conversion of T-SQL to Databricks SQL views, with Lakehouse Monitoring for validation. In this talk, we will share our journey, the design decisions we made, and how the resulting solution now supports analytics workloads, analysts, and data scientists at Lifeblood.

Lakeflow Declarative Pipelines Integrations and Interoperability: Get Data From — and to — Anywhere

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Ryan Nienhuis (Databricks)

API Cosmos Data Lakehouse Delta ETL/ELT Kafka MongoDB Python Spark

This session is repeated. In this session, you will learn how to integrate Lakeflow Declarative Pipelines with external systems in order to ingest and send data virtually anywhere. Lakeflow Declarative Pipelines is most often used for ingestion and ETL into the Lakehouse. New capabilities like the Lakeflow Declarative Pipelines Sinks API and added support for Python Data Source and ForEachBatch have opened up Lakeflow Declarative Pipelines to support almost any integration. This includes popular Apache Spark™ integrations like JDBC, Kafka, external and managed Delta tables, Azure Cosmos DB, MongoDB and more.

Smart Vehicles, Secure Data: Recreating Vehicle Environments for Privacy-Preserving Machine Learning

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Frankie Cancino (Mercedes-Benz R&D)

AI/ML Cloud Computing Cyber Security

As connected vehicles generate vast amounts of personal and sensitive data, ensuring privacy and security in machine learning (ML) processes is essential. This session explores how Trusted Execution Environments (TEEs) and Azure Confidential Computing can enable privacy-preserving ML in cloud environments. We’ll present a method to recreate a vehicle environment in the cloud, where sensitive data remains private throughout model training, inference and deployment. Attendees will learn how Mercedes-Benz R&D North America builds secure, privacy-respecting personalized systems for the next generation of connected vehicles.

Let's Save Tons of Money With Cloud-Native Data Ingestion!

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Tyler Croy (Scribd, Inc.)

Airbyte AWS Aurora Kinesis Cloud Computing Databricks Delta GCP Kafka

Delta Lake is a fantastic technology for quickly querying massive data sets, but first you need those massive data sets! In this session we will dive into the cloud-native architecture Scribd has adopted to ingest data from AWS Aurora, SQS, Kinesis Data Firehose and more. By using off-the-shelf open source tools like kafka-delta-ingest, oxbow and Airbyte, Scribd has redefined its ingestion architecture to be more event-driven, reliable, and most importantly: cheaper. No jobs needed! Attendees will learn how to use third-party tools in concert with a Databricks and Unity Catalog environment to provide a highly efficient and available data platform. This architecture will be presented in the context of AWS but can be adapted for Azure, Google Cloud Platform or even on-premises environments.

Deploying Databricks Asset Bundles (DABs) at Scale

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Saad Ansari (Databricks) , Pieter Noordhuis (Databricks)

AI/ML Azure DevOps BI Dashboard Databricks DevOps Git GitHub

This session is repeated. Managing data and AI workloads in Databricks can be complex. Databricks Asset Bundles (DABs) simplify this by enabling declarative, Git-driven deployment workflows for notebooks, jobs, Lakeflow Declarative Pipelines, dashboards, ML models and more.

Join the DABs Team for a Deep Dive and learn about:
• The Basics: Understanding Databricks asset bundles
• Declare, define and deploy assets, follow best practices, use templates and manage dependencies
• CI/CD & Governance: Automate deployments with GitHub Actions/Azure DevOps, manage Dev vs. Prod differences, and ensure reproducibility
• What’s new and what's coming up! AI/BI Dashboard support, Databricks Apps support, a Pythonic interface and workspace-based deployment

If you're a data engineer, ML practitioner or platform architect, this talk will provide practical insights to improve reliability, efficiency and compliance in your Databricks workflows.

Italgas’ AI Factory and the Future of Gas Distribution

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Nicola Giorcelli (Cluster Reply) , Serena Delli (Italgas)

AI/ML BI Databricks GenAI SQL Synapse

At Italgas, Europe’s leading gas distributor both by network size and number of customers, we are spearheading digital transformation through a state-of-the-art, fully-fledged Databricks Data Intelligence Platform.
• Achieved 50% cost reduction and 20% performance boost migrating from Azure Synapse to Databricks SQL
• Deployed 41 ML/GenAI models in production, with 100% of workloads governed by Unity Catalog
• Empowered 80% of employees with self-service BI through Genie Dashboards
• Enabled natural language queries for control-room operators analyzing network status

The future of gas distribution is data-driven: predictive maintenance, automated operations, and real-time decision making are now realities. Our AI Factory isn't just digitizing infrastructure; it's creating a more responsive, efficient, and sustainable gas network that anticipates needs before they arise.

Sponsored by: Accenture & Avanade | Enterprise Data Journey for The Standard Insurance Leveraging Databricks on Azure and AI Innovation

Comprehensive Data Management and Governance With Azure Data Lake Storage

2025-06-10 · Data + AI Summit 2025 Watch

talk

by James Baker (Microsoft) , Santhosh Pillai (Microsoft Corporation)

Data Governance Data Lake Data Management Databricks

Given that data is the new oil, it must be treated as such. Organizations that pursue greater insight into their businesses and their customers must manage, govern, protect and observe the use of the data that drives these insights in an efficient, cost-effective, compliant and auditable manner without degrading access to that data. Azure Data Lake Storage offers many features that allow customers to apply such controls and protections to their critical data assets. Understanding how these features behave, their granularity, cost and scale implications, and the degree of control or protection they apply is essential to implementing a data lake that reflects the value contained within. In this session, the various data protection, governance and management capabilities available now and upcoming in ADLS will be discussed. This will include how deep integration with Azure Databricks can provide more comprehensive, end-to-end coverage for these concerns, yielding a highly efficient and effective data governance solution.

Sponsored by: Microsoft | Leverage the power of the Microsoft Ecosystem with Azure Databricks

Empowering Healthcare Insights: A Unified Lakehouse Approach With Databricks

2025-06-10 · Data + AI Summit 2025 Watch

talk

by Bianca Stratulat (BJSS) , Mike Dobing (Databricks)

AWS Data Lake Data Lakehouse Databricks Iceberg Cyber Security

NHS England is revolutionizing healthcare research by enabling secure, seamless access to de-identified patient data through the Federated Data Platform (FDP). Despite vast data resources spread across regional and national systems, analysts struggle with fragmented, inconsistent datasets. Enter Databricks: powering a unified, virtual data lake with Unity Catalog at its core — integrating diverse NHS systems while ensuring compliance and security. By bridging AWS and Azure environments with a private exchange and leveraging the Iceberg connector to interface with Palantir, analysts gain scalable, reliable and governed access to vital healthcare data. This talk explores how this innovative architecture is driving actionable insights, accelerating research and ultimately improving patient outcomes.

Tackling Data Challenges for Scaling Multi-Agent GenAI Apps with Python

2025-06-07 · PyData London 2025 Watch

talk

by Theo van Kraay (Microsoft)

API Cosmos GenAI LLM Python RAG

The use of multiple Large Language Models (LLMs) working together to perform complex tasks, known as multi-agent systems, has gained significant traction. While orchestration frameworks like LangGraph and Semantic Kernel can streamline orchestration and coordination among agents, developing large-scale, production-grade systems can bring a host of data challenges. Issues such as supporting multi-tenancy, preserving transactional integrity and state, and managing reliable asynchronous function calls while scaling efficiently can be difficult to navigate.

Leveraging insights from practical experiences in the Azure Cosmos DB engineering team, this talk will guide you through key considerations and best practices for storing, managing, and leveraging data in multi-agent applications at any scale. You’ll learn how to understand core multi-agent concepts and architectures, manage statefulness and conversation histories, personalize agents through retrieval-augmented generation (RAG), and effectively integrate APIs and function calls.

Aimed at developers, architects, and data scientists at all skill levels, this session will show you how to take your multi-agent systems from the lab to full-scale production deployments, ready to solve real-world problems. We’ll also walk through code implementations that can be quickly and easily put into practice, all in Python.

50 Years of Microsoft and Developer Tools with Scott Guthrie

2025-06-04 · The Pragmatic Engineer Listen

podcast_episode

by Scott Guthrie (Microsoft) , Gergely Orosz

AI/ML Analytics C#/.NET Cloud Computing GitHub Linux Marketing Microsoft

Supported by Our Partners
• Statsig: The unified platform for flags, analytics, experiments, and more.
• Sinch: Connect with customers at every step of their journey.
• Modal: The cloud platform for building AI applications.

How has Microsoft changed since its founding in 1975, especially in how it builds tools for developers? In this episode of The Pragmatic Engineer, I sit down with Scott Guthrie, Executive Vice President of Cloud and AI at Microsoft. Scott has been with the company for 28 years. He built the first prototype of ASP.NET, led the Windows Phone team, led up Azure, and helped shape many of Microsoft’s most important developer platforms. We talk about Microsoft’s journey from building early dev tools to becoming a top cloud provider, and how it actively worked to win back and grow its developer base.

In this episode, we cover:
• Microsoft’s early years building developer tools
• Why Visual Basic faced resistance from devs back in the day, even though it simplified development at the time
• How .NET helped bring a new generation of server-side developers into Microsoft’s ecosystem
• Why Windows Phone didn’t succeed
• The 90s Microsoft dev stack: docs, debuggers, and more
• How Microsoft Azure went from being the #7 cloud provider to the #2 spot today
• Why Microsoft created VS Code
• How VS Code and open source led to the acquisition of GitHub
• What Scott’s excited about in the future of developer tools and AI
• And much more!

Timestamps
(00:00) Intro
(02:25) Microsoft’s early years building developer tools
(06:15) How Microsoft’s developer tools helped Windows succeed
(08:00) Microsoft’s first tools were built to allow less technically savvy people to build things
(11:00) A case for embracing the technology that’s coming
(14:11) Why Microsoft built Visual Studio and .NET
(19:54) Steve Ballmer’s speech about .NET
(22:04) The origins of C# and Anders Hejlsberg’s impact on Microsoft
(25:29) The 90’s Microsoft stack, including documentation, debuggers, and more
(30:17) How productivity has changed over the past 10 years
(32:50) Why Gergely was a fan of Windows Phone, and Scott’s thoughts on why it didn’t last
(36:43) Lessons from working on (and fixing) Azure under Satya Nadella
(42:50) Codeplex and the acquisition of GitHub
(48:52) 2014: Three bold projects to win the hearts of developers
(55:40) What Scott’s excited about in new developer tools and cloud computing
(59:50) Why Scott thinks AI will enhance productivity but create more engineering jobs

The Pragmatic Engineer deepdives relevant for this episode:
• Microsoft is dogfooding AI dev tools’ future
• Microsoft’s developer tools roots
• Why are Cloud Development Environments spiking in popularity, now?
• Engineering career paths at Big Tech and scaleups
• How Linux is built with Greg Kroah-Hartman

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe

Comparing Cloud AI Architectures from AWS, Azure and GCP

2025-06-03 · gartner-data-analytics-india-2025

talk

by Cuneyd Kaya (Gartner)

AI/ML AWS Cloud Computing GCP GenAI

CDAOs and AI leaders are grappling with two crucial questions: (1) Which public cloud provider should we choose for AI and GenAI initiatives, and (2) how do we assemble the right cloud architecture to scale and deploy AI more effectively?
This session compares public cloud AI and Generative AI architectures from AWS, Azure and GCP and provides insights on their points of differentiation.

Migrate from AWS and Azure to Google Cloud runtimes

2025-04-11 · Google Cloud Next '25

session

by Vrinda Khurjekar (Searce) , Jatin Sharma (Google) , Bernhard Pfirrmann (Nokia) , Eitan Eibschutz (Google Cloud)

AWS Cloud Computing GCP LLM

Migrating from AWS or Azure to Google Cloud runtimes can feel like navigating a maze of complex services and dependencies. In this session, we’ll explore key considerations for migrating legacy applications, emphasizing the “why not modernize?” approach with a practice guide. We’ll share real-world examples of successful transformations. And we’ll go beyond theory with a live product demo that showcases migration tools, and a code assessment demo powered by Gemini that demonstrates how you can understand and modernize legacy code.

5 Different Shades of Copilot Agents: From SharePoint to Azure AI Search

2025-04-11 · Global AI Bootcamp {Berlin} | In-person

talk

by Ragnar Heil (HanseVision)

AI/ML

talk-data.com
