RAG using Semantic Kernel with Azure OpenAI and Azure Cosmos DB for MongoDB vCore

2024-03-14 · Python Data Science Day

talk

Azure LLM MongoDB RAG

Data for the era of AI: Build intelligent apps with Azure Cosmos DB | BRK226HG

2023-11-20 · Microsoft Ignite 2023

video

by Mark Brown (Microsoft) , Estefani Arroyo , Maria Pallante , Kirill Gavrylyuk (Microsoft) , Robert Finlayson (KPMG) , Andrew Liu (Microsoft) , James Codella (Azure CosmosDB) , Marko Hotti , Rodrigo Souza , Anitha Adusumilli (Microsoft) , Jason Fogaty

AI/ML Azure GenAI HTML Microsoft RAG

From ChatGPT to the NBA to Mercedes-Benz, Azure Cosmos DB is enabling intelligent apps that change the way we live and work. Join us to learn from KPMG about how they built a generative AI-based assistant, and from Bond Brand Loyalty about how they scale data to meet global customer demand, with Azure Cosmos DB. We'll explore capabilities like vector search and how to implement the RAG pattern, along with improved elasticity and greater scale.

To learn more, please check out these resources:
* https://aka.ms/Ignite23CollectionsBRK226H
* https://info.microsoft.com/ww-landing-contact-me-for-events-m365-in-person-events.html?LCID=en-us&ls=407628-contactme-formfill
* https://aka.ms/azure-ignite2023-dataaiblog

Session Information: This video is one of many sessions delivered for the Microsoft Ignite 2023 event. View sessions on-demand and learn more about Microsoft Ignite at https://ignite.microsoft.com

BRK226HG | English (US) | Data

Learn Live: Build an AI-enabled chat w/ Azure OpenAI & Azure Cosmos DB | BRK404LL

2023-11-16 · Microsoft Ignite 2023

video

by Bethany Jepchumba (Microsoft) , Julia Muiruri (Microsoft) , Akah Mandela Munab

AI/ML API Azure C#/.NET LLM Microsoft NoSQL

Connect an existing ASP.NET Core Blazor web application to Azure Cosmos DB for NoSQL and Azure OpenAI using their .NET SDKs. Your code manages and queries items in an API for NoSQL container. Your code also sends prompts to Azure OpenAI and parses the responses. This LIVE session is presented by two experts, and our moderators will answer your questions directly in the chat.

Speakers: B Jb, Julia Muiruri, Tim Fish, Akah Mandela Munab, DMITRII SOLOVEV, Jay Gordon, Julian Sharp, Konstantin Berezovsky, DE Producer 9, Olivia Guzzardo

BRK404LL | English (US) | Data

Processing Prescriptions at Scale at Walgreens

2023-07-26 · Databricks DATA + AI Summit 2023

video

by Daniel Zafar

API Data Engineering Data Lakehouse Databricks Microsoft Spark Data Streaming

We designed a scalable Spark Streaming job to manage hundreds of millions of prescription-related operations per day, with an end-to-end SLA of a few minutes and a lookup time of one second using CosmosDB.

In this session, we will share not only the architecture, but also the challenges of using the Spark Cosmos connector at scale and our solutions to them. We will discuss usages of the Aggregator API, custom implementations of the CosmosDB connector, and the major roadblocks we encountered, along with the solutions we engineered. In addition, we collaborated closely with the Cosmos development team at Microsoft and will share the new features that resulted. If you ever plan to use Spark with Cosmos, you won't want to miss these gotchas!

A Single Pane of Glass on Airflow using Astro Python SDK, Snowflake, dbt, and Cosmos

2023-07-01 · Airflow Summit 2023

session

by Luan Moreno Medeiros Maciel (Pythian)

Airflow Astronomer dbt DWH ETL/ELT Python Snowflake SQL

ETL data pipelines are the bread and butter of data teams that must design, develop, and author DAGs to accommodate the various business requirements. dbt is becoming one of the most used tools to perform SQL transformations on the Data Warehouse, allowing teams to harness the power of queries at scale. Airflow users are constantly finding new ways to integrate dbt with the Airflow ecosystem and build a single pane of glass where Data Engineers can manage and administer their pipelines. Astronomer Cosmos, an open-source product, has been introduced to integrate Airflow with dbt Core seamlessly. Now you can easily see your dbt pipelines fully integrated on Airflow.

You will learn the following:
* How to integrate dbt Core with Airflow
* How to use Cosmos
* How to build data pipelines at scale

Manifest destiny: Orchestrating dbt using Airflow

2023-07-01 · Airflow Summit 2023

session

by Jonathan Talmi (Snapcommerce)

Airflow dbt

Airflow is a popular choice for organizations looking to integrate open-source dbt within their existing data infrastructure. This talk will explore two primary methods of running dbt in Airflow: job-level and model-level. We’ll discuss the tradeoffs associated with each approach, highlighting the simplicity and efficiency of job-level orchestration, contrasted with the enhanced observability and control provided by model-level orchestration. We’ll also explain how the balance has shifted in recent years, with improvements to dbt Core making model-level orchestration more efficient and innovative Airflow extensions like Cosmos making it easier to implement. Finally, we’ll provide benchmarks to help you determine which paradigm is the best fit for your needs.
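
To make the job-level versus model-level distinction concrete, here is a minimal Python sketch. It is not taken from the talk: the model names and the `MODEL_DEPS` dependency map are hypothetical, and the functions only build command lists rather than calling Airflow or dbt. Job-level orchestration is shown as a single task that runs the whole project, while model-level orchestration is shown as one `dbt run --select <model>` task per model, ordered so that upstream models come first.

```python
from graphlib import TopologicalSorter

# Hypothetical dbt project: each model maps to the set of models it depends on.
MODEL_DEPS = {
    "stg_orders": set(),
    "stg_customers": set(),
    "orders_enriched": {"stg_orders", "stg_customers"},
    "daily_revenue": {"orders_enriched"},
}


def job_level_tasks():
    """Job-level: a single orchestrator task runs the entire dbt project."""
    return [["dbt", "run"]]


def model_level_tasks(deps):
    """Model-level: one task per model, with upstream models ordered first."""
    order = TopologicalSorter(deps).static_order()
    return [["dbt", "run", "--select", model] for model in order]


if __name__ == "__main__":
    print(job_level_tasks())
    for task in model_level_tasks(MODEL_DEPS):
        print(" ".join(task))
```

With one task per model, a failure can be retried or inspected for that model alone, which is the observability benefit the abstract mentions; the cost is one dbt invocation per model instead of one per project.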

Azure Databricks Cookbook

2021-09-17 · O'Reilly Data Engineering Books

book

by Phani Raj , Vinod Jaiswal

Analytics Azure Big Data CI/CD Databricks Delta Cyber Security Spark SQL Data Streaming Synapse apache-spark +2 more

Azure Databricks is a robust analytics platform that leverages Apache Spark and seamlessly integrates with Azure services. In the Azure Databricks Cookbook, you'll find hands-on recipes to ingest data, build modern data pipelines, and perform real-time analytics while learning to optimize and secure your solutions.

What this Book will help me do:
* Design advanced data workflows integrating Azure Synapse, Cosmos DB, and streaming sources with Databricks.
* Gain proficiency in using Delta Tables and Spark for efficient data storage and analysis.
* Learn to create, deploy, and manage real-time dashboards with Databricks SQL.
* Master CI/CD pipelines for automating deployments of Databricks solutions.
* Understand security best practices for restricting access and monitoring Azure Databricks.

Author(s): Phani Raj and Vinod Jaiswal are experienced professionals in the field of big data and analytics. They are well-versed in implementing Azure Databricks solutions for real-world problems. Their collaborative writing approach ensures clarity and practical focus.

Who is it for? This book is tailored for data engineers, scientists, and big data professionals who want to apply Azure Databricks and Apache Spark to their analytics workflows. A basic familiarity with Spark and Azure is recommended to make the best use of the recipes provided. If you're looking to scale and optimize your analytics pipelines, this book is for you.

The Definitive Guide to Azure Data Engineering: Modern ELT, DevOps, and Analytics on the Azure Cloud Platform

2021-08-06 · O'Reilly Data Engineering Books

book

by Ron C. L'Esteve

Analytics Azure ADF Azure DevOps CI/CD Cloud Computing Data Engineering Data Governance Data Lake Databricks DevOps ETL/ELT +9 more

Build efficient and scalable batch and real-time data ingestion pipelines, DevOps continuous integration and deployment pipelines, and advanced analytics solutions on the Azure Data Platform. This book teaches you to design and implement robust data engineering solutions using Data Factory, Databricks, Synapse Analytics, Snowflake, Azure SQL database, Stream Analytics, Cosmos database, and Data Lake Storage Gen2. You will learn how to engineer your use of these Azure Data Platform components for optimal performance and scalability. You will also learn to design self-service capabilities to maintain and drive the pipelines and your workloads. The approach in this book is to guide you through a hands-on, scenario-based learning process that will empower you to promote digital innovation best practices while you work through your organization’s projects, challenges, and needs. The clear examples enable you to use this book as a reference and guide for building data engineering solutions in Azure. After reading this book, you will have a far stronger skill set and confidence level in getting hands on with the Azure Data Platform.

What You Will Learn:
* Build dynamic, parameterized ELT data ingestion orchestration pipelines in Azure Data Factory
* Create data ingestion pipelines that integrate control tables for self-service ELT
* Implement a reusable logging framework that can be applied to multiple pipelines
* Integrate Azure Data Factory pipelines with a variety of Azure data sources and tools
* Transform data with Mapping Data Flows in Azure Data Factory
* Apply Azure DevOps continuous integration and deployment practices to your Azure Data Factory pipelines and development SQL databases
* Design and implement real-time streaming and advanced analytics solutions using Databricks, Stream Analytics, and Synapse Analytics
* Get started with a variety of Azure data services through hands-on examples

Who This Book Is For: Data engineers and data architects who are interested in learning architectural and engineering best practices around ELT and ETL on the Azure Data Platform, those who are creating complex Azure data engineering projects and are searching for patterns of success, and aspiring cloud and data professionals involved in data engineering, data governance, continuous integration and deployment of DevOps practices, and advanced analytics who want a full understanding of the many different tools and technologies that Azure Data Platform provides.

Data Modeling for Azure Data Services

2021-07-30 · O'Reilly Data Engineering Books

book

by Peter ter Braake

Azure ADF BI Cloud Computing Data Lake Data Management Data Modelling Data Vault ETL/ELT dimensional modeling Microsoft NoSQL +6 more

Data Modeling for Azure Data Services is an essential guide that delves into the intricacies of designing, provisioning, and implementing robust data solutions within the Azure ecosystem. Through practical examples and hands-on exercises, this book equips you with the knowledge to create scalable, performant, and adaptable database designs tailored to your business needs.

What this Book will help me do:
* Understand and apply normalization, dimensional modeling, and data vault modeling for relational databases.
* Learn to provision and implement scalable solutions like Azure SQL DB and Azure Synapse SQL Pool.
* Master how to design and model a Data Lake using Azure Storage efficiently.
* Gain expertise in NoSQL database modeling and implementing solutions using Azure Cosmos DB.
* Develop ETL/ELT processes effectively using Azure Data Factory to support data integration workflows.

Author(s): Peter ter Braake brings a wealth of expertise as a data architect and cloud solutions builder specializing in Azure's data services. With hands-on experience in projects requiring sophisticated data modeling and optimization, the author crafts detailed learning material to help professionals level up their database design and Azure deployment skills. Dedicated to explaining complex topics with clarity and approachable language, the author ensures that learners gain not just knowledge but applied competence.

Who is it for? This book is a valuable resource for business intelligence developers, data architects, and consultants aiming to refine their skills in data modeling within modern cloud ecosystems, particularly Microsoft Azure. Whether you're a beginner with some foundational cloud data management knowledge or an experienced professional seeking to deepen your Azure data services proficiency, this book caters to your learning needs.

Graph Databases in Action

2020-11-17 · O'Reilly Data Engineering Books

book

by Josh Perryman , Dave Bechberger

Agile/Scrum Data Modelling Microsoft Neo4j RDBMS data data-engineering graph-databases

Relationships in data often look far more like a web than an orderly set of rows and columns. Graph databases shine when it comes to revealing valuable insights within complex, interconnected data such as demographics, financial records, or computer networks. In Graph Databases in Action, experts Dave Bechberger and Josh Perryman illuminate the design and implementation of graph databases in real-world applications. You'll learn how to choose the right database solutions for your tasks, and how to use your new knowledge to build agile, flexible, and high-performing graph-powered applications!

About the Technology: Isolated data is a thing of the past! Now, data is connected, and graph databases (like Amazon Neptune, Microsoft Cosmos DB, and Neo4j) are the essential tools of this new reality. Graph databases represent relationships naturally, speeding the discovery of insights and driving business value.

About the Book: Graph Databases in Action introduces you to graph database concepts by comparing them with relational database constructs. You'll learn just enough theory to get started, then progress to hands-on development. Discover use cases involving social networking, recommendation engines, and personalization.

What's Inside:
* Graph databases vs. relational databases
* Systematic graph data modeling
* Querying and navigating a graph
* Graph patterns
* Pitfalls and antipatterns

About the Reader: For software developers. No experience with graph databases required.

About the Authors: Dave Bechberger and Josh Perryman have decades of experience building complex data-driven systems and have worked with graph databases since 2014.

Quotes:
* A comprehensive overview of graph databases and how to build them using Apache tools. - Richard Vaughan, Purple Monkey Collective
* A well-written and thorough introduction to the topic of graph databases. - Luis Moux, EMO
* A great guide in your journey towards graph databases and exploiting the new possibilities for data processing. - Mladen Knežić, CROZ
* A great introduction to graph databases and how you should approach designing systems that leverage graph databases. - Ron Sher, Intuit

Data Lake Analytics on Microsoft Azure: A Practitioner's Guide to Big Data Engineering

2020-10-08 · O'Reilly Data Engineering Books

book

by Pankaj Khattar , Harsh Chawla

AI/ML Analytics Azure Big Data Data Analytics Data Engineering Data Lake Data Science Databricks DWH Hadoop IoT +7 more

Get a 360-degree view of how the journey of data analytics solutions has evolved from monolithic data stores and enterprise data warehouses to data lakes and modern data warehouses.

This book includes comprehensive coverage of how:
* To architect data lake analytics solutions by choosing suitable technologies available on Microsoft Azure
* The advent of microservices applications covering ecommerce or modern solutions built on IoT and how real-time streaming data has completely disrupted this ecosystem
* These data analytics solutions have been transformed from solely understanding the trends from historical data to building predictions by infusing machine learning technologies into the solutions

Data platform professionals who have been working on relational data stores, non-relational data stores, and big data technologies will find the content in this book useful. The book also can help you start your journey into the data engineer world as it provides an overview of advanced data analytics and touches on data science concepts and various artificial intelligence and machine learning technologies available on Microsoft Azure.

What Will You Learn: You will understand the:
* Concepts of data lake analytics, the modern data warehouse, and advanced data analytics
* Architecture patterns of the modern data warehouse and advanced data analytics solutions
* Phases (such as Data Ingestion, Store, Prep and Train, and Model and Serve) of data analytics solutions and technology choices available on Azure under each phase
* In-depth coverage of real-time and batch mode data analytics solutions architecture
* Various managed services available on Azure such as Synapse analytics, event hubs, Stream analytics, CosmosDB, and managed Hadoop services such as Databricks and HDInsight

Who This Book Is For: Data platform professionals, database architects, engineers, and solution architects

PolyBase Revealed: Data Virtualization with SQL Server, Hadoop, Apache Spark, and Beyond

2019-12-20 · O'Reilly Data Engineering Books

book

by Kevin Feasel

Azure ETL/ELT Hadoop Microsoft Oracle Spark SQL apache-spark data data-engineering

Harness the power of PolyBase data virtualization software to make data from a variety of sources easily accessible through SQL queries while using the T-SQL skills you already know and have mastered. PolyBase Revealed shows you how to use the PolyBase feature of SQL Server 2019 to integrate SQL Server with Azure Blob Storage, Apache Hadoop, other SQL Server instances, Oracle, Cosmos DB, Apache Spark, and more. You will learn how PolyBase can help you reduce storage and other costs by avoiding the need for ETL processes that duplicate data in order to make it accessible from one source. PolyBase makes SQL Server into that one source, and T-SQL is your golden ticket. The book also covers PolyBase scale-out clusters, allowing you to distribute PolyBase queries among several SQL Server instances, thus improving performance. With great flexibility comes great complexity, and this book shows you where to look when queries fail, complete with coverage of internals, troubleshooting techniques, and where to find more information on obscure cross-platform errors. Data virtualization is a key target for Microsoft with SQL Server 2019. This book will help you keep your skills current, remain relevant, and build new business and career opportunities around Microsoft’s product direction.

What You Will Learn:
* Install and configure PolyBase as a stand-alone service, or unlock its capabilities with a scale-out cluster
* Understand how PolyBase interacts with outside data sources while presenting their data as regular SQL Server tables
* Write queries combining data from SQL Server, Apache Hadoop, Oracle, Cosmos DB, Apache Spark, and more
* Troubleshoot PolyBase queries using SQL Server Dynamic Management Views
* Tune PolyBase queries using statistics and execution plans
* Solve common business problems, including "cold storage" of infrequently accessed data and simplifying ETL jobs

Who This Book Is For: SQL Server developers working in multi-platform environments who want one easy way of communicating with, and collecting data from, all of these sources

Cosmos DB for MongoDB Developers: Migrating to Azure Cosmos DB and Using the MongoDB API

2018-08-09 · O'Reilly Data Engineering Books

book

by Manish Sharma

API Azure MongoDB NoSQL data data-engineering nosql-databases

Learn Azure Cosmos DB and its MongoDB API with hands-on samples and advanced features such as the multi-homing API, geo-replication, custom indexing, TTL, request units (RU), consistency levels, partitioning, and much more. Each chapter explains Azure Cosmos DB’s features and functionalities by comparing it to MongoDB with coding samples. Cosmos DB for MongoDB Developers starts with an overview of NoSQL and Azure Cosmos DB and moves on to demonstrate the difference between geo-replication of Azure Cosmos DB compared to MongoDB. Along the way you’ll cover subjects including indexing, partitioning, consistency, and sizing, all of which will help you understand the concepts of read units and how this calculation is derived from an existing MongoDB’s usage. The next part of the book shows you the process and strategies for migrating to Azure Cosmos DB. You will learn the day-to-day scenarios of using Azure Cosmos DB, its sizing strategies, and optimizing techniques for the MongoDB API. This information will help you when planning to migrate from MongoDB or if you would like to compare MongoDB to the Azure Cosmos DB MongoDB API before considering the switch.

What You Will Learn:
* Migrate to MongoDB and understand its strategies
* Develop a sample application using MongoDB’s client driver
* Make use of sizing best practices and performance optimization scenarios
* Optimize MongoDB’s partition mechanism and indexing

Who This Book Is For: MongoDB developers who wish to learn Azure Cosmos DB. It specifically caters to a technical audience, working on MongoDB.

CosmosDB

2017-07-07 · Data Skeptic

podcast_episode

by Dharma Shukla , Kyle Polich , Syam Nair

API Microsoft

This episode collects interviews from my recent trip to Microsoft Build where I had the opportunity to speak with Dharma Shukla and Syam Nair about the recently announced CosmosDB. CosmosDB is a globally consistent, distributed datastore that supports all the popular persistent storage formats (relational, key/value pair, document database, and graph) under a single streamlined API. The system provides tunable consistency, allowing the user to make choices about how consistency trade-offs are managed under the hood, if a consumer wants to go beyond the selected defaults.

The Hunt for Vulcan

2015-12-04 · Data Skeptic

podcast_episode

by Kyle Polich , Thomas Levenson (MIT)

Early astronomers could see several of the planets with the naked eye. The invention of the telescope allowed for further understanding of our solar system. The work of Isaac Newton allowed later scientists to accurately predict Neptune, which was later observationally confirmed exactly where predicted. It seemed only natural that a similar unknown body might explain anomalies in the orbit of Mercury, and thus began the search for the hypothesized planet Vulcan. Thomas Levenson's book "The Hunt for Vulcan" is a narrative of the key scientific minds involved in the search for, and eventual refutation of, an unobserved planet between Mercury and the sun. Thomas joins me in this episode to discuss his book and the fascinating story of the quest to find this planet. During the discussion, we mention one of the contributions made by Urbain-Jean-Joseph Le Verrier, which involved some complex calculations that enabled him to predict where to find the planet that would eventually be called Neptune. The calculus behind this work is difficult, and some of that work is demonstrated in a Jupyter notebook I recently discovered from Paulo Marques titled The-Body Problem. Thomas Levenson is a professor at MIT and head of its science writing program. He is the author of several books, including Einstein in Berlin and Newton and the Counterfeiter: The Unknown Detective Career of the World’s Greatest Scientist. He has also made ten feature-length documentaries (including a two-hour Nova program on Einstein) for which he has won numerous awards. In his most recent book, "The Hunt for Vulcan", he explores the century-spanning quest to explain the movement of the cosmos via theory and the role the hypothesized planet Vulcan played in the story. Follow Thomas on Twitter @tomlevenson and check out his blog at https://inversesquare.wordpress.com/. Pick up your copy of The Hunt for Vulcan at your local bookstore, preferred book buying place, or at the Penguin Random House site.

On-demand: Build and deploy AI agents with MCP and Azure Functions

· Microsoft Ignite 2025

talk

AI/ML Azure Microsoft

This lab provides a comprehensive overview of building, deploying, and securing AI agents. We will start by creating a single agent with Azure Functions and Foundry Agent Service in Microsoft Foundry, then enhance it with Model Context Protocol (MCP) and add data grounding with Azure Cosmos DB. You will also combine the single agents to form a multi-agent system. You will then deploy, manage and secure the application on a Flex Consumption plan.

On-demand: Multi-Agent Apps with Microsoft Agent Framework or LangGraph

· Microsoft Ignite 2025

talk

AI/ML Azure Microsoft Python

Build a multi-agent application leveraging MCP (Model Context Protocol) with the Microsoft Agent Framework in C# or LangGraph in Python, integrated with Azure Cosmos DB for scalable and high-performance data persistence and retrieval. Define agents, functions, and external service integrations, and implement memory, state management, and semantic search using Azure Cosmos DB. By the end, you’ll have a robust AI agent system designed for real-world applications.

Preview: What’s new with Azure Databases and SQL Server

· Microsoft Ignite 2025

talk

by Shireesh Thota (Microsoft)

AI/ML Azure Fabric SQL

Unlock the next level of performance and scalability with Azure Databases—engineered for speed, reliability, and seamless integration across your stack. At Ignite, Shireesh Thota, CVP of Azure Databases, will unveil breakthrough features and releases for Azure Databases, SQL Server 2025, and Fabric Databases for SQL and Cosmos DB. Get an inside look at how these advancements empower you to build and deploy cutting-edge applications and AI agents—faster, smarter, and at enterprise scale.
