Open models are essential for accelerating AI innovation, providing transparency, flexibility, and adaptability across enterprise use cases. They empower organizations to fine-tune, customize, and deploy AI solutions securely while retaining control over data. Discover how NVIDIA AI Foundation models, available on Azure platforms as NVIDIA NIM, power technologies and use cases. Learn how NVIDIA Nemotron, open reasoning models, are used to build powerful enterprise research agents that synthesize hours of research in minutes. Also, explore how NVIDIA Cosmos world foundation models are used to simulate, reason, and generate data for downstream pipelines in robotics, autonomous vehicles, and industrial vision systems.
talk-data.com
Topic: Cosmos · Azure Cosmos DB · 58 tagged
Want to ship applications faster? This demo shows you how to dramatically accelerate time to market with a high-velocity vibe coding workflow. We'll build a complete app with Azure Cosmos DB from scratch using GitHub Copilot as our AI pair programmer. We'll make use of the new Azure Cosmos DB Linux emulator and VS Code extension for local development to create a frictionless, end-to-end experience that helps you deliver features faster than ever before.
The Foundry Model Catalog supports “One-click” deployment of the latest NVIDIA NIM™. Examples of usage of the newest NVIDIA entries:
• Nemotron Nano: Large Language Reasoning Model
• Nemotron Nano VLM: VLM with the ability to query and summarize images and video
• Cosmos-reason1: Reasoning VLM for physical AI and robotics
• Microsoft-Trellis: Asset generation model capable of producing detailed meshes
• Boltz2: Structural biology model for structure and affinity
• Evo2 NIM: Biological model that integrates information over long genomic sequences
Please RSVP and arrive at least 5 minutes before the start time, at which point remaining spaces are open to standby attendees.
As teams scale their Airflow workflows, a common question is: “My DAG has 5,000 tasks. How long will it take to run in Airflow?” Beyond execution time, users often face challenges with dynamically generated DAGs, such as:
• Delayed visualization in the Airflow UI after deployment.
• High resource consumption, leading to Kubernetes pod evictions and out-of-memory errors.
While estimating resource utilization in a distributed data platform is complex, benchmarking can provide crucial insights. In this talk, we’ll share our approach to benchmarking dynamically generated DAGs with Astronomer Cosmos (https://github.com/astronomer/astronomer-cosmos), covering:
• Designing representative and extensible baseline tests.
• Setting up an isolated, distributed infrastructure for benchmarking.
• Running reproducible performance tests.
• Measuring DAG run times and task throughput.
• Evaluating CPU & memory consumption to optimize deployments.
By the end of this session, you will have practical benchmarks and strategies for making informed decisions about the performance of DAGs in Airflow.
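As a rough illustration of the "how long will 5,000 tasks take?" question, the sketch below derives task throughput from recorded task start and end times and extrapolates linearly. The function names and the sample measurements are made up for illustration, and linear scaling is a simplifying assumption; the abstract does not describe the speakers' actual method.

```python
from datetime import datetime, timedelta


def task_throughput(task_runs):
    """Tasks completed per second over the whole DAG run window.

    task_runs is a list of (start, end) datetime pairs, one per task.
    """
    first_start = min(start for start, _ in task_runs)
    last_end = max(end for _, end in task_runs)
    window_s = (last_end - first_start).total_seconds()
    return len(task_runs) / window_s


def estimate_run_seconds(n_tasks, throughput):
    """Naive estimate: assumes throughput stays constant as the DAG grows."""
    return n_tasks / throughput


# Hypothetical measurements: 4 tasks of 30 s each, running two at a time.
t0 = datetime(2025, 1, 1, 12, 0, 0)
sample_runs = [
    (t0, t0 + timedelta(seconds=30)),
    (t0, t0 + timedelta(seconds=30)),
    (t0 + timedelta(seconds=30), t0 + timedelta(seconds=60)),
    (t0 + timedelta(seconds=30), t0 + timedelta(seconds=60)),
]

throughput = task_throughput(sample_runs)          # 4 tasks over a 60 s window
estimate = estimate_run_seconds(5000, throughput)  # extrapolated to 5,000 tasks
```

In practice throughput rarely scales linearly, since scheduler overhead and resource limits grow with DAG size, which is the reason to benchmark at several sizes.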
Efficiently handling long-running workflows is crucial for scaling modern data pipelines. Apache Airflow’s deferrable operators help offload tasks during idle periods, freeing worker slots while tracking progress. This session explores how Cosmos 1.9 (https://github.com/astronomer/astronomer-cosmos) integrates Airflow’s deferrable capabilities to enhance dbt (https://github.com/dbt-labs/dbt-core) orchestration in production, with insights from recent contributions that introduced this functionality. Key takeaways:
• Deferrable Operators: How they work and why they’re ideal for long-running dbt tasks.
• Integrating with Cosmos: Refactoring and enhancements to enable deferrable behaviour across platforms.
• Performance Gains: Resource savings and task throughput improvements from deferrable execution.
• Challenges & Future Enhancements: Lessons learned, compatibility, and ideas for broader support.
Whether orchestrating dbt models on a cloud warehouse or managing large-scale transformations, this session offers practical strategies to reduce resource contention and boost pipeline performance.
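The idea of freeing worker slots while a task is idle can be sketched with plain asyncio. This is a toy simulation, not Airflow's or Cosmos's API: the semaphore stands in for worker slots, the sleep stands in for waiting on an external job, and all names are hypothetical.

```python
import asyncio
import time


async def run_task(slots, wait_s, deferrable):
    """Simulate one long-running task that mostly waits on an external system."""
    if deferrable:
        async with slots:            # hold a slot only long enough to submit the job
            pass
        await asyncio.sleep(wait_s)  # idle wait happens without occupying a slot
        async with slots:            # briefly take a slot again to finish up
            pass
    else:
        async with slots:            # slot stays occupied for the whole idle wait
            await asyncio.sleep(wait_s)


async def simulate(n_tasks, n_slots, wait_s, deferrable):
    """Return the wall-clock seconds needed to finish all tasks."""
    slots = asyncio.Semaphore(n_slots)
    start = time.perf_counter()
    await asyncio.gather(
        *(run_task(slots, wait_s, deferrable) for _ in range(n_tasks))
    )
    return time.perf_counter() - start


# Four tasks competing for a single worker slot, each waiting 0.1 s.
blocking_s = asyncio.run(simulate(4, 1, 0.1, deferrable=False))
deferred_s = asyncio.run(simulate(4, 1, 0.1, deferrable=True))
```

With one slot, the blocking variant runs the waits back to back, while the deferred variant overlaps them because no slot is held during the wait.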
This talk explores EDB’s journey from siloed reporting to a unified data platform, powered by Airflow. We’ll delve into the architectural evolution, showcasing how Airflow orchestrates a diverse range of use cases, from Analytics Engineering to complex MLOps pipelines. Learn how EDB leverages Airflow and Cosmos to integrate dbt for robust data transformations, ensuring data quality and consistency. We’ll provide a detailed case study of our MLOps implementation, demonstrating how Airflow manages training, inference, and model monitoring pipelines for Azure Machine Learning models. Discover the design considerations driven by our internal data governance framework and gain insights into our future plans for AIOps integration with Airflow.
As a popular open-source library for analytics engineering, dbt is often combined with Airflow. Orchestrating and executing dbt models as DAGs adds a layer of control over tasks, improves observability, and provides a reliable, scalable environment to run dbt models. This workshop is a step-by-step guide to Cosmos, a popular open-source package from Astronomer that helps you quickly run your dbt Core projects as Airflow DAGs and Task Groups, all with just a few lines of code. We’ll walk through:
• Running and visualising your dbt transformations
• Managing dependency conflicts
• Defining database credentials (profiles)
• Configuring source and test nodes
• Using dbt selectors
• Customising arguments per model
• Addressing performance challenges
• Leveraging deferrable operators
• Visualising dbt docs in the Airflow UI
• Example of how to deploy to production
• Troubleshooting
We encourage participants to bring their dbt project to follow this step-by-step workshop.
This session is repeated. In this session, you will learn how to integrate Lakeflow Declarative Pipelines with external systems in order to ingest and send data virtually anywhere. Lakeflow Declarative Pipelines is most often used for ingestion and ETL into the Lakehouse. New capabilities like the Lakeflow Declarative Pipelines Sinks API and added support for Python Data Source and ForEachBatch have opened up Lakeflow Declarative Pipelines to support almost any integration. This includes popular Apache Spark™ integrations like JDBC, Kafka, external and managed Delta tables, Azure Cosmos DB, MongoDB and more.
The use of multiple Large Language Models (LLMs) working together to perform complex tasks, known as multi-agent systems, has gained significant traction. While frameworks like LangGraph and Semantic Kernel can streamline orchestration and coordination among agents, developing large-scale, production-grade systems can bring a host of data challenges. Issues such as supporting multi-tenancy, preserving transactional integrity and state, and managing reliable asynchronous function calls while scaling efficiently can be difficult to navigate.
Leveraging insights from practical experiences in the Azure Cosmos DB engineering team, this talk will guide you through key considerations and best practices for storing, managing, and leveraging data in multi-agent applications at any scale. You’ll learn how to understand core multi-agent concepts and architectures, manage statefulness and conversation histories, personalize agents through retrieval-augmented generation (RAG), and effectively integrate APIs and function calls.
Aimed at developers, architects, and data scientists at all skill levels, this session will show you how to take your multi-agent systems from the lab to full-scale production deployments, ready to solve real-world problems. We’ll also walk through code implementations that can be quickly and easily put into practice, all in Python.
Microsoft is making significant investments in relational and NoSQL open-source databases. Learn about Azure Database for PostgreSQL, Azure Database for MySQL, and Azure Cosmos DB for MongoDB, with new enterprise-ready features to support daily business operations. See new migration capabilities as well as Microsoft's new contributions to the open-source community, including DiskANN, a PostgreSQL extension for Azure OpenAI Service, and much more.
𝗦𝗽𝗲𝗮𝗸𝗲𝗿𝘀: * Aditi Gupta * AVIJIT GUPTA * Dingding Lu * Prasanth Tammiraju
𝗦𝗲𝘀𝘀𝗶𝗼𝗻 𝗜𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻: This is one of many sessions from the Microsoft Ignite 2024 event. View even more sessions on-demand and learn about Microsoft Ignite at https://ignite.microsoft.com
BRK207 | English (US) | Data
MSIgnite
Are you ready to build high-performance, cost-efficient apps? Join us to master Azure Cosmos DB! This session covers serverless vs. autoscale options, data modeling, query optimization, and advanced features for top performance. Learn how to quickly build apps that can easily grow to planetary scale with data partitioning and fine-tuning strategies. Join us to hear about best practices and insider tips to design cloud-native apps with unmatched performance, scalability, and cost savings!
𝗦𝗽𝗲𝗮𝗸𝗲𝗿𝘀: * Estefani Arroyo * Tara Bhatia
BRK194 | English (US) | Data
In this session, we’ll explore advanced strategies for scalable serverless RAG applications using Azure Cosmos DB and DiskANN. Topics include architecture simplification, data synchronization, scalability challenges, and multi-tenancy. We’ll discuss the latest developments in vector search, full-text search, and hybrid search in Azure Cosmos DB, showcasing real-world use cases from our customers like EY. Learn how to get started easily with LangChain and Semantic Kernel.
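Hybrid search combines a vector-search ranking with a full-text ranking. One common way to merge two ranked lists is reciprocal rank fusion (RRF); the sketch below shows that technique in plain Python. It is an illustration only: the function name, the document IDs, and the constant k=60 are assumptions, and the abstract does not say how the session's demos fuse results.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1);
    documents are returned by descending total score, ties broken by ID.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the same query from two retrievers.
vector_hits = ["a", "b", "c"]  # nearest neighbours by embedding similarity
text_hits = ["b", "d", "a"]    # best matches by full-text relevance

fused = reciprocal_rank_fusion([vector_hits, text_hits])
```

Documents that rank well in both lists ("a" and "b" here) rise above documents found by only one retriever.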
𝗦𝗽𝗲𝗮𝗸𝗲𝗿𝘀: * Bhaskar Bhatt * James Codella * Zhe Li
BRK193 | English (US) | Data
Discover the power of GenAI in Toyota’s powertrain development. Learn how Toyota improves information collection, enhances decision-making, and boosts productivity for powertrain engineers with a multi-agent system built with Azure AI Foundry and Azure Cosmos DB. Join us to get a glimpse into the future of agent technology and data management with Azure, and hear from Toyota first-hand how they are using AI to innovate vehicle design.
𝗦𝗽𝗲𝗮𝗸𝗲𝗿𝘀: * Mark Brown * Marco Casalaina * Kosuke Miyasaka * Kenji Onishi
BRK117 | English (US) | AI
An intro to RAGHack, a global hackathon to develop apps using LLMs and RAG. A large language model (LLM) like GPT-4 can be used for summarization, translation, entity extraction, and question-answering. Retrieval Augmented Generation (RAG) is an approach that sends context to the LLM so that it can provide grounded answers. RAG apps can be developed on Azure using a wide range of programming languages and retrievers (such as AI Search, Cosmos DB, PostgreSQL, and Azure SQL). Get an overview of RAG in this session before diving deep in our follow-up streams.
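The RAG flow described above (retrieve relevant context, then send it to the LLM with the question) can be sketched without any external service. This is a minimal sketch under stated assumptions: the corpus is a toy example, retrieval uses simple word overlap instead of a real retriever such as AI Search or Cosmos DB, and the prompt is only built, not sent to a model.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, context_docs):
    """Assemble a prompt that asks the model to answer only from the context."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Toy corpus, made up for illustration.
documents = [
    "RAGHack is a global hackathon for building RAG apps.",
    "Azure Cosmos DB is a NoSQL database.",
    "dbt transforms data in the warehouse.",
]

question = "What is RAGHack?"
context_docs = retrieve(question, documents, top_k=1)
prompt = build_prompt(question, context_docs)
```

The resulting prompt string would then be passed to whichever LLM the application uses, so that the answer is grounded in the retrieved text.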
An introduction to RAGHack, a global hackathon to develop applications using large language models (LLMs) and RAG. RAG concepts will be presented, and a sample RAG application that you can start using today will be demonstrated.
Airflow is often used for running data pipelines, which themselves connect with other services through the provider system. However, it is also increasingly used as an engine under-the-hood for other projects building on top of the DAG primitive. For example, Cosmos is a framework for automatically transforming dbt DAGs into Airflow DAGs, so that users can supplement the developer experience of dbt with the power of Airflow. This session dives into how a select group of these frameworks (Cosmos, Meltano, Chronon) use Airflow as an engine for orchestrating complex workflows their systems depend on. In particular, we will discuss ways that we’ve increased Airflow performance to meet application-specific demands (high-task-count Cosmos DAGs, streaming jobs in Chronon), new Airflow features that will evolve how these frameworks use Airflow under the hood (DAG versioning, dataset integrations), and paths we see these projects taking over the next few years as Airflow grows. Airflow is not just a DAG platform, it’s an application platform!
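To make the idea of turning a dbt DAG into orchestrator tasks concrete, the sketch below reads a manifest-like dictionary, where each dbt node lists its upstream nodes under depends_on, and derives an execution order with the standard library. It only illustrates the mapping from dbt nodes to ordered tasks; it is not how Cosmos is implemented, and the project and model names are hypothetical.

```python
from graphlib import TopologicalSorter

# Minimal stand-in for the "nodes" section of a dbt manifest.json.
manifest = {
    "nodes": {
        "model.shop.stg_orders": {"depends_on": {"nodes": []}},
        "model.shop.stg_customers": {"depends_on": {"nodes": []}},
        "model.shop.orders": {
            "depends_on": {
                "nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]
            }
        },
        "model.shop.revenue": {"depends_on": {"nodes": ["model.shop.orders"]}},
    }
}


def task_edges(manifest):
    """Return (upstream, downstream) pairs, one per dbt dependency."""
    return [
        (upstream, node_id)
        for node_id, node in manifest["nodes"].items()
        for upstream in node["depends_on"]["nodes"]
    ]


def execution_order(manifest):
    """Return node IDs in an order where every node follows its upstream nodes."""
    graph = {
        node_id: set(node["depends_on"]["nodes"])
        for node_id, node in manifest["nodes"].items()
    }
    return list(TopologicalSorter(graph).static_order())


edges = task_edges(manifest)
order = execution_order(manifest)
```

In an orchestrator, each node ID would become one task and each edge one task dependency, which is what gives per-model retries and visibility.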
Balyasny Asset Management (BAM) is a diversified global investment firm founded in 2001 with over $20 billion in assets under management. As dbt took hold at BAM, we had multiple teams building dbt projects against Snowflake, Redshift, and SQL Server. The common question was: How can we quickly and easily productionise our projects? Airflow is the orchestrator of choice at BAM, but our dbt users ranged from Airflow power users to people who’d never heard of Airflow before. We built a single solution on top of Cosmos that allowed us to:
• Decouple the dbt project from the Airflow repository
• Have each dbt node run as a separate Airflow task
• Allow users to run dbt with little to no Airflow knowledge
• Enable users to have fine-grained control over how dbt is run and to combine it with other Airflow tasks
• Provide observability, monitoring, and alerting
The integration between dbt and Airflow is a popular topic in the community: in previous editions of Airflow Summit, at Coalesce, and in the #airflow-dbt Slack channel. Astronomer Cosmos (https://github.com/astronomer/astronomer-cosmos/) stands out as one of the libraries that strive to enhance this integration, with over 300k downloads per month. During its development, we’ve encountered various performance challenges in terms of scheduling and task execution. While we’ve managed to address some, others remain to be resolved. This talk describes how Cosmos works, the improvements made over the last 1.5 years, and the roadmap. It also aims to collect feedback from the community on how we can further improve the experience of running dbt in Airflow.