talk-data.com

Topic

RAG

Retrieval Augmented Generation (RAG)

ai · machine_learning · llm · 369 tagged

Activity Trend: 83 peak/qtr, 2020-Q1 to 2026-Q1

Activities

369 activities · Newest first

Context Engineering with DSPy

AI agents need the right context at the right time to do a good job. Too much input increases cost and harms accuracy, while too little causes instability and hallucinations. Context Engineering with DSPy introduces a practical, evaluation-driven way to design AI systems that remain reliable, predictable, and easy to maintain as they grow. AI engineer and educator Mike Taylor explains DSPy in a clear, approachable style, showing how its modular structure, portable programs, and built-in optimizers help teams move beyond guesswork. Through real examples and step-by-step guidance, you'll learn how DSPy's signatures, modules, datasets, and metrics work together to solve context engineering problems that evolve as models change and workloads scale. This book supports AI engineers, data scientists, machine learning practitioners, and software developers building AI agents, retrieval-augmented generation (RAG) systems, and multistep reasoning workflows that hold up in production.

- Understand the core ideas behind context engineering and why they matter
- Structure LLM pipelines with DSPy's maintainable, reusable components
- Apply evaluation-driven optimizers like GEPA and MIPROv2 for measurable improvements
- Create reproducible RAG and agentic workflows with clear metrics
- Develop AI systems that stay robust across providers, model updates, and real-world constraints

Generative AI on Microsoft Azure

Companies are now moving generative AI projects from the lab to production environments. To support these increasingly sophisticated applications, they're turning to advanced practices such as multiagent architectures and complex code-based frameworks. This practical handbook shows you how to leverage cutting-edge techniques using Microsoft's powerful ecosystem of tools to deploy trustworthy AI systems tailored to your organization's needs. Written for and by AI professionals, Generative AI on Microsoft Azure goes beyond the technical core aspects, examining underlying principles, tools, and practices in depth, from the art of prompt engineering to strategies for fine-tuning models to advanced techniques like retrieval-augmented generation (RAG) and agentic AI. Through real-world case studies and insights from top experts, you'll learn how to harness AI's full potential on Azure, paving the way for groundbreaking solutions and sustainable success in today's AI-driven landscape.

- Understand the technical foundations of generative AI and how the technology has evolved over the last few years
- Implement advanced GenAI applications using Microsoft services like Azure AI Foundry, Copilot, GitHub Models, Azure Databricks, and Snowflake on Azure
- Leverage patterns, tools, frameworks, and platforms to customize AI projects
- Manage, govern, and secure your AI-enabled systems with responsible AI practices
- Build upon expert guidance to avoid common pitfalls, future-proof your applications, and more

Organizations now have more options than ever for building effective RAG systems, and those options bring confusion. Many organizations are looking to capitalize on new innovations such as long context windows, knowledge graphs, reasoning models, multi-agent systems, and beyond. Attend this session to learn about seven challenges with RAG systems, their associated architectural choices, and best practices to improve their performance.

RAG has emerged as a powerful approach for building advanced AI systems that combine the strengths of large language models with external knowledge sources. However, RAG solutions struggle with reliability and require a lot of experimentation. This session will address key questions to help determine the best design pattern and optimization for RAG implementations.

GenAI solutions involve several choices and trade-offs. A critical decision is: should you build custom AI solutions in-house or buy off-the-shelf products? This session brings together a debate on the trade-offs, risks, and rewards of each approach. The session is built around scenarios and use cases that highlight key considerations such as cost, reliability, flexibility, and speed across decisions such as LLMs vs. SLMs, RAG vs. AI agents, packaged platform capability vs. bespoke custom solution, and packaged vs. open source.

In this talk, the speaker presents NuExtract, the first LLM specialized in extracting structured information (JSON output), and NuMarkdown, the first reasoning OCR LLM (RAG-ready Markdown output). The talk demonstrates low-hallucination open-source models that outclass frontier LLMs like GPT-5 and Gemini 2.5 while being orders of magnitude smaller, enabling private usage. It will demonstrate the abilities of these LLMs, show how to use them at scale, and discuss what’s coming next in information extraction.
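NuExtract's template-driven approach can be made concrete with a small sketch: the model is prompted with a JSON template whose fields it fills in, and the caller verifies the output conforms before trusting it. The check below is a deliberately simplified, flat-template illustration; the field names and validation rules are hypothetical, not NuExtract's actual schema handling:

```python
import json

# Hypothetical extraction template: the model fills in string values,
# leaving "" where nothing can be extracted from the document.
TEMPLATE = {"name": "", "date": "", "amount": ""}

def conforms(output_json: str, template: dict) -> bool:
    """Check that extracted JSON has exactly the template's fields and
    only string values -- a crude guard against hallucinated fields."""
    try:
        data = json.loads(output_json)
    except json.JSONDecodeError:
        return False
    return set(data) == set(template) and all(
        isinstance(v, str) for v in data.values()
    )

ok = '{"name": "ACME Corp", "date": "2024-03-01", "amount": "$1,200"}'
hallucinated = '{"name": "ACME Corp", "ceo": "John Doe"}'  # field not in template
```

Rejecting any output whose keys stray from the template is one simple way to keep extraction low-hallucination at scale.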

AWS re:Invent 2025 - Optimize agentic AI apps with semantic caching in Amazon ElastiCache (DAT451)

Multi-agent AI systems now orchestrate complex workflows requiring frequent foundation model calls. In this session, learn how semantic caching with vector search for Amazon ElastiCache for Valkey can reduce latencies in agentic AI applications from single-digit seconds to single-digit milliseconds, while also reducing the cost incurred from your foundation models for production workloads. By implementing semantic caching in agentic architectures like RAG-powered assistants and autonomous agents, customers can create performant and cost-effective production-scale agentic AI systems.
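The core mechanic of semantic caching can be sketched in a few lines: embed each prompt, and when a new prompt's embedding is close enough to a cached one, return the cached response instead of calling the foundation model. The toy bag-of-words embedding and in-memory list below stand in for a real embedding model and ElastiCache's vector search, and the 0.8 threshold is an arbitrary illustration:

```python
import math
from collections import Counter

def toy_embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries: list[tuple[Counter, str]] = []  # (embedding, response)

    def get(self, prompt: str):
        q = toy_embed(prompt)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[1]  # cache hit: skip the foundation-model call
        return None

    def put(self, prompt: str, response: str):
        self.entries.append((toy_embed(prompt), response))

cache = SemanticCache()
cache.put("what is the capital of france", "Paris")
hit = cache.get("what is the capital of france?")  # near-duplicate wording
```

A near-duplicate prompt hits the cache even though the strings differ, which is exactly what exact-match caching cannot do.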

Learn More: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2025 #AWS

Building a Data and AI Platform with PostgreSQL

In a world where data sovereignty, scalability, and AI innovation are at the forefront of enterprise strategy, PostgreSQL is emerging as the key to unlocking transformative business value. This new guide serves as your beacon for navigating the convergence of AI, open source technologies, and intelligent data platforms. Authors Tom Taulli, Benjamin Anderson, and Jozef de Vries offer a strategic and practical approach to building AI and data platforms that balance innovation with governance, empowering organizations to take control of their data future. Whether you're designing frameworks for advanced AI applications, modernizing legacy infrastructures, or solving data challenges at scale, you can use this guide to bridge the gap between technical complexity and actionable strategy. Written for IT executives, data leaders, and practitioners alike, it will equip you with the tools and insights to harness PostgreSQL's unique capabilities—extensibility, unstructured data management, and hybrid workloads—for long-term success in an AI-driven world.

- Learn how to build an AI and data platform using PostgreSQL
- Overcome data challenges like modernization, integration, and governance
- Optimize AI performance with model fine-tuning and retrieval-augmented generation (RAG) best practices
- Discover use cases that align data strategy with business goals
- Take charge of your data and AI future with this comprehensive and accessible roadmap

No Cloud? No Problem. Local RAG with Embedding Gemma

Running Retrieval-Augmented Generation (RAG) pipelines often feels tied to expensive cloud APIs or large GPU clusters—but it doesn’t have to be. This session explores how Embedding Gemma, Google’s lightweight open embedding model, enables powerful RAG and text classification workflows entirely on a local machine. Using the Sentence Transformers framework with Hugging Face, high-quality embeddings can be generated efficiently for retrieval and classification tasks. Real-world examples involving call transcripts and agent remark classification illustrate how robust results can be achieved without the cloud—or the budget.
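The retrieval half of such a pipeline reduces to: embed the documents, embed the query, rank by similarity. The sketch below uses a toy term-frequency embedding so it runs offline with the standard library alone; the session would use EmbeddingGemma through the Sentence Transformers framework for the `embed` step instead:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy term-frequency "embedding" so the sketch runs offline; a real
    # pipeline would call a Sentence Transformers model's encode() here.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Core RAG retrieval step: rank documents by similarity to the
    # query and return the top k to feed into the generator's prompt.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

# Illustrative call-transcript snippets, echoing the session's examples.
transcripts = [
    "caller asked the agent to reset her account password",
    "agent escalated a billing dispute to a supervisor",
    "customer praised the support team for a quick fix",
]
top = retrieve("help with password reset", transcripts, k=1)
```

Swapping the toy `embed` for a real local embedding model is the only change needed to make this a working offline retriever.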

Building Production RAG Systems for Health Care Domains: Clinical Decision

Building on but moving far beyond the single-specialty focus of HandRAG, this session examines how Retrieval-Augmented Generation can be engineered to support clinical reasoning across multiple high-stakes surgical areas, including orthopedic, cardiovascular, neurosurgical, and plastic surgery domains. Using a corpus of more than 7,800 clinical publications and cross-specialty validation studies, the talk highlights practical methods for structuring heterogeneous medical data, optimizing vector retrieval with up to 35% latency gains, and designing prompts that preserve terminology accuracy across diverse subspecialties. Attendees will also learn a three-tier evaluation framework that improved critical-error detection by 2.4×, as well as deployment strategies such as automated literature refresh pipelines and cost-efficient architectures that reduced inference spending by 60%, enabling RAG systems to operate reliably in real production healthcare settings.

The Boringly Simple Loop Powering GenAI Apps

Do you feel lost in the jungle of GenAI frameworks and buzzwords? Here's a way out. Take any GenAI app, peel away the fluff, and look at its core. You'll find the same pattern: a boringly simple nested while loop. I will show you how this loop produces chat assistants, AI agents, and multi-agent systems. Then we'll cover how RAG, tool-calling, and memory are like lego bricks we add as needed. This gives you a first-principles based map. Use it to build GenAI apps from scratch; no frameworks needed.
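That nested loop can be sketched end to end: an outer loop over user turns, and an inner loop that lets the model request tools until it produces a final answer. The stub model and clock tool below are hypothetical placeholders for a real chat-completion API and real tools:

```python
def fake_llm(messages):
    # Stub standing in for a real model call. It "decides" to use a tool
    # once, then answers; a real app would call a chat API here.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "clock", "args": {}}
    return {"answer": "It is 12:00."}

def clock_tool():
    return "12:00"

TOOLS = {"clock": clock_tool}

def agent_turn(messages):
    # Inner loop: keep calling the model, executing any tool it
    # requests, until it returns a final answer.
    while True:
        reply = fake_llm(messages)
        if "answer" in reply:
            return reply["answer"]
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})

def chat():
    # Outer loop: one iteration per user message.
    messages = []
    for user_input in ["what time is it?"]:  # stands in for reading stdin
        messages.append({"role": "user", "content": user_input})
        answer = agent_turn(messages)
        messages.append({"role": "assistant", "content": answer})
    return messages
```

RAG, memory, and multi-agent handoff all slot into this same skeleton: RAG prepends retrieved context to `messages`, memory persists them between turns, and a "tool" can itself be another agent.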

Keynote by Lisa Amini: What’s Next in AI for Data and Data Management?

Advances in large language models (LLMs) have propelled a recent flurry of AI tools for data management and operations. For example, AI-powered code assistants leverage LLMs to generate code for dataflow pipelines. RAG pipelines enable LLMs to ground responses with relevant information from external data sources. Data agents leverage LLMs to turn natural language questions into data-driven answers and actions. While challenges remain, these advances are opening exciting new opportunities for data scientists and engineers. In this talk, we will examine recent advances, along with some still incubating in research labs, with the goal of understanding where this is all heading, and present our perspective on what’s next for AI in data management and data operations.

Where Have All the Metrics Gone?

How exactly does one validate the factuality of answers from a Retrieval-Augmented Generation (RAG) system? Or measure the impact of the new system prompt for your customer service agent? What do you do when stakeholders keep asking for "accuracy" metrics that you simply don't have? In this talk, we’ll learn how to define (and measure) what “good” looks like when traditional model metrics don’t apply.
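One practical move when stakeholders ask for "accuracy" you don't have is to define a computable proxy instead. The function below is one illustrative groundedness proxy (not the talk's own method): the fraction of answer sentences that share enough content words with the retrieved context:

```python
import re

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z']+", text.lower()))

def grounded_fraction(answer: str, context: str, min_overlap: int = 2) -> float:
    """Fraction of answer sentences sharing at least `min_overlap`
    words with the retrieved context -- a crude groundedness proxy
    for spotting answers the retrieval step cannot support."""
    ctx = words(context)
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0
    grounded = sum(1 for s in sentences if len(words(s) & ctx) >= min_overlap)
    return grounded / len(sentences)

context = "The warranty covers parts and labor for two years."
good = "The warranty covers parts and labor. It lasts two years."
bad = "Refunds are processed within thirty days."
```

Word overlap is deliberately simple; the point is that once "good" is defined as a number, you can track it across prompt changes, which is exactly what stakeholders asking for "accuracy" usually need.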

Scaling Python to thousands of nodes with Ray

Python is the language of choice for anything to do with AI and ML. While that has made it easy to write code for one machine, it's much more difficult to run workloads across clusters of thousands of nodes. Ray lets you do just that. I'll demonstrate how to use this open source tool with just a few lines of code. As a demo project, I'll show how I built a RAG system for the Wheel of Time series.
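Ray's task API (`@ray.remote` to declare a task, `.remote()` to launch it, `ray.get()` to collect results) follows a scatter-gather shape. The standard-library sketch below shows that shape with a thread pool standing in for Ray's cluster scheduler; the chunking and word-count task are illustrative stand-ins for real per-chunk embedding work:

```python
from concurrent.futures import ThreadPoolExecutor

def embed_chunk(chunk: str) -> tuple[str, int]:
    # Stand-in for per-chunk work (e.g., embedding a book passage);
    # under Ray this would be an @ray.remote task scheduled on the cluster.
    return (chunk, len(chunk.split()))

chunks = ["the wheel of time turns", "and ages come and pass"]

# Scatter the chunks, gather the results -- the same shape as
# ray.get([embed_chunk.remote(c) for c in chunks]) on a Ray cluster.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(embed_chunk, chunks))
```

The appeal of Ray is that the same few lines scale from a laptop thread pool to thousands of nodes without restructuring the program.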

Building Agentic AI: Workflows, Fine-Tuning, Optimization, and Deployment

Transform Your Business with Intelligent AI to Drive Outcomes

Building reactive AI applications and chatbots is no longer enough. The competitive advantage belongs to those who can build AI that can respond, reason, plan, and execute. Building Agentic AI: Workflows, Fine-Tuning, Optimization, and Deployment takes you beyond basic chatbots to create fully functional, autonomous agents that automate real workflows, enhance human decision-making, and drive measurable business outcomes across high-impact domains like customer support, finance, and research. Whether you're a developer deploying your first model, a data scientist exploring multi-agent systems and distilled LLMs, or a product manager integrating AI workflows and embedding models, this practical handbook provides tried and tested blueprints for building production-ready systems. Harness the power of reasoning models for applications like computer use, multimodal systems to work with all kinds of data, and fine-tuning techniques to get the most out of AI. Learn to test, monitor, and optimize agentic systems to keep them reliable and cost-effective at enterprise scale.

- Master the complete agentic AI pipeline
- Design adaptive AI agents with memory, tool use, and collaborative reasoning capabilities
- Build robust RAG workflows using embeddings, vector databases, and LangGraph state management
- Implement comprehensive evaluation frameworks beyond accuracy, including precision, recall, and latency metrics
- Deploy multimodal AI systems that seamlessly integrate text, vision, audio, and code generation
- Optimize models for production through fine-tuning, quantization, and speculative decoding techniques
- Navigate the bleeding edge of reasoning LLMs and computer-use capabilities
- Balance cost, speed, accuracy, and privacy in real-world deployment scenarios
- Create hybrid architectures that combine multiple agents for complex enterprise applications

Register your book for convenient access to downloads, updates, and/or corrections as they become available. See inside book for details.

AWS re:Invent 2025 - A practitioner’s guide to data for agentic AI (DAT315)

In this session, gain the skills needed to deploy end-to-end agentic AI applications using your most valuable data. This session focuses on data management using processes like Model Context Protocol (MCP) and Retrieval Augmented Generation (RAG), and provides concepts that apply to other methods of customizing agentic AI applications. Discover best-practice architectures using AWS database services like Amazon Aurora and OpenSearch Service, along with the analytical, data processing, and streaming experiences found in SageMaker Unified Studio. Learn data lake, governance, and data quality concepts, and see how Amazon Bedrock AgentCore, Bedrock Knowledge Bases, and other features tie solution components together.
