AWS re:Invent 2025 - Architecting scalable and secure agentic AI with Bedrock AgentCore (AIM431)

2025-12-03 · AWS re:Invent 2025

video

Agile/Scrum AI/ML AWS Cloud Computing

Go deep into how AgentCore works under the hood. This technical deep-dive session breaks down the ReAct loop—how agents iteratively reason, plan, and perform tool calls to accomplish complex goals. Learn how context management, memory, and data grounding shape each reasoning step and response. Explore how AgentCore operationalizes this loop with modular services: Runtime for scalable execution, Gateway for dynamic tool and data access, Policy for deterministic controls, Observability for monitoring agent behavior, and Evaluations for continuous quality improvements. Understand how AgentCore’s architecture enables reliable, secure, and production-ready deployment of autonomous, data-driven AI agents.
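To make the ReAct loop described above concrete, here is a minimal sketch in Python. It is not AgentCore code: the tool registry, the `scripted_model` stand-in for an LLM, and the `Action: name[argument]` text format are all assumptions chosen for illustration.

```python
from typing import Callable

# Hypothetical tool registry; the names are illustrative, not AgentCore APIs.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def scripted_model(history: list[str]) -> str:
    """Stand-in for an LLM: call a tool once, then answer from the observation."""
    observations = [h for h in history if h.startswith("Observation: ")]
    if not observations:
        return "Action: add[2+3]"
    return "Final: " + observations[-1].removeprefix("Observation: ")


def react_loop(question: str, model: Callable[[list[str]], str], max_steps: int = 5) -> str:
    """Alternate between reasoning (model call) and acting (tool call)."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = model(history)  # reason: the model decides the next step
        history.append(step)
        if step.startswith("Final: "):
            return step.removeprefix("Final: ")
        # act: parse "Action: name[argument]" and run the named tool
        name, _, rest = step.removeprefix("Action: ").partition("[")
        observation = TOOLS[name](rest.rstrip("]"))
        # observe: feed the tool result back into the context
        history.append(f"Observation: {observation}")
    raise RuntimeError("no final answer within max_steps")


answer = react_loop("What is 2 + 3?", scripted_model)
```

The growing `history` list plays the role of the context that shapes each reasoning step; a production runtime would add memory, policy checks, and tracing around the same loop.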


Real-Time Context Engineering for Agents

2025-11-07 · PyData Seattle 2025

talk

by Jim Dowling

API Python RAG

Agents need timely and relevant context data to work effectively in an interactive environment. If an agent takes more than a few seconds to react to an action in a client application, users will not perceive it as intelligent, just laggy.

Real-time context engineering involves building real-time data pipelines to pre-process application data and serve relevant and timely context to agents. This talk will focus on how you can leverage application identifiers (user ID, session ID, article ID, order ID, etc.) to identify which real-time context data to provide to agents. We will contrast this approach with the more traditional RAG approach of using vector indexes to retrieve chunks of relevant text using the user query. Our approach will necessitate the introduction of the Agent-to-Agent protocol, an emerging standard for defining APIs for agents.
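The idea of selecting context by application identifier, rather than by similarity search over text chunks, can be sketched as follows. This is a toy in-memory example, not the Hopsworks API: `ContextStore`, `build_context`, and the entity names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ContextStore:
    """Toy in-memory stand-in for a feature store keyed by application IDs."""

    rows: dict[tuple[str, str], dict] = field(default_factory=dict)

    def write(self, entity: str, entity_id: str, features: dict) -> None:
        self.rows[(entity, entity_id)] = features

    def read(self, entity: str, entity_id: str) -> dict:
        # Exact-key lookup: no embedding or vector index involved.
        return self.rows.get((entity, entity_id), {})


def build_context(store: ContextStore, ids: dict[str, str]) -> str:
    """Render the features for each known ID as lines of prompt context."""
    lines = []
    for entity, entity_id in ids.items():
        for name, value in sorted(store.read(entity, entity_id).items()):
            lines.append(f"{entity}.{name}={value}")
    return "\n".join(lines)


store = ContextStore()
store.write("user", "u42", {"recent_views": 3, "country": "SE"})
store.write("order", "o7", {"status": "shipped"})

# The application already knows these IDs from the session, so no query parsing is needed.
context = build_context(store, {"user": "u42", "order": "o7"})
```

Because the lookup is an exact key match, its latency is predictable, which is the property the abstract emphasizes for interactive applications.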

We will also demonstrate how we provide real-time context data from applications inside Python agents using the Hopsworks feature store. We will walk through an example of an interactive application (TikTok clone).

ActiveTigger: A Collaborative Text Annotation Research Tool for Computational Social Sciences

2025-09-30 · PyData Paris 2025

talk

by Emilien SCHULTZ, Paul Girard, Julien Boelaert

AI/ML API Computer Science GenAI GitHub LLM NLP Python

The exponential growth of textual data—ranging from social media posts and digital news archives to speech-to-text transcripts—has opened new frontiers for research in the social sciences. Tasks such as stance detection, topic classification, and information extraction have become increasingly common. At the same time, the rapid evolution of Natural Language Processing, especially pretrained language models and generative AI, has largely been led by the computer science community, often leaving a gap in accessibility for social scientists.

To address this, we have been developing ActiveTigger since 2023: a lightweight, open-source Python application (with a web frontend in React) designed to accelerate the annotation process and manage large-scale datasets through the integration of fine-tuned models. It aims to support computational social science for a broad audience both within and outside the social sciences. Already used by a dynamic community in the social sciences, the stable version is planned for early June 2025.

From a more technical perspective, the API is designed to manage the complete workflow: project creation, embeddings computation, exploration of the text corpus, human annotation with active learning, fine-tuning of pre-trained models (BERT-like), prediction on a larger corpus, and export. It also integrates LLM-as-a-service capabilities for prompt-based annotation and information extraction, offering a flexible approach for hybrid manual/automatic labeling. Accessible both with a web frontend and a Python client, ActiveTigger encourages customization and adaptation to specific research contexts and practices.
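The "human annotation with active learning" step can be illustrated with a small sketch of uncertainty sampling, one common active-learning strategy. The abstract does not say which strategy ActiveTigger uses, so treat this as an assumption; the function names and the example probabilities are hypothetical.

```python
import math


def entropy(probs: list[float]) -> float:
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def next_to_annotate(predictions: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k unlabeled items the current model is least certain about."""
    ranked = sorted(predictions, key=lambda doc: entropy(predictions[doc]), reverse=True)
    return ranked[:k]


# Hypothetical model outputs for three unlabeled texts (two classes).
predictions = {
    "doc-a": [0.98, 0.02],
    "doc-b": [0.55, 0.45],
    "doc-c": [0.70, 0.30],
}
queue = next_to_annotate(predictions, k=2)
```

Annotators label the queued items first, the model is retrained, and the ranking is recomputed, so human effort goes to the examples that are expected to improve the model most.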

In this talk, we will delve into the motivations behind the creation of ActiveTigger, outline its technical architecture, and walk through its core functionalities. Drawing on several ongoing research projects within the Computational Social Science (CSS) group at CREST, we will illustrate concrete use cases where ActiveTigger has accelerated data annotation, enabled scalable workflows, and fostered collaborations. Beyond the technical demonstration, the talk will also open a broader reflection on the challenges and opportunities brought by generative AI in academic research—especially in terms of reliability, transparency, and methodological adaptation for qualitative and quantitative inquiries.

Project repository: https://github.com/emilienschultz/activetigger/

The development of this software is funded by the DRARI Ile-de-France and supported by Progédo.

Real-Time Context Engineering for LLMs

2025-09-26 · PyData Amsterdam 2025

talk

by Manu Joseph

AI/ML LLM Python

Context engineering has replaced prompt engineering as the main challenge in building agents and LLM applications. Context engineering involves providing LLMs with relevant and timely context data from various data sources, which allows them to make context-aware decisions. The context data provided to the LLM must be produced in real time to enable it to react intelligently at human-perceivable latencies (a second or two at most). If the application takes longer to react, humans will perceive it as laggy and unintelligent. In this talk, we will introduce context engineering and make the case for real-time context engineering in interactive applications. We will also demonstrate how to integrate real-time context data from applications inside Python agents using the Hopsworks feature store and corresponding application IDs. Application IDs are the key to unlocking application context data for agents and LLMs. We will walk through an example of an interactive application (TikTok clone) that we make AI-enabled with Hopsworks.

Building an AI Agent for Natural Language to SQL Query Execution on Live Databases

2025-09-03 · PyData Berlin 2025

talk

by Cainã Max Couto da Silva

AI/ML RAG SQL

This hands-on tutorial will guide participants through building an end-to-end AI agent that translates natural language questions into SQL queries, validates and executes them on live databases, and returns accurate responses. Participants will build a system that intelligently routes between a specialized SQL agent and a ReAct chat agent, implementing RAG for query similarity matching, comprehensive safety validation, and human-in-the-loop confirmation. By the end of this session, attendees will have created a powerful and extensible system they can adapt to their own data sources.
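The "validate, confirm, then execute" part of such a pipeline might look like the sketch below. It is not the tutorial's code: `is_safe_select`, `run_query`, and the `orders` table are hypothetical, and the prefix-based check is deliberately simplistic (for example, it rejects valid `WITH ...` queries and does not replace read-only database credentials).

```python
import sqlite3
from typing import Callable


def is_safe_select(sql: str) -> bool:
    """Accept only a single statement that starts with SELECT."""
    statement = sql.strip().rstrip(";")
    if ";" in statement:  # more than one statement
        return False
    return statement.lower().startswith("select")


def run_query(conn: sqlite3.Connection, sql: str, confirm: Callable[[str], bool]) -> list[tuple]:
    """Validate generated SQL, ask a human to confirm it, then execute it."""
    if not is_safe_select(sql):
        raise ValueError("only single SELECT statements are allowed")
    if not confirm(sql):  # human-in-the-loop gate
        raise PermissionError("query rejected by reviewer")
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 32.5)])

# Pretend the agent generated this SQL; the lambda stands in for a reviewer clicking "approve".
rows = run_query(conn, "SELECT COUNT(*), SUM(total) FROM orders;", confirm=lambda sql: True)
```

In a real system the `confirm` callback would surface the query to the user, and the validator would typically parse the SQL rather than inspect its prefix.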

Transforming Customer Processes and Gaining Productivity With Lakeflow Declarative Pipelines

2025-06-11 · Data + AI Summit 2025

talk

by Marcos Abrantes Gomes (Bradesco Bank), Ademir Francisquini Junior (Banco Bradesco S.A.)

CDP Databricks Marketing

Bradesco Bank is one of the largest private banks in Latin America, with over 75 million customers and over 80 years of presence in FSI. In the digital business, the speed of reaction to customer interactions is crucial to success. In the legacy landscape, acquiring data points on interactions over digital and marketing channels was complex, costly and lacking in integrity due to the typical fragmentation of tools. With the new in-house Customer Data Platform powered by Databricks Intelligent Platform, it was possible to completely transform the data strategy around customer data. Using key components such as Uniform and Lakeflow Declarative Pipelines, it was possible to increase data integrity, reduce latency and processing time and, most importantly, boost personal productivity and business agility. Months of reprocessing, weeks of human labor, and cumbersome, complex data integrations were dramatically simplified, achieving significant operational efficiency.

Metadata-Driven Streaming Ingestion Using Lakeflow Declarative Pipelines, Azure Event Hubs and a Schema Registry

2025-06-11 · Data + AI Summit 2025

talk

by Vicky Avison (Plexure)

Azure Data Engineering Marketing Data Streaming

At Plexure, we ingest hundreds of millions of customer activities and transactions into our data platform every day, fuelling our personalisation engine and providing insights into the effectiveness of marketing campaigns. We're on a journey to transition from infrequent batch ingestion to near real-time streaming using Azure Event Hubs and Lakeflow Declarative Pipelines. This transformation will allow us to react to customer behaviour as it happens, rather than hours or even days later. It also enables us to move faster in other ways. By leveraging a Schema Registry, we've created a metadata-driven framework that allows data producers to evolve schemas with confidence, ensuring downstream processes continue running smoothly, and to seamlessly publish new datasets into the data platform without requiring Data Engineering assistance. Join us to learn more about our journey and see how we're implementing this with Lakeflow Declarative Pipelines meta-programming, including a live demo of the end-to-end process!
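One way a schema registry can let producers "evolve schemas with confidence" is by rejecting changes that would break existing consumers. The sketch below shows a simplified backward-compatibility rule; the rule itself, the function name, and the field names are assumptions for illustration and do not describe Plexure's framework or any specific registry.

```python
def is_backward_compatible(
    old: dict[str, str], new: dict[str, str], optional_fields: set[str]
) -> bool:
    """Simplified rule: existing fields keep their type, added fields are optional."""
    for name, field_type in old.items():
        if new.get(name) != field_type:  # field removed or type changed
            return False
    added = set(new) - set(old)
    return added <= optional_fields


# Hypothetical schema versions, written as field name -> type name.
v1 = {"customer_id": "string", "amount": "double"}
v2 = {"customer_id": "string", "amount": "double", "currency": "string"}
v3 = {"customer_id": "string", "amount": "string"}

ok = is_backward_compatible(v1, v2, optional_fields={"currency"})
broken = is_backward_compatible(v1, v3, optional_fields=set())
```

A metadata-driven pipeline could run a check like this when a producer registers a new schema version and only then generate or update the corresponding ingestion table.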

Moody's AI Screening Agent: Automating Compliance Decisions

2025-06-10 · Data + AI Summit 2025

talk

by Nishant Gurunath (Moody's)

AI/ML LLM RAG

The AI Screening Agent automates the Level 1 (L1) screening process, essential for Know Your Customer (KYC) and compliance due diligence during customer onboarding. This system aims to minimize false positives, significantly reducing human review time and costs. Beyond typical Retrieval-Augmented Generation (RAG) applications like summarization and chat-with-your-data (CWYD), the AI Screening Agent employs a ReAct architecture with intelligent tools, enabling it to perform complex compliance decision-making with human-like accuracy and greater consistency. In this talk, I will explore the screening agent architecture, demonstrating its ability to meet evolving client policies. I will discuss evaluation and configuration management using MLflow LLM-as-judge and Unity Catalog, as well as challenges such as data fidelity and customization. This session underscores the transformative potential of AI agents in compliance workflows, emphasizing their adaptability, accuracy, and consistency.

Lakeflow Connect: The Game-Changer for Complex Event-Driven Architectures

2025-06-10 · Data + AI Summit 2025

talk

by Giancarlo Costa (European Food Safety Authority), Jeroen De Clercq (delaware), Tim Bal (delaware)

AI/ML Data Quality

In 2020, Delaware implemented a state-of-the-art, event-driven architecture for EFSA, enabling a highly decoupled system landscape; it was presented at the Data + AI Summit 2021. By centrally brokering events in near real-time, consumer applications react instantly to events from producer applications as they occur. Event producers are decoupled from consumers via a publisher/subscriber mechanism. Over the past years, we noticed some drawbacks. The processing of these custom events, primarily aimed at process integration, did not cover all edge cases; data quality was not always optimal due to missing events; and we needed to create complex logic for SCD2 tables. Lakeflow Connect allows us to extract the data directly from the source without the complex architecture in between, avoiding data loss and thus data quality issues, and with some simple adjustments an SCD2 table is created automatically. Lakeflow Connect allows us to create more efficient and intelligent data provisioning.
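For readers unfamiliar with SCD2 (slowly changing dimension, type 2), the hand-written logic the abstract refers to boils down to closing the current version of a record and appending a new one whenever a value changes. The sketch below illustrates that logic in plain Python; it is not Lakeflow Connect code, and the `Version` class, keys, and dates are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    key: str
    value: str
    valid_from: str
    valid_to: Optional[str] = None  # None marks the current version


def apply_change(history: list[Version], key: str, value: str, ts: str) -> None:
    """Apply one change event to an SCD2 history, keeping every past version."""
    current = next((v for v in history if v.key == key and v.valid_to is None), None)
    if current is not None:
        if current.value == value:
            return  # value unchanged: nothing to record
        current.valid_to = ts  # close the previous version
    history.append(Version(key, value, valid_from=ts))


history: list[Version] = []
apply_change(history, "supplier-1", "Parma", "2024-01-01")
apply_change(history, "supplier-1", "Parma", "2024-02-01")  # duplicate event, ignored
apply_change(history, "supplier-1", "Brussels", "2024-03-01")
```

A missed event in this scheme leaves a wrong `valid_to` boundary or a missing version, which is the kind of data quality issue the abstract attributes to the event-based approach.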

How We Made a Unified Talent Solution Using Databricks Machine Learning, Fine-Tuned LLM & Dolly 2.0

2023-07-26 · Databricks DATA + AI Summit 2023

video

by Nitu Nivedita

AI/ML Analytics Databricks DataOps Delta DevOps LLM Matplotlib NLP Plotly Power BI Data Streaming

Using Databricks, we built a “Unified Talent Solution” backed by a robust data and AI engine for analyzing skills of a combined pool of permanent employees, contractors, part-time employees and vendors, inferring skill gaps, future trends and recommended priority areas to bridge talent gaps, which ultimately greatly improved operational efficiency, transparency, commercial model, and talent experience of our client. We leveraged a variety of ML algorithms such as boosting, neural networks and NLP transformers to provide better AI-driven insights.

One inevitable part of developing these models within a typical DS workflow is iteration. Databricks' end-to-end ML/DS workflow service, MLflow, helped streamline this process by organizing iterations into experiments that tracked the data used for training/testing, model artifacts, lineage and the corresponding results/metrics. For checking the health of our models using drift detection, bias and explainability techniques, MLflow's deployment and monitoring services were leveraged extensively.

Our solution, built on the Databricks platform, simplified ML by defining a data-centric workflow that unified best practices from DevOps, DataOps, and ModelOps. Databricks Feature Store allowed us to productionize our models and features jointly. Insights were presented with visually appealing charts and graphs, built with Power BI, Plotly, and Matplotlib, that answer the business questions most relevant to clients. We built our own advanced custom analytics platform on top of Delta Lake, as Delta’s ACID guarantees allow us to build a real-time reporting app that displays consistent and reliable data. It uses React for the front end and Structured Streaming to ingest data from Delta tables, with live query analytics on real-time data and ML predictions based on the analytics data.

Talk by: Nitu Nivedita

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Introduction to Data Streaming on the Lakehouse

2023-07-25 · Databricks DATA + AI Summit 2023

video

by Zoe Durand , Yue Zhang

AI/ML Data Lakehouse Databricks ETL/ELT Data Streaming

Streaming is the future of all data pipelines and applications. It enables businesses to make data-driven decisions sooner and react faster, develop data-driven applications considered previously impossible, and deliver new and differentiated experiences to customers. However, many organizations have not realized the promise of streaming to its full potential because it requires them to completely redevelop their data pipelines and applications on new, complex, proprietary, and disjointed technology stacks.

The Databricks Lakehouse Platform is a simple, unified, and open platform that supports all streaming workloads, from ingestion and ETL to event processing, event-driven applications, and ML inference. In this session, we will discuss the streaming capabilities of the Databricks Lakehouse Platform and demonstrate how easy it is to build end-to-end, scalable streaming pipelines and applications to fulfill the promise of streaming for your business.

Talk by: Zoe Durand and Yue Zhang


Realize the Promise of Streaming with the Databricks Lakehouse Platform

2022-07-19 · Databricks DATA + AI Summit 2022

video

by Erica Lee (Upwork)

AI/ML Data Lakehouse Databricks ETL/ELT Data Streaming

Streaming is the future of all data pipelines and applications. It enables businesses to make data-driven decisions sooner and react faster, develop data-driven applications considered previously impossible, and deliver new and differentiated experiences to customers. However, many organizations have not realized the promise of streaming to its full potential because it requires them to completely redevelop their data pipelines and applications on new, complex, proprietary, and disjointed technology stacks.

The Databricks Lakehouse Platform is a simple, unified, and open platform that supports all streaming workloads, from ingestion and ETL to event processing, event-driven applications, and ML inference. In this session, we will discuss the streaming capabilities of the Lakehouse Platform and demonstrate how easy it is to build end-to-end, scalable streaming pipelines and applications to fulfill the promise of streaming for your business. You will also hear Erica Lee, VP of ML at Upwork, the world's largest Work Marketplace, share how the Upwork team uses Databricks to enable real-time predictions by computing ML features in a continuous streaming manner.


Towards a Modular Future: Reimagining and Rebuilding Kedro-viz for Visualizing Modular Pipelines

2022-07-19 · Databricks DATA + AI Summit 2022

video

AI/ML Data Science DataViz Databricks

Kedro is an open-source framework for creating portable pipelines through modular data science code. It provides a powerful interactive visualisation tool called ‘Kedro-Viz’, a webapp that automatically generates a rich, informative visualisation of the pipeline.

In 2020, the Kedro project introduced an important set of features to support Modular Pipelines, which allow users to set up a series of pipelines that are logically isolated and re-usable to form higher level pipelines.

With this paradigm shift comes the need to reimagine the visualization of the pipeline on Kedro-viz, in that it needs to introduce a series of redesigns and new features to support this new representation of pipeline structure.

As a core contributor to and team member of the Kedro-viz project throughout the past year, I have witnessed the journey of this transition through shipping the core features for modular pipelines on Kedro-viz.

This talk will focus on my experience as a front-end developer as I walk through the unique architecture and data ingestion setup for this project. I will deep-dive into the unique set of problems and assumptions we have to make in accommodating this new modular pipeline setup, and our approach to solving them within a front-end (React + Redux) context.

I will also share the mistakes and learnings along the way, and how they paved the path towards the app architecture choices for our next set of features in ML experiment tracking.

This talk is for the curious data practitioner who is up for exposure to a fresh set of problems beyond the typical data science domain, and for those who are up for a ride through the mind-boggling details of the unique setup of front-end development and data visualisation for data science.

