talk-data.com

Event

PyData Amsterdam 2025

2025-09-24 – 2025-09-26 PyData

Activities tracked

5

Filtering by: RAG

Sessions & talks

Searching for My Next Chart

2025-09-26
talk

Abstract

As a data visualization practitioner, I frequently draw inspiration from the diverse and rapidly expanding community, particularly through challenges like #TidyTuesday. However, the sheer volume of remarkable visualizations quickly overwhelmed my manual curation methods—from Pinterest boards to Notion pages. This created a significant bottleneck in my workflow, as I found myself spending more time cataloging charts than actively creating them.

In this talk, I will present a retrieval system based on RAG (Retrieval-Augmented Generation) that I designed specifically for data visualizations. I will detail the methodology behind this system, illustrating how I addressed my own workflow inefficiencies by transforming a dispersed collection of charts into a semantically searchable knowledge base. This project serves as a practical example of applying advanced AI techniques to enhance creative technical work, demonstrating how a specialized retrieval system can significantly improve the efficiency and quality of the data visualization creation process.

Measure twice, deploy once: Evaluation of retrieval systems

2025-09-25
talk
RAG

Improving retrieval systems—especially in RAG pipelines—requires a clear understanding of what’s working and what isn’t. The only scalable way to do that is through meaningful metrics. In this talk, we share insights from building a platform-agnostic search and retrieval product, and how we balance performance against cost. Bigger models often give better results… but at what price? We explain how to assess what’s “good enough” and why the choice of benchmark really matters.
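
As a rough illustration of what such metrics can look like, the sketch below computes recall@k and mean reciprocal rank (MRR) over ranked results. The abstract does not say which metrics the speakers use; the function names and toy data here are assumptions made for illustration only.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: two queries, each with ranked results and gold labels.
runs = [
    (["d3", "d1", "d7"], ["d1", "d9"]),
    (["d2", "d5", "d4"], ["d2"]),
]
recall_q1 = recall_at_k(runs[0][0], runs[0][1], k=3)
mrr = mean_reciprocal_rank(runs)
```

Tracking a handful of such numbers on a fixed query set is one way to compare a larger model against a cheaper one before deciding what is "good enough".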

Context is King: Evaluating Long Context vs. RAG for Data Grounding

2025-09-25
talk

Grounding Large Language Models in your specific data is crucial, but notoriously challenging. Retrieval-Augmented Generation (RAG) is the common pattern, yet practical implementations are often brittle, suffering from poor retrieval, ineffective chunking, and context limitations, leading to inaccurate or irrelevant answers. The emergence of massive context windows (1M+ tokens) seems to offer a simpler path – just put all your data in the prompt! But does it truly solve the "needle in a haystack" problem, or introduce new challenges like prohibitive costs and information getting lost in the middle? This talk dives deep into the engineering realities. We'll dissect common RAG failure modes, explore techniques for building robust RAG systems (advanced retrieval, re-ranking, query transformations), and critically evaluate the practical viability, costs, and limitations of leveraging long context windows for complex data tasks in Python. Leave with an understanding of the real trade-offs so you can make informed architectural decisions for building reliable, data-grounded GenAI applications.
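
To make the cost question concrete, here is a back-of-the-envelope sketch comparing input tokens per query for the two approaches. Every number below (corpus size, chunk size, `top_k`) is a placeholder, and the price is left as a parameter; none of it comes from the talk. The sketch also ignores output tokens, prompt caching, and the cost of building a retrieval index.

```python
def prompt_tokens_long_context(corpus_tokens, question_tokens):
    """Long-context approach: every query sends the whole corpus plus the question."""
    return corpus_tokens + question_tokens


def prompt_tokens_rag(chunk_tokens, top_k, question_tokens):
    """RAG approach: every query sends only the top_k retrieved chunks plus the question."""
    return chunk_tokens * top_k + question_tokens


def cost_per_query(prompt_tokens, price_per_million_tokens):
    """Input-token cost of one query at a caller-supplied (assumed) price."""
    return prompt_tokens / 1_000_000 * price_per_million_tokens


# Placeholder numbers, not measurements or real prices.
corpus_tokens = 1_000_000
question_tokens = 50
long_ctx = prompt_tokens_long_context(corpus_tokens, question_tokens)
rag = prompt_tokens_rag(chunk_tokens=500, top_k=5, question_tokens=question_tokens)
ratio = long_ctx / rag  # how many times more input tokens the long-context prompt uses
```

Under these assumed numbers the long-context prompt carries a few hundred times more input tokens per query. Whether that is prohibitive depends on query volume, pricing, and answer quality, which is the trade-off the abstract raises.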

Grounding LLMs on Solid Knowledge: Assessing and Improving Knowledge Graph Quality in GraphRAG Applications

2025-09-24
talk

Graph-based Retrieval-Augmented Generation (GraphRAG) enhances large language models (LLMs) by grounding their responses in structured knowledge graphs, offering more accurate, domain-specific, and explainable outputs. However, many of the graphs used in these pipelines are automatically generated or loosely assembled, and often lack the semantic structure, consistency, and clarity required for reliable grounding. The result is misleading retrieval, vague or incomplete answers, and hallucinations that are difficult to trace or fix.

This hands-on tutorial introduces a practical approach to evaluating and improving knowledge graph quality in GraphRAG applications. We’ll explore common failure patterns, walk through real-world examples, and share a reusable checklist of features that make a graph “AI-ready.” Participants will learn methods for identifying gaps, inconsistencies, and modeling issues that prevent knowledge graphs from effectively supporting LLMs, and apply simple fixes to improve grounding and retrieval performance in their own projects.
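
As an example of the kind of check such a checklist might contain, the sketch below runs two simple quality checks over a graph stored as (subject, predicate, object) triples: labels that differ only by case or whitespace, and nodes with no type assertion. The triple layout, the `is_a` predicate, and both function names are hypothetical; the tutorial's actual checklist is not given in the abstract.

```python
from collections import defaultdict


def find_case_duplicates(triples):
    """Group node labels that differ only by case or surrounding whitespace."""
    groups = defaultdict(set)
    for subj, _pred, obj in triples:
        for label in (subj, obj):
            groups[label.strip().lower()].add(label)
    return {key: sorted(labels) for key, labels in groups.items() if len(labels) > 1}


def find_untyped_nodes(triples, type_predicate="is_a"):
    """Return instance nodes that never appear as the subject of a type triple."""
    typed = {s for s, p, _o in triples if p == type_predicate}
    nodes = set()
    for s, p, o in triples:
        nodes.add(s)
        if p != type_predicate:  # the object of a type triple is a class, not an instance
            nodes.add(o)
    return sorted(nodes - typed)


# Hypothetical toy graph with one inconsistent label ("amsterdam").
triples = [
    ("Amsterdam", "is_a", "City"),
    ("PyData Amsterdam", "located_in", "amsterdam"),
    ("PyData Amsterdam", "is_a", "Conference"),
]
duplicates = find_case_duplicates(triples)
untyped = find_untyped_nodes(triples)
```

Here both checks flag the same underlying problem: the lowercase `amsterdam` node is a near-duplicate of `Amsterdam` and carries no type, so a retriever following `located_in` would land on a node with no further context.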

Next-Level Retrieval in RAG: Techniques and Tools for Enhanced Performance

2025-09-24
talk
RAG

Retrieval-Augmented Generation (RAG) systems rely heavily on the quality of the retrieval process to generate accurate and contextually relevant outputs. In this 90-minute tutorial, we explore practical techniques to enhance retrieval across three key stages: pre-retrieval, mid-retrieval, and post-retrieval. Participants will learn how to optimize data preparation, query strategies, reranking, and evaluation to significantly improve the performance of RAG systems. A real-world case study will guide attendees through implementing these methods in a complete retrieval workflow.
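
As one illustration of a post-retrieval step, the sketch below merges two ranked result lists with reciprocal rank fusion (RRF), a widely used way to combine keyword and vector search results. The abstract does not say which methods the tutorial covers, so this is an assumed example; the document ids are made up, and `k=60` is a commonly used default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score, highest first; break ties by id for a deterministic order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from two retrievers for the same query.
keyword_hits = ["d1", "d2", "d3"]
vector_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that both retrievers return (`d1`, `d3`) rise above documents found by only one, without needing the two retrievers' raw scores to be on a comparable scale.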