talk-data.com

Event

PyData Berlin 2025

2025-09-01 – 2025-09-03 · PyData

Activities tracked

10

Filtering by: LLM

Sessions & talks

Showing 1–10 of 10 · Newest first

From Manual to LLMs: Scaling Product Categorization

2025-09-02
talk

How can you use LLMs to categorize hundreds of thousands of products into 1,000 categories at scale? Learn about our journey from manual and rule-based methods, via fine-tuned semantic models, to a robust multi-step process that uses embeddings and LLMs via the OpenAI APIs. This talk offers data scientists and AI practitioners lessons and best practices for putting such a complex LLM-based system into production, including prompt development, balancing cost against accuracy via model selection, testing multi-case versus single-case prompts, and saving costs with the OpenAI Batch API and a smart early-stopping approach. We also describe our automation and monitoring in a PySpark environment.
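The two-stage approach the abstract describes lends itself to a compact sketch: embeddings shortlist candidate categories, then an LLM makes the final pick. This is a minimal illustration assuming the openai Python SDK; the model names, the tiny category list, and the prompt are placeholders, not the speakers' production configuration.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()
# Stand-in for the ~1,000 real categories mentioned in the talk.
CATEGORIES = ["Kitchen > Cookware", "Garden > Tools", "Toys > Puzzles"]

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

category_vecs = embed(CATEGORIES)

def categorize(product_title: str, shortlist_size: int = 3) -> str:
    # Stage 1: embeddings narrow the full category list to a short list.
    q = embed([product_title])[0]
    sims = category_vecs @ q / (np.linalg.norm(category_vecs, axis=1) * np.linalg.norm(q))
    shortlist = [CATEGORIES[i] for i in np.argsort(-sims)[:shortlist_size]]
    # Stage 2: an LLM makes the final call among the shortlisted candidates.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Product: {product_title}\n"
                       f"Pick the best category from {shortlist}. Answer with the category only.",
        }],
    )
    return resp.choices[0].message.content.strip()
```

For bulk jobs, the same per-product request can be written to a JSONL file and submitted through the OpenAI Batch API, the cost-saving route the abstract mentions.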

Navigating healthcare scientific knowledge: building AI agents for accurate biomedical data retrieval

2025-09-02
talk

With a focus on healthcare applications where accuracy is non-negotiable, this talk highlights challenges and delivers practical insights on building AI agents that query complex biological and scientific data to answer sophisticated questions. Drawing from our experience developing Owkin-K Navigator, a free-to-use AI co-pilot for biological research, I'll share hard-won lessons about combining natural language processing with SQL querying and vector database retrieval to navigate large biomedical knowledge sources, addressing the challenges of preventing hallucinations and ensuring proper source attribution. This session is ideal for data scientists, ML engineers, and anyone interested in applying Python and the LLM ecosystem to the healthcare domain.
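The routing-plus-attribution pattern the abstract describes can be sketched generically: the agent picks between SQL querying and vector retrieval, and every answer either carries its sources or refuses. Everything below is hypothetical scaffolding for illustration; it does not reflect Owkin-K Navigator's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    sources: list[str]  # source attribution is mandatory, never optional

def run_sql(question: str) -> tuple[list, str]:
    return [("BRCA1", 1234)], "gene_expression"  # stubbed NL-to-SQL result

def vector_search(question: str, k: int = 5) -> list[dict]:
    return []  # stubbed retrieval; nothing is indexed in this sketch

def route(question: str) -> str:
    # In a real system an LLM classifier picks the tool; a keyword rule stands in here.
    return "sql" if any(w in question.lower() for w in ("how many", "count", "average")) else "vector"

def answer(question: str) -> Answer:
    if route(question) == "sql":
        rows, table = run_sql(question)
        return Answer(text=str(rows), sources=[f"table:{table}"])
    docs = vector_search(question)
    if not docs:
        # Refusing beats hallucinating when accuracy is non-negotiable.
        return Answer(text="No supporting source found.", sources=[])
    return Answer(text=" ".join(d["text"] for d in docs), sources=[d["id"] for d in docs])
```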

Beyond Benchmarks: Practical Evaluation Strategies for Compound AI Systems

2025-09-02
talk

Evaluating large language models (LLMs) in real-world applications goes far beyond standard benchmarks. When LLMs are embedded in complex pipelines, choosing the right models, prompts, and parameters becomes an ongoing challenge.

In this talk, we will present a practical, human-in-the-loop evaluation framework that enables systematic improvement of LLM-powered systems based on expert feedback. By combining domain expert insights and automated evaluation methods, it is possible to iteratively refine these systems while building transparency and trust.

This talk will be valuable for anyone who wants to ensure their LLM applications can handle real-world complexity, not just perform well on generic benchmarks.
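One way to picture such a human-in-the-loop framework is a triage loop: automated checks score every output, and only low-confidence cases go to domain experts, whose labels feed the next iteration. A minimal sketch under that assumption; the scoring rule and threshold are illustrative, not the speakers' framework.

```python
def automated_score(output: str, reference: str) -> float:
    # Stand-in for an automated evaluator; real systems might use an LLM
    # judge or task-specific checks instead of token overlap.
    ref, out = set(reference.lower().split()), set(output.lower().split())
    return len(ref & out) / max(len(ref), 1)

def triage(cases: list[dict], threshold: float = 0.6):
    auto_pass, expert_queue = [], []
    for case in cases:
        score = automated_score(case["output"], case["reference"])
        (auto_pass if score >= threshold else expert_queue).append((case, score))
    # Expert verdicts on the queued cases become labels for the next round,
    # supporting the iterative refinement the abstract describes.
    return auto_pass, expert_queue
```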

How We Automate Chaos: Agentic AI and Community Ops at PyCon DE & PyData

2025-09-02
talk

Using AI agents and automation, PyCon DE & PyData volunteers have transformed chaos into streamlined conference ops. From YAML files to LLM-powered assistants, they automate speaker logistics, FAQs, video processing, and more while keeping humans focused on creativity. This case study reveals practical lessons on making AI work in real-world scenarios: structured workflows, validation, and clear context beat hype. Live demos and open-source tools included.
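As a flavor of the "structured workflows and validation" point, speaker records kept in YAML can be schema-checked before any LLM-powered automation consumes them. A small sketch assuming pydantic and PyYAML; the schema fields are hypothetical, not the actual PyCon DE files.

```python
import yaml
from pydantic import BaseModel, ValidationError

class Speaker(BaseModel):
    name: str
    email: str
    talk_title: str
    arrival_day: str | None = None

RAW = """
- name: Ada Example
  email: ada@example.org
  talk_title: Automating Chaos
"""

records, failures = [], []
for item in yaml.safe_load(RAW):
    try:
        records.append(Speaker.model_validate(item))
    except ValidationError as err:
        failures.append((item, err))  # humans review failures; the assistant only sees clean data
```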

One API to Rule Them All? LiteLLM in Production

2025-09-02
talk

Using LiteLLM in a Real-World RAG System: What Worked and What Didn’t

LiteLLM provides a unified interface to work with multiple LLM providers—but how well does it hold up in practice? In this talk, I’ll share how we used LiteLLM in a production system to simplify model access and handle token budgets. I’ll outline the benefits, the hidden trade-offs, and the situations where the abstraction helped—or got in the way. This is a practical, developer-focused session on integrating LiteLLM into real workflows, including lessons learned and limitations. If you’re considering LiteLLM, this talk offers a grounded look at using it beyond simple prototypes.
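For context, LiteLLM's core appeal is that one call signature covers many providers. A minimal sketch assuming the litellm package and API keys in the environment; the model names are examples.

```python
from litellm import completion

messages = [{"role": "user", "content": "Summarize this document in one line."}]

# The call shape stays the same across providers; only the model string changes.
for model in ("gpt-4o-mini", "claude-3-haiku-20240307"):
    resp = completion(model=model, messages=messages, max_tokens=100)
    print(model, "->", resp.choices[0].message.content)
```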

Most AI Agents Are Useless. Let’s Fix That

2025-09-02
talk

AI agents are having a moment, but most of them are little more than fragile prototypes that break under pressure. Together, we’ll explore why so many agentic systems fail in practice, and how to fix that with real engineering principles. In this talk, you’ll learn how to build agents that are modular, observable, and ready for production. If you’re tired of LLM demos that don’t deliver, this talk is your blueprint for building agents that actually work.
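To make "modular and observable" concrete, one common pattern registers tools as plain functions and logs every step so failures can be traced. A generic sketch of that pattern, not the speaker's code.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

# Tools are plain, swappable functions: the modular part.
TOOLS = {
    "search": lambda q: f"results for {q!r}",  # stubbed tool
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def run_agent(plan: list[tuple[str, str]]) -> list[str]:
    """Execute (tool, argument) steps, logging each one: the observable part."""
    outputs = []
    for step, (tool, arg) in enumerate(plan):
        if tool not in TOOLS:
            log.error("step %d: unknown tool %r, aborting", step, tool)
            break
        result = TOOLS[tool](arg)
        log.info("step %d: %s(%r) -> %r", step, tool, arg, result)
        outputs.append(result)
    return outputs

run_agent([("calculator", "2 + 2"), ("search", "PyData Berlin")])
```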

Probably Fun: Games to teach Machine Learning

2025-09-02
talk

In this tutorial, you will play several games that can be used to teach machine learning concepts. Each game can be played in both large and small groups. Some involve hands-on material such as cards; others involve an electronic app. Every game illustrates one or more concepts from Machine Learning.

As an outcome, you will take away multiple ideas that make complex topics more understandable and enjoyable. In doing so, we would like to demonstrate that Machine Learning does not require computers: the core ideas can be illustrated in a clear and memorable way without them. We also want to show that gamification is not limited to online quiz questions but also offers ways for learners to bond.

We will bring a set of carefully selected games that have proven themselves in large classroom settings and contain useful abstractions of linear models, decision trees, LLMs, and several other Machine Learning concepts. We also believe that it is probably fun to participate in this tutorial.

Training Specialized Language Models with Less Data: An End-to-End Practical Guide

2025-09-02
talk

Small Language Models (SLMs) offer an efficient and cost-effective alternative to LLMs—especially when latency, privacy, inference costs or deployment constraints matter. However, training them typically requires large labeled datasets and is time-consuming, even if it isn't your first rodeo.

This talk presents an end-to-end approach for curating high-quality synthetic data using LLMs to train domain-specific SLMs. Using a real-world use case, we’ll demonstrate how to reduce manual labeling time, cut costs, and maintain performance—making SLMs viable for production applications.

Whether you are a seasoned Machine Learning Engineer or just getting started with building AI features, you will come away with inspiration to build more performant, secure, and environmentally friendly AI systems.
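The synthetic-data step the abstract describes can be pictured simply: a large model labels raw domain text, and the result becomes training data for the small model. A minimal sketch assuming the openai SDK; the task, labels, and model name are illustrative.

```python
import json
from openai import OpenAI

client = OpenAI()
LABELS = ["billing", "shipping", "returns"]  # hypothetical domain labels

def synth_label(text: str) -> str:
    # The large model does the expensive labeling once, offline.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Classify this support message as one of {LABELS}. "
                       f"Answer with the label only.\n\n{text}",
        }],
    )
    return resp.choices[0].message.content.strip()

unlabeled = ["Where is my package?", "I was charged twice."]
with open("train.jsonl", "w") as f:
    for text in unlabeled:
        f.write(json.dumps({"text": text, "label": synth_label(text)}) + "\n")
# train.jsonl then feeds a standard SLM fine-tune, e.g. a transformers Trainer run.
```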

Benchmarking 2000+ Cloud Servers for GBM Model Training and LLM Inference Speed

2025-09-01
talk

Spare Cores is a Python-based, open-source, and vendor-independent ecosystem collecting, generating, and standardizing comprehensive data on cloud server pricing and performance. In our latest project, we started 2,000+ server types across five cloud vendors to evaluate their suitability for serving Large Language Models from 135M to 70B parameters. We tested how efficiently models can be loaded into memory or VRAM, and measured inference speed across varying token lengths for prompt processing and text generation. The published data can help you find the optimal instance type for your LLM serving needs; we will also share our experiences and challenges with the data collection, along with insights into general patterns.
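The core measurement behind such benchmarks reduces to timing generation and dividing by tokens produced. A rough sketch, assuming an OpenAI-compatible inference server (for example a local vLLM instance); the URL and model id are placeholders.

```python
import time
import requests

URL = "http://localhost:8000/v1/completions"  # placeholder endpoint

def tokens_per_second(prompt: str, max_tokens: int = 256) -> float:
    start = time.perf_counter()
    resp = requests.post(URL, json={
        "model": "placeholder-model-id",
        "prompt": prompt,
        "max_tokens": max_tokens,
    }).json()
    elapsed = time.perf_counter() - start
    return resp["usage"]["completion_tokens"] / elapsed

# Repeating this across prompt lengths, model sizes, and instance types
# yields throughput curves like the ones the talk reports.
```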

Automating Content Creation with LLMs: A Journey from Manual to AI-Driven Excellence

2025-09-01
talk

In the fast-paced realm of travel experiences, GetYourGuide encountered the challenge of maintaining consistent, high-quality content across its global marketplace. Manual content creation by suppliers often resulted in inconsistencies and errors, negatively impacting conversion rates. To address this, we leveraged large language models (LLMs) to automate content generation, ensuring uniformity and accuracy. This talk will explore our innovative approach, including the development of fine-tuned models for generating key text sections and the use of Function Calling GPT API for structured data. A pivotal aspect of our solution was the creation of an LLM evaluator to detect and correct hallucinations, thereby improving factual accuracy. Through A/B testing, we demonstrated that AI-driven content led to fewer defects and increased bookings. Attendees will gain insights into training data refinement, prompt engineering, and deploying AI at scale, offering valuable lessons for automating content creation across industries.
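As an illustration of the structured-data piece, OpenAI-style function calling can force the model to return named fields instead of free text, the kind of mechanism the abstract alludes to. A minimal sketch assuming the openai SDK; the schema fields are invented for illustration, not GetYourGuide's actual sections.

```python
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "submit_listing_sections",
        "description": "Return generated content sections for a listing.",
        "parameters": {
            "type": "object",
            "properties": {
                "highlights": {"type": "array", "items": {"type": "string"}},
                "description": {"type": "string"},
            },
            "required": ["highlights", "description"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write listing sections for a Berlin walking tour."}],
    tools=tools,
    # Forcing the tool call guarantees structured output for downstream checks,
    # such as the hallucination-detecting evaluator the abstract describes.
    tool_choice={"type": "function", "function": {"name": "submit_listing_sections"}},
)
sections = json.loads(resp.choices[0].message.tool_calls[0].function.arguments)
```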