PyData London 2025

Scaling AI workloads with Ray & Airflow

2025-06-08 Watch

talk

Tatiana Al-Chueyr

AI/ML Airflow Astronomer GitHub LLM Python

Ray is an open-source framework for scaling Python applications, particularly machine learning and AI workloads. It provides the layer for parallel processing and distributed computing. Many large language models (LLMs), including OpenAI's GPT models, are trained using Ray.

Apache Airflow, for its part, is a well-established data orchestration framework that is downloaded more than 20 million times a month.

This talk presents the Airflow Ray provider package, which lets users interact with Ray from an Airflow workflow. I'll show how to use the package to create Ray clusters and how Airflow can trigger Ray pipelines in those clusters.

You Came to a Python Conference. Now, Go Do a PR Review!

2025-06-08 Watch

talk

Samiul Huque

Python

If you or your organization are spending time and resources attending a Python conference, you will want to ensure your team gets something immediately actionable and helpful out of it. As coders, we often think about writing code as the only way to contribute. However, pull request reviews are an often overlooked, but highly actionable way to have an impact.

Giving good PR reviews is an art, with two equally important parts: the technical side and the communication side. While the technical side ensures the quality, maintainability, and efficiency of the Python code, the communication around the PR determines whether the feedback can be understood and acted upon. However, we have all seen code reviews that have been ignored or executed poorly due to poor communication.

This talk addresses both facets of PR reviews by introducing the archetypes of bad code reviewers:
1) The “Looks Good to Me” Reviewer: This peer reviewer provides little to no actionable feedback.
2) The “Technical Nitpicker”: This peer reviewer focuses on small Python-specific issues, but fails to communicate constructively.
3) The “Nit” Commenter: This peer reviewer prefaces every comment with “nit,” while offering unclear, yet technically valid suggestions.

Using these archetypes, we will explore Python-specific technical topics (such as pass by reference vs. pass by value), while delving into how to communicate and deliver feedback in a clear and actionable manner. Using real-world examples, attendees will learn how to:
a) Identify and address technical issues in Python PRs
b) Communicate feedback effectively
c) Balance technical rigor with constructive feedback
d) Communicate their peer review comments clearly
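The pass by reference vs. pass by value topic mentioned above is the kind of issue a reviewer might flag in a Python PR. The sketch below is not taken from the talk; the function names are hypothetical and only illustrate that Python passes references to objects, so mutating an argument in place is visible to the caller while rebinding the parameter name is not.

```python
def append_item(items, item):
    # Mutates the list in place: the parameter refers to the caller's object.
    items.append(item)


def rebind_items(items, item):
    # Builds a new list and rebinds the local name only;
    # the caller's list is left unchanged.
    items = items + [item]
    return items


original = [1, 2]
append_item(original, 3)            # the caller's list now has three elements
rebound = rebind_items(original, 4)  # a new list; original is not modified
```

In a review, a constructive comment on `append_item` might ask whether the caller expects its list to change, rather than simply labelling the mutation as wrong.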

Agentic Cyber Defense with External Threat Intelligence

2025-06-08 Watch

talk

Jyoti Yadav

AI/ML Python Cyber Security

This talk will detail how to integrate external threat intelligence data into an autonomous agentic AI system for proactive cybersecurity. Using real-world datasets (including open-source threat feeds, security logs, or OSINT), you will learn how to build a data ingestion pipeline, train models with Python, and deploy agents that autonomously detect and mitigate cyber threats. This case study will provide practical insights into data preprocessing, feature engineering, and the challenges of adversarial conditions.

Debugging Leadership: Six Errors when Moving From Code to Management

2025-06-08

talk

Matt Upson

Python

Transitioning from a hands-on Pythonista to a leadership role is a journey filled with challenges, and like debugging code, it requires identifying, isolating, and fixing problems. In this talk, I’ll share six key lessons from my journey from Data Scientist to Co-Founder of a small software company, framed as Python errors.

From battling imposter syndrome (ValueError: self-worth not defined), to learning to delegate (DeadlockError: unable to release control), and avoiding burnout (RuntimeError: system overload), this talk offers actionable advice for anyone navigating the leap from technical contributor to technical leader.

Expect a mix of humour, relatable stories, and hard-won lessons as we explore how debugging leadership challenges is just as rewarding (and occasionally frustrating) as debugging code. Whether you’re considering a leadership role or already on the journey, this session will leave you with practical insights to navigate common pitfalls and approach a leadership transition with a clearer understanding of what to expect.

Diving into Transformer Model Internals

2025-06-08 Watch

talk

Matt Squire

AI/ML GenAI Python

While everybody and their dog is building applications on generative AI, the inner workings of transformers - the model architecture behind the genAI age - are a mystery for most people. In this talk, I'll walk through how transformers are implemented, using real-life Python code from the HuggingFace transformers library.
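As a rough illustration of what sits at the core of a transformer layer, here is a minimal pure-Python sketch of scaled dot-product attention. It is not the HuggingFace implementation the talk walks through: it assumes a single head, no learned projections, and no masking, and the function names are chosen here for illustration only.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values))
             for d in range(len(values[0]))]
        )
    return outputs


# Self-attention over three toy token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, tokens, tokens)
```

Production implementations operate on batched tensors and add multiple heads, projections, and masking, but the weighted-average structure is the same.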

Humble Data Workshop

2025-06-08

talk

Hugh Evans (Imply)

Data Science Python

Learn Python for Data Science in this Beginners’ Day Workshop

Would you like to learn to code but don’t know where to start? Taking your first steps in programming can seem like an impossible task, so we’ve decided to put on a workshop to show beginners how it can be done and share our passion for the world of data science!

Apply to be a student https://forms.gle/2cvNyRK8c8pNnpnz5

CUDA in Python: A New Era for GPU Acceleration

2025-06-08 Watch

talk

Andy Terrel

Python

We discuss bringing Python natively to the CUDA ecosystem. From low-level bindings to domain-specific applications, CUDA supports Python standards and the Python ecosystem. New libraries include nvmath-python for managing optimized mathematics libraries, cccl-python for cooperative threading and device parallelism, cuda-core for managing the complete CUDA toolstack from Python with no need for C++, and numba-cuda for generating device-side kernels with integration of C++ device libraries and LTO IR.

Git Commit, MedTech Transformed: Python’s Medical Robotics Breakthrough

2025-06-08

talk

Lilinoe Harbottle

Git Python

Code changing lives? Absolutely. We're diving into Python's power to deploy cutting-edge solutions for lung cancer diagnosis and treatment in medical and surgical robotics. Expect demos showcasing algorithms, data analysis, and real-world impact—bridging MedTech innovation and life-changing solutions. Ready to see Python revolutionize lung health? Join us. Let's code a healthier future together!

Python Engineering Excellence Birds of a Feather

2025-06-07

talk

Sam Joseph

Python

A round table discussion on how to excel at Python engineering and architecting systems using Python, what kind of sessions and activities would best help Python programmers be more effective at Python engineering, and how to achieve Python engineering excellence generally.

Conquering PDFs: document understanding beyond plain text

2025-06-07 Watch

talk

Ines Montani

Data Science NLP Python

NLP and data science could be so easy if all of our data came as clean and plain text. But in practice, a lot of it is hidden away in PDFs, Word documents, scans and other formats that have been a nightmare to work with. In this talk, I'll present a new and modular approach for building robust document understanding systems, using state-of-the-art models and the awesome Python ecosystem. I'll show you how you can go from PDFs to structured data and even build fully custom information extraction pipelines for your specific use case.

PyScript - Python in the Browser

2025-06-07 Watch

talk

Chris Laffra

Python

Learn how to write a web app in Python using PyScript, Pyodide, MicroPython, and WASM.

Tackling Data Challenges for Scaling Multi-Agent GenAI Apps with Python

2025-06-07 Watch

talk

Theo van Kraay

API Azure Cosmos GenAI LLM Python

The use of multiple Large Language Models (LLMs) working together to perform complex tasks, known as multi-agent systems, has gained significant traction. While orchestration frameworks like LangGraph and Semantic Kernel can streamline orchestration and coordination among agents, developing large-scale, production-grade systems can bring a host of data challenges. Issues such as supporting multi-tenancy, preserving transactional integrity and state, and managing reliable asynchronous function calls while scaling efficiently can be difficult to navigate.

Leveraging insights from practical experiences in the Azure Cosmos DB engineering team, this talk will guide you through key considerations and best practices for storing, managing, and leveraging data in multi-agent applications at any scale. You’ll learn how to understand core multi-agent concepts and architectures, manage statefulness and conversation histories, personalize agents through retrieval-augmented generation (RAG), and effectively integrate APIs and function calls.

Aimed at developers, architects, and data scientists at all skill levels, this session will show you how to take your multi-agent systems from the lab to full-scale production deployments, ready to solve real-world problems. We’ll also walk through code implementations that can be quickly and easily put into practice, all in Python.

Bringing stories to life with AI, data streaming and generative agents

2025-06-07

talk

Olena Kutsenko

AI/ML Flink Iceberg Kafka LLM Python

Explore how AI-powered Generative Agents can evolve in real time using live data streams. Inspired by Stanford's 'Generative Agents' paper, this session dives into building dynamic, AI-driven worlds with Apache Kafka, Flink, and Iceberg - plus LLMs, RAG, and Python. Demos and practical examples included!

Sovereign Data for AI with Python

2025-06-07 Watch

talk

Lex Avstreikh

AI/ML Cloud Computing LLM Python S3

The only certainty in life is that the pendulum will always swing. Recently, the pendulum has been swinging towards repatriation. However, the infrastructure needed to build and operate AI systems using Python in a sovereign (even air-gapped) environment has changed since the shift towards the cloud. This talk will introduce the infrastructure you need to build and deploy Python applications for AI, from data processing, to model training and LLM fine-tuning at scale, to inference at scale. We will focus on open-source infrastructure, including:
- a Python library server (PyPI, Conda, etc.) and avoiding supply chain attacks
- a container registry that works at scale
- an S3 storage layer
- a database server with a vector index

Parallel PyTorch Inference with Python Free-Threading

2025-06-07 Watch

talk

Michał Szołucha

Python PyTorch

This talk examines multi-threaded parallel inference on PyTorch models using the new No-GIL, free-threaded version of Python. Using a simple 124M parameter GPT2 model that we train from scratch, we explore the new territory unlocked by free-threaded Python: parallel PyTorch model inference, where multiple threads, unimpeded by the Python GIL, attempt to generate text from a transformer-based model in parallel.
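The threading pattern described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `fake_generate` is a hypothetical CPU-bound stand-in for a model's text generation call (no PyTorch is used here), and `gil_enabled` relies on `sys._is_gil_enabled()`, which exists on Python 3.13 and later. On a regular GIL build the threads still run correctly but will not execute Python bytecode in parallel.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def gil_enabled() -> bool:
    # sys._is_gil_enabled() is available on Python 3.13+;
    # older interpreters always run with the GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else check()


def fake_generate(prompt: str) -> str:
    # Hypothetical stand-in for model inference: some CPU-bound work.
    checksum = sum(ord(ch) * i for i in range(1, 10_000) for ch in prompt) % 997
    return f"{prompt} -> {checksum}"


prompts = ["hello", "world", "pydata"]

# One worker thread per prompt; pool.map preserves the input order.
with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
    results = list(pool.map(fake_generate, prompts))
```

Checking `gil_enabled()` before benchmarking makes it clear whether any speedup can be attributed to free-threading at all.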

Python Meets Quantum: Learn, Code, and Simulate

2025-06-06

talk

Andrea Melloncelli

Python

This workshop is designed for Python developers eager to explore the exciting world of quantum computing. Through interactive exercises and practical coding examples, participants will learn how to program quantum computers using Python. No advanced background in quantum mechanics is required - just curiosity and a willingness to dive into cutting-edge technology.

Forecasting Weather using Time Series ML

2025-06-06 Watch

talk

Suyash Joshi

AI/ML LLM Python

This hands-on workshop covers how to use open-source ML models, such as LSTMs and time series LLMs, with Python to try to forecast weather patterns, with best practices for data preparation and real-time predictions.

Package Your Python Code as a CLI

2025-06-06 Watch

talk

Jeroen Janssens, Thijs Nieuwdorp (VodafoneZiggo)

Data Science Python Unix

Learn how to transform your Python code into a command-line tool. Jeroen Janssens, author of Data Science at the Command Line, guides you through the process of turning your scripts into reusable, executable tools, integrating them into your data workflows and harnessing the power of the Unix command line.
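As a minimal sketch of the idea, the standard library's `argparse` is enough to wrap a function in a command-line interface. The tool name `wordcount` and the functions below are hypothetical and are not taken from the tutorial; the tool reads a file, or standard input when no path is given, so it can sit in a Unix pipeline.

```python
import argparse
import sys


def count_words(text: str) -> int:
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="wordcount",
        description="Count words in a file or on standard input.",
    )
    parser.add_argument(
        "path",
        nargs="?",
        default="-",
        help="file to read; '-' (the default) reads standard input",
    )
    return parser


def main(argv=None) -> int:
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    if args.path == "-":
        text = sys.stdin.read()
    else:
        with open(args.path, encoding="utf-8") as handle:
            text = handle.read()
    print(count_words(text))
    return 0
```

To make this executable, `main()` would typically be called under an `if __name__ == "__main__":` guard or registered as a console script entry point in the package metadata.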

GPU Accelerated Python

2025-06-06 Watch

talk

Lawrence Mitchell, Jeremy Tanner, Katrina Riehl, Jacob Tomlinson

Python

Accelerating Python using the GPU is much easier than you might think. We will explore the powerful CUDA-enabled Python ecosystem in this tutorial through hands-on examples using some of the most popular accelerated scientific computing libraries.

Topics include:
- Introduction to General Purpose GPU Computing
- GPU vs CPU
- Which processor is best for which tasks
- Introduction to CUDA
- How to use CUDA with Python
- Using Numba to write kernel functions
- CuPy
- cuDF

No prior experience with GPUs is necessary, but attendees should be familiar with Python.

Introduction to Bayesian Time Series Analysis with PyMC

2025-06-06

talk

Chris Fonnesbeck

Python

Time series data is ubiquitous, from stock market prices and weather patterns to disease outbreaks and sports outcomes. Accurately modeling these data and generating useful predictions requires specialized techniques due to the unique characteristics of time series data. This tutorial provides a practical introduction to Bayesian time series analysis using PyMC, a powerful probabilistic programming library in Python. Participants will learn how to build, evaluate, and interpret various Bayesian time series models, including ARIMA models, dynamic linear models, and stochastic volatility models. We'll emphasize practical application, covering data preprocessing, model selection, diagnostics, and forecasting, empowering attendees to tackle real-world time series problems with confidence.
