talk-data.com

Event

PyData London 2025

2025-06-06 – 2025-06-08 PyData

Activities tracked

104

Top Speakers

Ines Montani (2), Chris Fonnesbeck (2), Ian Ozsvald (2), Sam Joseph (2), Jacob Tomlinson (1), Katrina Riehl (1), Andy Terrel (1), Cheuk Ting Ho (1), Theodore Meynard (1), Tim Paine (1), Adam Hill (1), Chris Laffra (1)

Sessions & talks

Showing 51–75 of 104 · Newest first

Break

2025-06-07

talk

Keynote: From Next Token Prediction to Reasoning and Beyond

2025-06-07 Watch

talk

Jay Alammar (Cohere)

LLM

Large Language Models (LLMs) have risen to prominence as some of the most popular technological artifacts of the day. This talk will provide a highly accessible and visual overview of LLM concepts relevant to today's data professionals. This includes looking at present-day Transformer architectures, tokenizers, reward models, reasoning LLMs, agentic trajectories, and the various training stages of a large language model, including next-word prediction, instruction-tuning, preference-tuning, and reinforcement learning.

Diversity Scholar Luncheon

2025-06-07

talk

Lunch Break

2025-06-07

talk

Bringing stories to life with AI, data streaming and generative agents

2025-06-07

talk

Olena Kutsenko

AI/ML Flink Iceberg Kafka LLM Python

Explore how AI-powered Generative Agents can evolve in real time using live data streams. Inspired by Stanford's 'Generative Agents' paper, this session dives into building dynamic, AI-driven worlds with Apache Kafka, Flink, and Iceberg - plus LLMs, RAG, and Python. Demos and practical examples included!

Cutting Edge Football Analytics using Polars, Keras and Spektral

2025-06-07 Watch

talk

Joris Bekkers

Analytics Keras Polars

Football analytics has rapidly evolved over the past five years, becoming a crucial part of professional and fan discourse. While much of the cutting-edge research remains hidden behind the fences of club training grounds, a growing ecosystem of open-source tools now enables anyone to develop advanced football analytics models.

In this talk, I'll showcase key open-source libraries—Polars for high-performance data processing, Keras for deep learning, and Spektral for Graph Neural Networks (GNNs)—to analyze millions of player coordinates from publicly available high-frequency positional tracking data. I'll demonstrate how these tools can be used to build in-game prediction models and extract advanced football metrics that only the most advanced football clubs currently use.

Enhancing Fraud Detection with LLM-Generated Profiles: From Analyst Efficiency to Model Performance

2025-06-07

talk

Radion Bikmukhamedov

LLM NLP

This talk explores how leveraging Large Language Models (LLMs) to generate structured customer profile summaries improved both compliance analyst workflows and fraud scoring models at a financial institution. Attendees will learn how embeddings derived from LLM-generated narratives outperformed traditional manual feature engineering and raw text embeddings, offering insights into practical applications of NLP in fraud detection.

AI agents testing: How to evaluate the unpredictable

2025-06-07 Watch

talk

Emeli Dral

AI/ML LLM

AI agents and multi-step workflows are powerful, but testing them can be tricky. This talk explores practical ways to test these complex systems — like running multi-step simulations, checking tool calls, and using LLMs for evaluation. You'll also learn how to prioritize what to test and set up session-level evaluations with open-source tools.

How we unified feature engineering across data and backend at Monzo

2025-06-07 Watch

talk

Alex Jones

Data Streaming

A deep dive into how Monzo reduced the effort needed to generate point-in-time correct features for model development and productionise them with real-time streaming using our event-driven architecture.

Sovereign Data for AI with Python

2025-06-07 Watch

talk

Lex Avstreikh

AI/ML Cloud Computing LLM Python S3

The only certainty in life is that the pendulum will always swing. Recently, the pendulum has been swinging towards repatriation. However, the infrastructure needed to build and operate AI systems using Python in a sovereign (even air-gapped) environment has changed since the shift towards the cloud. This talk will introduce the infrastructure you need to build and deploy Python applications for AI, from data processing, through model training and LLM fine-tuning at scale, to inference at scale. We will focus on open-source infrastructure, including: a Python library server (PyPI, Conda, etc.) and avoiding supply chain attacks; a container registry that works at scale; an S3 storage layer; and a database server with a vector index.

Multi-Task Learning for Fraud detection: From Trees to MLPs

2025-06-07 Watch

talk

Callum Court

This talk will present Monzo's exploration of multi-task deep learning to enhance our real-time fraud detection systems. I will outline the challenges of card fraud detection, and explain the limitations of traditional gradient boosted decision tree models in terms of generalisation to rare fraud subtypes. This will motivate the use of multi-task learning, which leverages shared dense representations across fraud sub-tasks. By consolidating multiple specialist learners into a single model, we observe improved performance on less prevalent fraud types, leading to better generalisability, scalability, and robustness. I will also share results from testing multi-task models within our fraud detection infrastructure.

Parallel PyTorch Inference with Python Free-Threading

2025-06-07 Watch

talk

Michał Szołucha

Python PyTorch

This talk examines multi-threaded parallel inference on PyTorch models using the new No-GIL, free-threaded version of Python. Using a simple 124M-parameter GPT-2 model that we train from scratch, we explore the new territory unlocked by free-threaded Python: parallel PyTorch model inference, where multiple threads, unimpeded by the Python GIL, attempt to generate text from a transformer-based model in parallel.

PyMC Code Sprint

2025-06-07

talk

Chris Fonnesbeck

Join the PyMC development team for a fun and engaging hackathon!

Why you should stop pretending your sparse data is dense

2025-06-07 Watch

talk

Alex Owens

Arrow Data Science

Much real-world data has missing values, but historically prevalent data science tools have had limited support for such data. This talk will compare traditional numerical approaches, the more modern alternative Arrow, and ArcticDB, the client-side DataFrame database developed at Man Group.

Break

2025-06-07

talk

Opening Notes & Keynote: Keep Calm and Data On: Being a data science practitioner in the era of AI proliferation

2025-06-07

talk

Leanne Fitzpatrick

AI/ML Data Science

Since the end of 2022, the AI space has reached unprecedented velocity, scale and proliferation. When it seems like everyone (and their dog) is talking about AI, how should those of us who've been working in Machine Learning, Data Science (and AI) as domain experts look to navigate the conversation? In this talk, Leanne will aim to shine a light on the impact the AI arms race is having on our field, the reality of what it means to be a practitioner and some principles to stick by to help traverse what may appear to be a time of panic.

Registration & Breakfast

2025-06-07

talk
