talk-data.com

Event

PyData London 2025

2025-06-06 – 2025-06-08 PyData

Activities tracked

104

Sessions & talks

Polars, DuckDB, PySpark, PyArrow, pandas, cuDF: how Narwhals has brought them all together!

2025-06-08
talk

Suppose you want to write a data science tool to do feature engineering. Your experience may go like this:

- Expectation: you can focus on state-of-the-art techniques for feature engineering.
- Reality: you keep having to make your codebase more complex because a new dataframe library has come out and users are demanding support for it.

Or rather, it might have gone like that in the pre-Narwhals era. Because now, you can focus on solving the problems which your tool set out to do, and let Narwhals handle the subtle differences between different kinds of dataframe inputs!

Scaling AI workloads with Ray & Airflow

2025-06-08
talk

Ray is an open-source framework for scaling Python applications, particularly machine learning and AI workloads. It provides the layer for parallel processing and distributed computing. Many large language models (LLMs), including OpenAI's GPT models, are trained using Ray.

Apache Airflow, meanwhile, is a well-established data orchestration framework, downloaded more than 20 million times a month.

This talk presents the Airflow Ray provider package, which lets users interact with Ray from an Airflow workflow. I'll show how to use the package to create Ray clusters and how Airflow can trigger Ray pipelines in those clusters.

Transfer Learning: Leveraging Pretrained Models with Limited Data

2025-06-08
talk

Transfer learning has revolutionised machine learning by enabling models trained on large datasets to generalise effectively to tasks with limited data. This talk explores strategies for adapting pretrained models to new domains, focusing on audio processing as a case study. Using YAMNet, Whisper, and wav2vec2 for laughter detection, we demonstrate how to extract meaningful representations, fine-tune models efficiently, and handle severe class imbalances. The session covers feature extraction, model fusion techniques, and best practices for optimising performance in data-scarce environments. Attendees will gain practical insights into applying transfer learning across various modalities beyond audio, maximising model effectiveness when labelled data is scarce.
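One common way to handle the severe class imbalance mentioned above is to weight each class inversely to its frequency when computing the loss. The sketch below is not taken from the talk: the function name, the label names, and the 90/10 split are made up for illustration, and the formula is the widely used "balanced" heuristic, n_samples / (n_classes * class_count).

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up labels for illustration only: nine "speech" clips per "laughter" clip.
labels = ["speech"] * 90 + ["laughter"] * 10
weights = balanced_class_weights(labels)

# The rare class receives the larger weight, so its errors count for more.
print(weights)
```

Weights like these are typically passed to a loss function or sampler so that the rare class is not drowned out during fine-tuning.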

Building a knowledge graph for climate policy

2025-06-08
talk

At Climate Policy Radar, we're building an open-source knowledge graph for climate policy. In this talk, we'll share how we combine in-house expertise with scalable data infrastructure to identify key concepts in thousands of global climate policy documents. We'll also touch on ontology design, equitable evaluation, and the climate impacts of AI.

Is coding assistant as good as we thought in coding?

2025-06-08
talk

Coding assistants are everywhere these days: many IDEs offer them as plugins, and they are becoming more and more powerful. But this raises some questions. Is a coding assistant as good as we want it to be? What can and can't these AI agents do? Will AI take my job?

You Came to a Python Conference. Now, Go Do a PR Review!

2025-06-08
talk

If you or your organization are spending time and resources attending a Python conference, you will want to ensure your team gets something immediately actionable and helpful out of it. As coders, we often think about writing code as the only way to contribute. However, pull request reviews are an often overlooked, but highly actionable way to have an impact.

Giving good PR reviews is an art, with two equally important parts: the technical side and the communication side. While the technical side ensures the quality, maintainability, and efficiency of the Python code, the communication around the PR determines whether the feedback can be understood and acted upon. However, we have all seen code reviews that have been ignored or executed poorly due to poor communication.

This talk addresses both facets of PR reviews by introducing the archetypes of bad code reviewers:

1) The “Looks Good to Me” Reviewer: This peer reviewer provides little to no actionable feedback.
2) The “Technical Nitpicker”: This peer reviewer focuses on small Python-specific issues, but fails to communicate constructively.
3) The “Nit” Commenter: This peer reviewer prefaces every comment with “nit,” while offering unclear, yet technically valid suggestions.

Using these archetypes, we will explore Python-specific technical topics (such as pass by reference vs. pass by value), while delving into how to communicate and deliver feedback in a clear and actionable manner. Using real-world examples, attendees will learn how to:

a) Identify and address technical issues in Python PRs
b) Communicate feedback effectively
c) Balance technical rigor with constructive feedback
d) Communicate their peer review comments clearly
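The "pass by reference vs. pass by value" topic is a frequent source of review comments, because Python does neither in the strict sense: a function receives a reference to the same object, so mutating it is visible to the caller, while rebinding the parameter name is not. The sketch below is an illustration only, not material from the talk, and the function names are hypothetical.

```python
def append_in_place(items):
    # Mutates the list object the caller passed in.
    items.append("added")


def rebind_locally(items):
    # Builds a new list; only the local name `items` points at it.
    items = items + ["added"]
    return items


original = ["a"]

append_in_place(original)
after_mutation = list(original)  # snapshot: the caller's list has changed

rebound = rebind_locally(original)  # the caller's list is left untouched here

print(after_mutation, original, rebound)
```

A clear review comment on the first function might name the side effect and suggest returning a new list instead, rather than just writing "nit: mutation".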

Agentic Cyber Defense with External Threat Intelligence

2025-06-08
talk

This talk will detail how to integrate external threat intelligence data into an autonomous agentic AI system for proactive cybersecurity. Using real-world datasets, including open-source threat feeds, security logs, or OSINT, you will learn how to build a data ingestion pipeline, train models with Python, and deploy agents that autonomously detect and mitigate cyber threats. This case study will provide practical insights into data preprocessing, feature engineering, and the challenges of adversarial conditions.

Debugging Leadership: Six Errors when Moving From Code to Management

2025-06-08
talk

Transitioning from a hands-on Pythonista to a leadership role is a journey filled with challenges, and like debugging code, it requires identifying, isolating, and fixing problems. In this talk, I’ll share eight key lessons from my journey from Data Scientist to Co-Founder of a small software company, framed as Python errors.

From battling imposter syndrome (ValueError: self-worth not defined), to learning to delegate (DeadlockError: unable to release control), and avoiding burnout (RuntimeError: system overload), this talk offers actionable advice for anyone navigating the leap from technical contributor to technical leader.

Expect a mix of humour, relatable stories, and hard-won lessons as we explore how debugging leadership challenges is just as rewarding (and occasionally frustrating) as debugging code. Whether you’re considering a leadership role or already on the journey, this session will leave you with practical insights to navigate common pitfalls and approach a leadership transition with a clearer understanding of what to expect.

Diving into Transformer Model Internals

2025-06-08
talk

While everybody and their dog is building applications on generative AI, the inner workings of transformers, the model architecture behind the genAI age, are a mystery to most people. In this talk, I'll walk through how transformers are implemented, using real-life Python code from the HuggingFace transformers library.
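The core operation inside a transformer layer is scaled dot-product attention. The sketch below is a plain-Python illustration, not code from the HuggingFace transformers library: it assumes a single head, omits the learned query/key/value projections and masking, and uses made-up two-dimensional token vectors.

```python
import math


def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Made-up token vectors; self-attention uses the same vectors as Q, K and V.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights sums to 1, so every output vector is a weighted average of the value vectors. Real implementations do the same arithmetic with batched tensor operations across many heads.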

Humble Data Workshop

2025-06-08
talk
Hugh Evans (Imply)

Learn Python for Data Science in this Beginners’ Day Workshop. Would you like to learn to code but don’t know where to start? Taking your first steps in programming can seem like an impossible task, so we’ve decided to put on a workshop to show beginners how it can be done and share our passion for the world of data science!

Apply to be a student https://forms.gle/2cvNyRK8c8pNnpnz5

Break

2025-06-08
talk

Keynote- Innovation is Dead

2025-06-08
talk

Join us for an exciting Keynote with Tony Mears!

Lunch

2025-06-08
talk

PyData Organizers Lunch

2025-06-08
talk

Analysing smart meter data to uncover energy consumption patterns

2025-06-08
talk

Smart meters have the potential not only to provide information to individual householders about their energy consumption, but also to identify patterns of usage across the entire energy system. At Nesta, we have been analysing smart meter data to uncover information about energy consumption habits, and how household appliances, physical property characteristics and demographic factors influence energy usage, as this can help develop energy-saving initiatives. In this talk we will present the data science techniques we used, such as clustering, and our results. We will also discuss how we translate them for a non-data-science audience, and share what we learned from conducting data science work in a secure data lab that allows analysis of sensitive and confidential data.

CUDA in Python: A New Era for GPU Acceleration

2025-06-08
talk

We discuss bringing Python natively to the CUDA ecosystem. From low-level bindings to domain-specific applications, CUDA is supporting Python standards and the wider Python ecosystem. New libraries include nvmath-python for managing optimized mathematics libraries, cccl-python for cooperative threading and device parallelism, cuda-core for managing the complete CUDA toolstack from Python with no need for C++, and finally numba-cuda for generating device-side kernels with integration of C++ device libraries and LTO IR.

Git Commit, MedTech Transformed: Python’s Medical Robotics Breakthrough

2025-06-08
talk

Code changing lives? Absolutely. We're diving into Python's power to deploy cutting-edge solutions for lung cancer diagnosis and treatment in medical and surgical robotics. Expect demos showcasing algorithms, data analysis, and real-world impact—bridging MedTech innovation and life-changing solutions. Ready to see Python revolutionize lung health? Join us. Let's code a healthier future together!

Leaders at PyData

2025-06-08
talk

A self-organised workshop for data leaders to discuss the opportunities and challenges they face with their peers. This is the 9th iteration at a PyData conference. Questions are raised and answered by attendees, and the session is facilitated by Ian Ozsvald (PyDataLondon co-founder). You are encouraged to carry on talking to fellow leaders after this session; Ian will give out badges to help with this.

The format is based on the Breakout discussions that Ian uses in his private RebelAI leadership group. You're welcome and encouraged to copy it and use it in your own organisations. Typical attendance is 60+ leaders.

The 2022 session, which used a different format (and was then known as "Executives at PyData"), was written up here: https://numfocus.medium.com/executives-at-pydata-global-2022-193cbc2d3f3b

Making LLMs reliable: A practical framework for production

2025-06-08
talk
LLM

LLM outputs are non-deterministic, making it difficult to ensure reliability in production, especially in high-risk applications. In this talk, we’ll walk through a structured approach to making LLMs production-ready. We’ll cover setting up tests during experimentation, implementing real-time guardrails before responses reach users, and monitoring live performance for critical issues. Finally, we’ll discuss post-deployment log analysis to drive continuous improvements and build trust with stakeholders.
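A real-time guardrail of the kind described above can be as simple as validating the model's raw output before it reaches the user and falling back to a safe message otherwise. The sketch below is an illustration under stated assumptions, not the framework from the talk: the function names, the expected JSON keys ("answer", "sources"), the length limit, and the fallback text are all hypothetical.

```python
import json


def check_response(raw, required_keys=("answer", "sources"), max_chars=500):
    """Return (ok, reason) for a raw model output expected to be a JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False, "not valid JSON"
    if not isinstance(data, dict):
        return False, "not a JSON object"
    missing = [k for k in required_keys if k not in data]
    if missing:
        return False, f"missing keys: {missing}"
    if len(str(data.get("answer", ""))) > max_chars:
        return False, "answer too long"
    return True, "ok"


def guarded_reply(raw, fallback="Sorry, I can't answer that right now."):
    """Return the answer only if it passes the checks; otherwise a safe fallback."""
    ok, reason = check_response(raw)
    reply = json.loads(raw)["answer"] if ok else fallback
    return reply, reason  # the reason can be logged for later analysis


# Hypothetical raw outputs standing in for model responses.
print(guarded_reply('{"answer": "42", "sources": []}'))
print(guarded_reply("plain text, not JSON"))
print(guarded_reply('{"answer": "no sources given"}'))
```

Logging the returned reason alongside each request gives the raw material for the monitoring and post-deployment log analysis steps.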

One repo to rule them all, one repo to bind them...Control all of your projects with copier!

2025-06-08
talk

Did you know you can control all of your projects from a central template repository? In this talk we'll learn about copier, a framework for creating project templates. A natural successor to cookiecutter and GitHub templates, copier lets your projects re-sync from the original template, with new or the same arguments. Adopt the latest and greatest tools without leaving any of your libraries behind!