talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
Scaling AI workloads with Ray & Airflow
2025-06-08 · 15:15
Ray is an open-source framework for scaling Python applications, particularly machine learning and AI workloads. It provides the layer for parallel processing and distributed computing. Many large language models (LLMs), including OpenAI's GPT models, are trained using Ray. Apache Airflow, on the other hand, is an established data orchestration framework downloaded more than 20 million times monthly. This talk presents the Airflow Ray provider package, which allows users to interact with Ray from an Airflow workflow. I'll show how to use the package to create Ray clusters and how Airflow can trigger Ray pipelines in those clusters. |
|
|
Transfer Learning: Leveraging Pretrained Models with Limited Data
2025-06-08 · 15:15
Transfer learning has revolutionised machine learning by enabling models trained on large datasets to generalise effectively to tasks with limited data. This talk explores strategies for adapting pretrained models to new domains, focusing on audio processing as a case study. Using YAMNet, Whisper, and wav2vec2 for laughter detection, we demonstrate how to extract meaningful representations, fine-tune models efficiently, and handle severe class imbalances. The session covers feature extraction, model fusion techniques, and best practices for optimising performance in data-scarce environments. Attendees will gain practical insights into applying transfer learning across various modalities beyond audio, maximising model effectiveness when labelled data is scarce. |
|
|
Polars, DuckDB, PySpark, PyArrow, pandas, cuDF: how Narwhals has brought them all together!
2025-06-08 · 15:15
Suppose you want to write a data science tool to do feature engineering. Your experience may go like this: - Expectation: you can focus on state-of-the-art techniques for feature engineering. - Reality: you keep having to make your codebase more complex because a new dataframe library has come out and users are demanding support for it. Or rather, it might have gone like that in the pre-Narwhals era. Because now, you can focus on solving the problems your tool set out to solve, and let Narwhals handle the subtle differences between different kinds of dataframe inputs! |
|
|
Is coding assistant as good as we thought in coding?
2025-06-08 · 14:30
Nowadays coding assistants are everywhere: many IDEs offer them as plugins, and they are becoming more and more powerful. But this prompts some questions: are coding assistants as good as we want them to be? What can and can't these AI agents do? Will AI take my job? |
|
|
You Came to a Python Conference. Now, Go Do a PR Review!
2025-06-08 · 14:30
If you or your organization are spending time and resources attending a Python conference, you will want to ensure your team gets something immediately actionable and helpful out of it. As coders, we often think about writing code as the only way to contribute. However, pull request reviews are an often overlooked, but highly actionable way to have an impact. Giving good PR reviews is an art, with two equally important parts: the technical side and the communication side. While the technical side ensures the quality, maintainability, and efficiency of the Python code, the communication around the PR determines whether the feedback can be understood and acted upon. However, we have all seen code reviews that have been ignored or executed poorly due to poor communication. This talk addresses both facets of PR reviews by introducing the archetypes of bad code reviewers: 1) The “Looks Good to Me” Reviewer: This peer reviewer provides little to no actionable feedback. 2) The “Technical Nitpicker”: This peer reviewer focuses on small Python-specific issues, but fails to communicate constructively. 3) The “Nit” Commenter: This peer reviewer prefaces every comment with “nit,” while offering unclear, yet technically valid suggestions. Using these archetypes, we will explore Python-specific technical topics (such as pass by reference vs. pass by value), while delving into how to communicate and deliver feedback in a clear and actionable manner. Using real-world examples, attendees will learn how to: a) Identify and address technical issues in Python PRs b) Communicate feedback effectively c) Balance technical rigor with constructive feedback d) Communicate their peer review comments clearly |
|
|
Building a knowledge graph for climate policy
2025-06-08 · 14:30
At Climate Policy Radar, we're building an open-source knowledge graph for climate policy. In this talk, we'll share how we combine in-house expertise with scalable data infrastructure to identify key concepts in thousands of global climate policy documents. We'll also touch on ontology design, equitable evaluation, and the climate impacts of AI. |
|
|
Debugging Leadership: Six Errors when Moving From Code to Management
2025-06-08 · 13:45
Transitioning from a hands-on Pythonista to a leadership role is a journey filled with challenges, and like debugging code, it requires identifying, isolating, and fixing problems. In this talk, I’ll share eight key lessons from my journey from Data Scientist to Co-Founder of a small software company, framed as Python errors. From battling imposter syndrome (ValueError: self-worth not defined), to learning to delegate (DeadlockError: unable to release control), and avoiding burnout (RuntimeError: system overload), this talk offers actionable advice for anyone navigating the leap from technical contributor to technical leader. Expect a mix of humour, relatable stories, and hard-won lessons as we explore how debugging leadership challenges is just as rewarding (and occasionally frustrating) as debugging code. Whether you’re considering a leadership role or already on the journey, this session will leave you with practical insights to navigate common pitfalls and approach a leadership transition with a clearer understanding of what to expect. |
|
|
Diving into Transformer Model Internals
2025-06-08 · 13:45
While everybody and their dog is building applications on generative AI, the inner workings of transformers - the model architecture behind the genAI age - are a mystery to most people. In this talk, I'll walk through how transformers are implemented, using real-life Python code from the HuggingFace transformers library. |
|
|
Humble Data Workshop
2025-06-08 · 13:45
Hugh Evans
– Developer Advocate
@ Imply
Learn Python for Data Science in this Beginners’ Day Workshop. Would you like to learn to code but don’t know where to start? Taking your first steps in programming can seem like an impossible task, so we’ve decided to put on a workshop to show beginners how it can be done and share our passion for the world of data science! Apply to be a student: https://forms.gle/2cvNyRK8c8pNnpnz5 |
|
|
Agentic Cyber Defense with External Threat Intelligence
2025-06-08 · 13:45
This talk will detail how to integrate external threat intelligence data into an autonomous agentic AI system for proactive cybersecurity. Using real-world datasets (including open-source threat feeds, security logs, or OSINT), you will learn how to build a data ingestion pipeline, train models with Python, and deploy agents that autonomously detect and mitigate cyber threats. This case study will provide practical insights into data preprocessing, feature engineering, and the challenges of adversarial conditions. |
|
|
Break
2025-06-08 · 13:15
|
|
|
Break
2025-06-08 · 13:15
|
|
|
Break
2025-06-08 · 13:15
|
|
|
Break
2025-06-08 · 13:15
|
|
|
Keynote: Innovation is Dead
2025-06-08 · 12:30
Join us for an exciting Keynote with Tony Mears! |
|
|
Lunch
2025-06-08 · 11:30
|
|
|
PyData Organizers Lunch
2025-06-08 · 11:30
|
|
|
Lunch
2025-06-08 · 11:30
|
|
|
Lunch
2025-06-08 · 11:30
|
|
|
Leaders at PyData
2025-06-08 · 10:45
A self-organised workshop for data leaders to discuss the opportunities and challenges they face with their peers. This is the 9th iteration at a PyData conference. Questions are raised and answered by attendees, and the session is facilitated by Ian Ozsvald (PyDataLondon co-founder). You are encouraged to carry on talking to fellow leaders after this session; Ian will give out badges to help with this. The format is based on the Breakout discussions that Ian uses in his private RebelAI leadership group, and you're welcome and encouraged to copy and use it in your own organisations. Typical attendance is 60+ leaders. The 2022 session, which used a different format ("Executives at PyData" as it was known), was written up; you can see it here: https://numfocus.medium.com/executives-at-pydata-global-2022-193cbc2d3f3b |
|