talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
PyData Berlin 2025 May Meetup
2025-05-21 · 17:00
Welcome to the PyData Berlin May meetup! We would like to welcome you all starting from 18:45. There will be food and drinks. The talks begin around 19:30 and the doors will close at 19:30. Make sure to arrive on time! Please provide your first and last name for the registration because this is required for the venue's entry policy. If you cannot attend, please cancel your spot so others are able to join, as the space is limited.
Host: Ecosia is excited to welcome you to this month's edition of PyData. The entrance is in Hof 4 (there will be signs), then up to the 3rd floor of the building.
The lineup for the evening
Talk 1: Specializing Small Language Models With Less Data
Abstract: I will present a practical, end-to-end solution for training SLMs using synthetic data, covering key aspects from data curation through training to model evaluation. You will leave with concrete strategies for building efficient, domain-specific language models for production environments. Most AI teams are exploring the possibilities of LLMs rather than being focused on margins, but soon, efficiency will become important. Small, specialized language models (SLMs) offer a promising alternative, but training them requires extensive manually-labeled datasets - a significant engineering bottleneck. In this talk, I will discuss how large language models can be used to help generate and curate the data needed for SLM training. Using extractive question answering as a case study, we'll examine how this approach can dramatically reduce data collection time while maintaining model performance.
Speaker: Jacek Golebiowski
Bio: Jacek is the CTO of distil labs, building specialised AI agents that can be deployed on-device/on-prem with minimal data. Before that, he was a machine learning team lead at AWS, focused on Automated ML and natural language processing. He holds a PhD in Machine Learning for Quantum Mechanics from Imperial College London.
Talk 2: Exploring fairlearn and practical strategies for assessing and mitigating harm in AI systems
Abstract: As AI becomes a more significant part of our everyday lives, ensuring these systems are fair is more important than ever. In this session, we’ll discuss how to define fairness and the potential harms our algorithms can have on people and society. We'll introduce fairlearn, a community-driven, open-source project that offers practical tools for assessing and mitigating harm in AI systems. We’ll also explore how to discuss bias, different types of harm, the idea of group fairness and how they all relate to fairlearn's toolkit. To make it all concrete, we’ll walk through a real-world example of assessing fairness and share some hands-on strategies you can use to mitigate harm in your own ML projects.
Speaker: Tamara Atanasoska
Bio: Tamara is a software engineer, OSS contributor and maintainer, and NLP researcher.
Lightning talks: There will be slots for 2-3 lightning talks (3-5 minutes each). Kindly let us know at the start of the meetup if you would like to present something :)
NumFOCUS Code of Conduct
THE SHORT VERSION: Be kind to others. Do not insult or put down others. Behave professionally. Remember that harassment and sexist, racist, or exclusionary jokes are not appropriate for NumFOCUS. All communication should be appropriate for a professional audience including people of many different backgrounds. Sexual language and imagery are not appropriate. NumFOCUS is dedicated to providing a harassment-free community for everyone, regardless of gender, sexual orientation, gender identity and expression, disability, physical appearance, body size, race, or religion. We do not tolerate harassment of community members in any form. Thank you for helping make this a welcoming, friendly community for all.
If you haven't yet, please read the detailed version here: https://numfocus.org/code-of-conduct |
PyData Berlin 2025 May Meetup
|
|
Introduction to Fairlearn - what you can contribute and how to contribute
2025-02-19 · 19:00
Tamara Atanasoska
– Open Source Software Engineer
@ :probabl.
,
Adrin Jalali
– scikit-learn and Fairlearn maintainer
@ scikit-learn and Fairlearn
Learn what you can contribute to Fairlearn and how to contribute. |
|
|
Introduction to scikit-learn - what you can contribute and how to contribute
2025-02-19 · 18:50
Guillaume Lemaitre
– scikit-learn maintainer
@ scikit-learn
,
Stefanie Senger
– Open source developer
@ :probabl.
,
Maren Westermann
– scikit-learn team member
@ PyLadies Berlin
Learn what you can contribute to scikit-learn and how to contribute. |
|
|
Let's contribute to scikit-learn and Fairlearn! Optional preparation event
2025-02-12 · 17:00
PyLadies Berlin are excited to bring you this open source workshop dedicated to setting up your scikit-learn or Fairlearn development environment. scikit-learn is a popular machine learning library and is widely adopted in industry as well as academia. Fairlearn is a community-driven project to help data scientists improve fairness of AI systems. This is a warm-up session for the upcoming workshop, but it is open to anyone who would like to get guidance, even if you won’t attend the workshop. No prior contributing experience required! By setting up your development environment in advance, you can use the in-person workshop time for finding an issue to work on and working on a contribution.
• Format for the session: The first 10 minutes are a welcome and introduction. The rest is "office hours" during which you can ask questions and get support with setting up a development environment.
• Preparation work for scikit-learn: To get the most out of the session, we encourage you to check out the Developer's Guide of scikit-learn and follow steps 1 to 7 under the section "How to contribute": https://scikit-learn.org/dev/developers/contributing.html#how-to-contribute. You will also find a lot of useful additional information on this page, for example video resources: https://scikit-learn.org/dev/developers/contributing.html#video-resources. Please be aware that it could take longer to set up a development environment on a computer running Windows than on macOS or Unix. If you are using Windows, it is recommended to install Windows Subsystem for Linux. You can find installation instructions here, for example: https://learn.microsoft.com/en-us/windows/wsl/install. Please note that you should be using WSL 2; if you have WSL 1, here's how to upgrade: https://learn.microsoft.com/en-us/windows/wsl/install#upgrade-version-from-wsl-1-to-wsl-2
• Preparation work for Fairlearn: You don't need to do any preparation work for Fairlearn for this session; instructions will be given during the event. However, it is recommended that you have a look at the Fairlearn Contributor Guide here: https://fairlearn.org/v0.12/contributor_guide/index.html
• How to join: We'll be using Discord for this event. Please follow the steps below:
1. Join the scikit-learn Discord server: https://discord.gg/aBgkfXBtWZ
2. Join the #help-desk-voice channel. This is the channel where we'll be hosting this workshop.
• Audience level: Everyone is welcome to attend this session! If you've never contributed to open source software before, you will learn how to, and if you have experience contributing, you can either help mentor other attendees or work on more challenging contributions. It is useful to have some scikit-learn, git, and Python experience.
• Facilitators: The session will be led by Maren Westermann (PyLadies Berlin, scikit-learn team member), Tamara Atanasoska (scikit-learn contributor, Fairlearn maintainer), Stefanie Senger (scikit-learn team member), Adrin Jalali (scikit-learn and Fairlearn maintainer) and Guillaume Lemaitre (scikit-learn maintainer).
• By attending our event, you agree to the PyLadies Code of Conduct: https://www.pyladies.com/CodeOfConduct/
❓ Can men attend ❓ Everyone is welcome. If you identify as someone well-represented in open source and in tech, please be mindful of the space and privileges you have, and use them to support others.
• Contact: Interested in speaking at one of our events? Have a good idea for a Meetup? Get in touch with us at [email protected]
Find us on the PyLadies Global workspace:
1. Go to https://slackin.pyladies.com and enter your email address.
2. Accept the email invitation.
3. Go to the workspace https://pyladies.slack.com
4. Join the channels #city-berlin, #germany, #jobs-europe |
Let's contribute to scikit-learn and Fairlearn! Optional preparation event
|
|
Linguistics and Fairness - Tamara Atanasoska
2025-01-17 · 18:00
Tamara Atanasoska
– Open Source Software Engineer
@ Probable
In this podcast episode, we talked with Tamara Atanasoska about building fair AI systems.
About the Speaker: Tamara works on ML explainability, interpretability and fairness as Open Source Software Engineer at probable. She is a maintainer of fairlearn, contributor to scikit-learn and skops. Tamara has both a computer science/software engineering and a computational linguistics (NLP) background.
During the event, the guest discussed their career journey from software engineering to open-source contributions, focusing on explainability in AI through Scikit-learn and Fairlearn. They explored fairness in AI, including challenges in credit loans, hiring, and decision-making, and emphasized the importance of tools, human judgment, and collaboration. The guest also shared their involvement with PyLadies and encouraged contributions to Fairlearn.
00:00 Introduction to the event and the community
01:51 Topic introduction: Linguistic fairness and socio-technical perspectives in AI
02:37 Guest introduction: Tamara’s background and career
03:18 Tamara’s career journey: Software engineering, music tech, and computational linguistics
09:53 Tamara’s background in language and computer science
14:52 Exploring fairness in AI and its impact on society
21:20 Fairness in AI models
26:21 Automating fairness analysis in models
32:32 Balancing technical and domain expertise in decision-making
37:13 The role of humans in the loop for fairness
40:02 Joining Probable and working on open-source projects
46:20 skops library and its integration with Hugging Face
50:48 PyLadies and community involvement
55:41 The ethos of Scikit-learn and Fairlearn
🔗 CONNECT WITH TAMARA ATANASOSKA
LinkedIn - https://www.linkedin.com/in/tamaraatanasoska
GitHub - https://github.com/TamaraAtanasoska
🔗 CONNECT WITH DataTalksClub
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Datalike Substack - https://datalike.substack.com/
LinkedIn: / datatalks-club |
DataTalks.Club |
|
Predicting Delivery Risks with Machine Learning: A TrustCourier Innovation
2025-01-14 · 20:35
Claudia Stangarone
– Data analyst
@ GLS Studio
,
Helen FitzGerald
– Data analyst
Claims in logistics, especially for lost parcels or delivery issues, can be a significant cost for companies. In this talk, we’ll present the framework and share some early results of a new feature within TrustCourier. This feature uses machine learning to predict and flag high-risk deliveries before they escalate into costly claims. |
|
|
Writing a custom scikit-learn estimator
2025-01-14 · 19:25
Tamara Atanasoska
– Open Source Software Engineer
@ :probabl.
Scikit-learn is a popular machine learning library. It currently has over 200 estimators ready to use for a vast array of use cases. What if you are working on something special that still hasn't found its way into the library? Scikit-learn offers a way to write new compatible estimators, which can be seamlessly integrated with the rest of the library. We will look into what an estimator is, what API scikit-learn estimators have, reasons why you might want to implement your own, and an example of how to do it. We will end with real-world examples of how other OSS projects use this for their needs. |
|
|
Contributing to OpenSource - how to get started in 5 minutes!
2025-01-14 · 19:20
Stefanie Senger
– Open source developer
@ :probabl.
This talk will introduce scikit-learn users to the new API for metadata routing, a feature introduced in the recent releases and almost fully available since version 1.5 (released in May 2024). |
|
|
Linguistics and Fairness
2024-12-10 · 11:30
Building fair AI systems - Tamara Atanasoska
About the speaker: Tamara works on ML explainability, interpretability and fairness as an Open Source Software Engineer at :probabl. She is a maintainer of fairlearn and a contributor to scikit-learn and skops. Tamara has both a computer science/software engineering and a computational linguistics (NLP) background. Join our Slack: https://datatalks.club/slack.html |
Linguistics and Fairness
|
|
How to be a good reviewer: Be honest, nice, and a badass
2023-07-11 · 20:00
Raana Saheb Nassagh
– Intermediate Python developer
The first time I was asked to review a colleague's code, I was unsure: What was expected of me? What exactly was I supposed to check? And, most importantly, wouldn't I make myself unpopular by pointing out others' mistakes? In my presentation, I will describe what I have learned since then. Using real examples, I’ll point out what you should look for when reviewing code (e.g. readability, redundancy, files & data), which tools you can use (e.g. gitlab runner, black, mypy) and how to stay friends while being brutally honest with each other :-) By the way: The examples of code bugs are not only from my colleagues. After all, my own code is constantly reviewed and fixed by others. And yes, I admit, it hurts every single time… |
|
|
Modeling mental hops from one word to the next with NLP and Python
2023-07-11 · 19:20
Tamara Atanasoska
– Open Source Software Engineer
@ :probabl.
How would you model the mental hops that lead from one word to the next? And what about when, instead of a word, the starting points are concepts grounded explicitly or implicitly in an image? These questions, and more, were the topic of my latest research project. Working to automatically generate image-term pairs for an image-grounded, collaborative Wordle game, I looked for combinations that spark the desired type of dialogue - one that illuminates the participants' decision-making. The project fits the broader efforts toward natural language explainability that Prof. Schlangen’s research group at the University of Potsdam is undertaking. We will look at the method I developed from an engineering perspective, going over all the NLP concepts composing it, and touch upon a bit of linguistics theory too. Level: Beginner to the domain (already familiar with Python) |
|