Search – talk-data.com

Title & Speakers	Event
Event 🎄 PyData Berlin 2025 December Meetup 🎄 2025-12-10
A Deep Dive into Program Synthesis: Generating Logic from Examples 2025-12-10 · 18:30 Program Synthesis (PS) is the task of automatically generating logical procedures or source code from a small set of input-output examples. While LLMs and agents dominate current AI conversations, they often struggle with these kinds of precise reasoning tasks—where smaller, well-structured models for PS can succeed. In this talk, we’ll walk through the end-to-end development of a PS system, covering dataset representation using graph structures, model architectures, and tree search algorithms. The working example for this talk is the generation of procedural textures for 3D modeling, but the methodology is domain-agnostic. Participants will leave with a deeper understanding of PS, its real-world potential, and the trade-offs between different architectural approaches. The session is designed for practitioners with a solid understanding of ML concepts and some familiarity with NN architectures such as transformers and CNNs. AI/ML LLM
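To make "generating logic from input-output examples" concrete, here is a minimal sketch of program synthesis by brute-force enumeration. It is not the speaker's method (the abstract describes graph representations, learned models, and tree search); the toy integer DSL, the `synthesize` function, and the `max_depth` parameter are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy DSL: integer expressions over a single input x.
LEAVES = {"x": lambda x: x, "1": lambda x: 1, "2": lambda x: 2}
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def synthesize(examples, max_depth=3):
    """Return an expression string consistent with all (input, output) pairs.

    Bottom-up enumeration: each expression is identified by its "signature",
    the tuple of values it produces on the example inputs. Expressions that
    share a signature behave identically on the examples, so only the first
    one found is kept. Returns None if nothing matches within max_depth.
    """
    inputs = [x for x, _ in examples]
    target = tuple(y for _, y in examples)

    # signature -> first expression found that produces it
    seen = {}
    for name, fn in LEAVES.items():
        seen.setdefault(tuple(fn(x) for x in inputs), name)
    if target in seen:
        return seen[target]

    for _ in range(max_depth - 1):
        # Snapshot the known expressions, then combine every pair of them.
        current = list(seen.items())
        for (sig_a, expr_a), (sig_b, expr_b) in product(current, repeat=2):
            for symbol, op in BINARY_OPS.items():
                sig = tuple(op(a, b) for a, b in zip(sig_a, sig_b))
                if sig not in seen:
                    seen[sig] = f"({expr_a} {symbol} {expr_b})"
                    if sig == target:
                        return seen[sig]
    return None


if __name__ == "__main__":
    # Examples drawn from f(x) = 2x + 1.
    print(synthesize([(0, 1), (1, 3), (2, 5)]))
```

The returned expression is only guaranteed to agree with the given examples; with few examples, several different programs can fit, which is one reason practical systems add learned guidance and smarter search on top of enumeration like this.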
Building a small end-to-end product with AI: personal learnings and experiences :) 2025-12-10 · 17:00 Jean Carlo Machado – Data Science Manager @ GetYourGuide In this talk, I will walk through how building data products is evolving with modern AI development tools. I’ll take you through a small end-to-end product I built in my free time—covering everything from design, to frontend development, to data collection, and ultimately to building data science components. Here is the link to the project https://stateoftheartwithai.com/ AI/ML Data Collection Data Science

Event PyData Berlin 2025 November Meetup 2025-11-19
The Zen of Claude Code 2025-11-19 · 19:30 Vlad Gheorghe – AI Engineer @ SceneMind.ai This talk traces the evolution of AI agents from the ambitious but complex AutoGPT experiment in 2023, through Cursor's gradual "autonomy slider" approach (2023-2025), to the elegant simplicity of Claude Code in 2025. The presentation argues that we've moved from highly complex systems with multiple agents, embeddings, and cloud infrastructure that, while groundbreaking, often struggled with basic tasks, to "The Zen of Claude Code": a simple terminal-based agent that achieves excellent performance by embracing the bitter lesson. AI/ML Cloud Computing
Functional Reproducibility 2025-11-19 · 19:30 Robin Gower – Freelance data scientist Have you ever written the perfect data analysis? Has it still run unchanged 6 months later? Can your colleagues run it without you? Just because your analysis is executable, it doesn’t mean the results are reproducible. Data ages. Libraries change. Machines differ. Servers go down. Bits rot. Entropy is inescapable. We can learn how to engineer reproducibility by drawing on techniques from functional programming and the MLOps movement. MLOps

Berlin PyData 2025 Conference Interviews 2025-09-26 · 17:10 Yashasvi Misra – Data Engineer @ Pure Storage, Igor Kvachenok – Master’s student in Data Science @ Leuphana University of Lüneburg, Selim Nowicki – Founder @ Distill Labs, Mehdi Ouazza – guest, Gülsah Durmaz – Architect & Developer
At PyData Berlin, community members and industry voices highlighted how AI and data tooling are evolving across knowledge graphs, MLOps, small-model fine-tuning, explainability, and developer advocacy. Igor Kvachenok (Leuphana University / ProKube) combined knowledge graphs with LLMs for structured data extraction in the polymer industry, and noted how MLOps is shifting toward LLM-focused workflows. Selim Nowicki (Distill Labs) introduced a platform that uses knowledge distillation to fine-tune smaller models efficiently, making model specialization faster and more accessible. Gülsah Durmaz (Architect & Developer) shared her transition from architecture to coding, creating Python tools for design automation and volunteering with PyData through PyLadies. Yashasvi Misra (Pure Storage) spoke on explainable AI, stressing accountability and compliance, and shared her perspective as both a data engineer and active Python community organizer. Mehdi Ouazza (MotherDuck) reflected on developer advocacy through video, workshops, and branding, showing how creative communication boosts adoption of open-source tools like DuckDB.
Igor Kvachenok Master’s student in Data Science at Leuphana University of Lüneburg, writing a thesis on LLM-enhanced data extraction for the polymer industry. Builds RDF knowledge graphs from semi-structured documents and works at ProKube on MLOps platforms powered by Kubeflow and Kubernetes. Connect: https://www.linkedin.com/in/igor-kvachenok/
Selim Nowicki Founder of Distill Labs, a startup making small-model fine-tuning simple and fast with knowledge distillation. Previously led data teams at Berlin startups like Delivery Hero, Trade Republic, and Tier Mobility. Sees parallels between today’s ML tooling and dbt’s impact on analytics. Connect: https://www.linkedin.com/in/selim-nowicki/
Gülsah Durmaz Architect turned developer, creating Python-based tools for architectural design automation with Rhino and Grasshopper. Active in PyLadies and a volunteer at PyData Berlin, she values the community for networking and learning, and aims to bring ML into architecture workflows. Connect: https://www.linkedin.com/in/gulsah-durmaz/
Yashasvi (Yashi) Misra Data Engineer at Pure Storage, community organizer with PyLadies India, PyCon India, and Women Techmakers. Advocates for inclusive spaces in tech and speaks on explainable AI, bridging her day-to-day in data engineering with her passion for ethical ML. Connect: https://www.linkedin.com/in/misrayashasvi/
Mehdi Ouazza Developer Advocate at MotherDuck, formerly a data engineer, now focused on building community and education around DuckDB. Runs popular YouTube channels ("mehdio DataTV" and "MotherDuck") and delivered a hands-on workshop at PyData Berlin. Blends technical clarity with creative storytelling. Connect: https://www.linkedin.com/in/mehd-io/
AI/ML Analytics Data Engineering Data Science dbt DuckDB Kubernetes LLM MLOps Motherduck Python	DataTalks.Club Listen
Event PyData Berlin 2025 2025-09-03
Closing Session 2025-09-03 · 13:10 Closing Session	Video
Scraping urban mobility: analysis of Berlin carsharing 2025-09-03 · 12:20 Florian König Free-floating carsharing systems struggle to balance vehicle supply and demand, which often results in inefficient fleet distribution and reduced vehicle utilization. This talk explores how data scraping can be used to model vehicle demand and user behavior, enabling targeted incentives to encourage self-balancing vehicle flows. Using information scraped from a major mobility provider over multiple months, the presentation provides spatiotemporal analyses and machine learning results to determine whether it's practically possible to offer low-friction discounts that lead to improved fleet balance. AI/ML	Video
Kubeflow pipelines meet uv 2025-09-03 · 12:20 Fabrizio Damicelli Kubeflow is a platform for building and deploying portable and scalable machine learning (ML) workflows using containers on Kubernetes-based systems. We will code a simple Kubeflow pipeline together and show how to test it locally. As a bonus, we will explore one solution to avoid dependency hell using the modern dependency management tool uv. AI/ML Kubernetes	Video
When Postgres is enough: solving document storage, pub/sub and distributed queues without more tools 2025-09-03 · 11:40 Eugen Geist When a new requirement appears, whether it's document storage, pub/sub messaging, distributed queues, or even full-text search, Postgres can often handle it without introducing more infrastructure. This talk explores how to leverage Postgres' native features like JSONB, LISTEN/NOTIFY, queueing patterns and vector extensions to build robust, scalable systems without increasing infrastructure complexity. You'll learn practical patterns that extend Postgres just far enough, keeping systems simpler, more maintainable, and easier to operate, especially in small to medium projects or freelancing setups, where Postgres often already forms a critical part of the stack. Postgres might not replace everything forever - but it can often get you much further than you think. postgresql Pub/Sub	Video
Spot the difference: 🕵️ using foundation models to monitor for change with satellite imagery 🛰️ 2025-09-03 · 11:40 Ferdinand Schenck Energy infrastructure is vulnerable to damage by erosion or third party interference, which often takes the form of unsanctioned construction. In this talk we discuss our experiences using deep learning algorithms powered by large foundation models to monitor for changes in bi-temporal very-high resolution satellite imagery.	Video
See only what you are allowed to see: Fine-Grained Authorization 2025-09-03 · 11:40 Maria Knorps Managing who can see or do what with your data is a fundamental challenge, especially as applications and data grow in complexity. Traditional role-based systems often lack the granularity needed for modern data platforms. Fine-Grained Authorization (FGA) addresses this by controlling access at the individual resource level. In this 90-minute hands-on tutorial, we will explore implementing FGA using OpenFGA, an open-source authorization engine inspired by Google's Zanzibar. Attendees will learn the core concepts of Relationship-Based Access Control (ReBAC) and get practical experience defining authorization models, writing relationship tuples, and performing authorization checks using the OpenFGA Python SDK. Bring your laptop ready to code to learn how to build secure and flexible permission systems for your data applications. Python	Video
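As a rough illustration of the ReBAC idea described above (relationship tuples plus an authorization check), here is a minimal pure-Python sketch. It deliberately does not use the OpenFGA SDK or its modeling language; `RelationTuple`, `IMPLIED_BY`, `check`, and all object and user names are hypothetical and only mimic the general shape of a Zanzibar-style check.

```python
from typing import NamedTuple


class RelationTuple(NamedTuple):
    user: str      # a user such as "user:anne", or a userset such as "group:eng#member"
    relation: str  # e.g. "owner", "editor", "viewer", "member"
    object: str    # e.g. "doc:roadmap" or "group:eng"


# Hypothetical authorization model: a relation is also granted to anyone who
# holds one of the listed stronger relations on the same object.
IMPLIED_BY = {"viewer": ["editor"], "editor": ["owner"]}


def check(tuples, user, relation, obj, _seen=None):
    """Return True if `user` has `relation` on `obj` according to `tuples`."""
    seen = _seen if _seen is not None else set()
    key = (user, relation, obj)
    if key in seen:  # guard against cycles in group or relation definitions
        return False
    seen.add(key)

    for t in tuples:
        if t.object != obj or t.relation != relation:
            continue
        if t.user == user:  # direct relationship
            return True
        if "#" in t.user:  # userset: everyone holding a relation on another object
            group_obj, group_rel = t.user.split("#", 1)
            if check(tuples, user, group_rel, group_obj, seen):
                return True

    # Fall back to stronger relations that imply the requested one.
    return any(
        check(tuples, user, stronger, obj, seen)
        for stronger in IMPLIED_BY.get(relation, [])
    )


if __name__ == "__main__":
    tuples = [
        RelationTuple("user:anne", "owner", "doc:roadmap"),
        RelationTuple("group:eng#member", "viewer", "doc:roadmap"),
        RelationTuple("user:bob", "member", "group:eng"),
    ]
    print(check(tuples, "user:anne", "viewer", "doc:roadmap"))
    print(check(tuples, "user:bob", "editor", "doc:roadmap"))
```

In this sketch access is decided per resource by walking relationships rather than by looking up a global role, which is the distinction the abstract draws between FGA and traditional role-based systems. A real engine would store the tuples in a database and evaluate a declared model instead of a hard-coded dictionary.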
Lunch Break 2025-09-03 · 10:30
Better docs, happier users: What we learned applying Diataxis to HoloViz libraries 2025-09-03 · 10:00 Maxime Liquet Clear documentation is crucial for the success of open-source libraries, but it’s often hard to get right. In this talk, I’ll share our experience applying the Diataxis documentation framework to improve two HoloViz ecosystem libraries, hvPlot and Panel. Attendees will come away with practical insights on applying Diataxis and strengthening documentation for their own projects.	Video
Flying Beyond Keywords: Our Aviation Semantic Search Journey 2025-09-03 · 10:00 Dat Tran – guest @ Priceloop, Dennis Schmidt In aviation, search isn’t simple—people use abbreviations, slang, and technical terms that make exact matching tricky. We started with just Postgres, aiming for something that worked. Over time, we upgraded, adding semantic embeddings and reranking. We tackled filter complexity, slow index builds, embedding updates, and much more. Along the way, we learned a lot about making AI search fast, accurate, and actually usable for our users. It’s been a journey—full of turbulence, but worth the landing. AI/ML postgresql	Video
How Digital David Wins Against Data Goliaths 2025-09-03 · 09:20 Pawel Herman This talk introduces a new and innovative business model supported by a network of digital activists that form a collective force for protecting humanity, enabling digitally aware users to reclaim control over their data.
Docling: Get your documents ready for gen AI 2025-09-03 · 09:20 Michele Dolfi, Christoph Auer Docling, an open source package, is rapidly becoming the de facto standard for document parsing and export in the Python community. Having earned close to 30,000 GitHub stars in less than one year and now part of the Linux AI & Data Foundation, Docling is redefining document AI with its ease and speed of use. In this session, we’ll introduce Docling and its features, including usages with various generative AI frameworks and protocols (e.g. MCP). AI/ML GenAI GitHub Linux Python	Video
Edge of Intelligence: The State of AI in Browsers 2025-09-03 · 08:40 Johannes Kolbe API calls suck! Okay, not all of them. But building your AI features reliant on third party APIs can bring a lot of trouble. In this talk you'll learn how to use web technologies to become more independent. AI/ML API	Video

Activities & events