talk-data.com
Activities & events
AI Builders Amsterdam :: Pizza, Demos & Networking (paid event)
2026-01-29 · 16:30
🎟️ Get tickets: https://lu.ma/ai-builders 🎟️ ☝️ This is a paid meetup (€20 - €10); a Luma ticket is required! Join our monthly AI meetup: practical demos and technical talks about building with LLMs and any gen-AI model.

:: FOR WHO ::
✅ Anyone actively building with Generative AI
✅ Devs, product peeps, data lovers, ML engineers, founders
⚠️ Technical LLM knowledge required!*

:: FORMAT ::
💻 ⚡️ Speed Demos (10 min): builders sharing real-world AI solutions, including their breakthrough code, diagrams, and prompts!
🎤 🦄 Pioneer Talks (20 min): an inspiring talk or demo from emerging gen-AI leaders in Europe or Silicon Valley
🤝🍕🍻 Fun vibes: lots of time to connect with other builders over some yummy pizza & drinks.

:: AGENDA ::
17:30 🤝 Drinks & networking
18:00 🍕 Pizza (be early!)
18:30 🎤 🦄 Pioneer Talk (20m)
--- Break ---
19:30 💻 ⚡️ Demos (4 x 10m)
20:10 🍻 Drinks & networking
21:00 End

:: FAQ ::
• What's AI Builders? We're a self-organizing nonprofit community of 3000+ AI nerds in Europe. Yes, we're building our own AI CEO.
• Why do I need to pay? 1) So we know how many people will come (the space has a maximum capacity, and it reduces food waste). 2) Sponsor money doesn't cover all of our costs yet.
• Can I get a free ticket? Can I volunteer as co-host? Message Cristian (+31636420602) if we still need co-hosts or to request a free ticket. Co-hosts arrive 1.5h early and help set up the event or welcome people.
• *I'm not technical. Can I come? Yes, but to enjoy the meetup, we recommend learning about these LLM concepts: multimodal models, vector embeddings, RAG, chaining, structured output, function calling, API calls, knowledge graphs, reinforcement learning, fine-tuning, agents. Additionally: computer vision, diffusion models, DevOps, MLOps.
• Why go to AI Meetups?
Weekly Meeten en Drinken
2026-01-28 · 18:00
Ahoy, fellow Appsterdammers! If you feel like it's time to start slowly getting used to being social again, maybe you'd like to warm up with your fellow makers, nerds, and other technology enthusiasts, who might better understand your specific interests and social style. Join us in the cozy cafe where Appsterdam was born, at an earlier than usual time, to enjoy their legendary terrace and its ongoing attempts to comply with an ever-shifting landscape of rules — something many App Makers will understand! Yes, even after ten long years and a pandemic, this is still the place for you to meet your fellow Appsterdammers, get to know members of the community, meet a potential business partner, bring your app to share with others, exchange ideas with like-minded people... No talks are ever scheduled and everyone is always welcome to join. Be sure to ask the waitress/waiter about their seasonal beers and grab a snack to share with your friends. All Appsterdammers get 10% off the entire bill, so order whatever you want to eat/drink. Then sit back and enjoy the company of your peers. The day is always Wednesday. The place is always Café Bax. The time, for now, is 19:00. Please note: It's a very casual event so people often don't RSVP or show up exactly on time. But don't fear, you can always find Appsterdammers at this event! Not sure you're in the right place? Just ask the staff! |
Meetup @ Axxes
2026-01-28 · 16:30
2026 is here, and we continue hosting the best AWS meetups in the Netherlands. To kick off this year, we are hosting a meetup together with Axxes. We have amazing talks lined up for this event, so make sure to register soon.

Agenda
17:30 - Food 🍴
18:30 - Rob Kenis - No more long lived credentials
19:00 - Amer Grgic - Kiro, Agentic AI development environment from prototype to production
19:00 - 🚰 Break
19:15 - Yannick van Rooyen & Joeri Malmberg - AWS Platform Engineering at Europe's Largest Tendering Platform
19:45 - Drinks 🍻 & Networking

No more long lived credentials
When connecting AWS and other services, we still see long-lived credentials being used for authentication. In this talk, we will solve this issue using AWS IAM OIDC providers and IAM outbound identity federation.

Kiro, Agentic AI development environment from prototype to production
In this talk, we'll discuss Kiro, an agentic AI development environment that seamlessly takes your projects from prototype to production. It's designed to streamline the AI development lifecycle, making the transition from experimental code to production-ready systems smoother than ever. Kiro isn't just another development tool; it's your AI project's companion from concept to deployment.

AWS Platform Engineering at Europe's Largest Tendering Platform
Building an AWS platform for Europe's largest tendering system means supporting a constantly growing number of workloads, teams, and architectural styles, from legacy solutions inherited through acquisitions to modern, event-driven and serverless systems. It often feels like fixing the plane while flying it. In this talk, we'll share how we built and evolved our AWS platform to reduce complexity for developers, provide a safe foundation for change, and enable all kinds of workloads to land and scale on the same platform, while keeping delivery fast.
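To make the "no more long-lived credentials" idea concrete: with OIDC federation, an external workload (for example a CI job) presents a short-lived identity token and assumes an IAM role, instead of storing static access keys. A minimal sketch of the role's trust policy follows; the account ID, provider URL, and subject claim are made-up placeholders, not values from the talk.

```python
import json

# Hypothetical sketch: a trust policy for an IAM role that an external
# OIDC identity can assume via short-lived tokens, replacing long-lived
# access keys. All identifiers below are illustrative placeholders.
ACCOUNT_ID = "123456789012"
PROVIDER = "token.actions.githubusercontent.com"  # example OIDC issuer

def oidc_trust_policy(account_id: str, provider: str, subject: str) -> dict:
    """Restrict sts:AssumeRoleWithWebIdentity to tokens whose audience
    and subject claims match the expected workload."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{provider}:aud": "sts.amazonaws.com",
                        f"{provider}:sub": subject,
                    }
                },
            }
        ],
    }

policy = oidc_trust_policy(ACCOUNT_ID, PROVIDER, "repo:my-org/my-repo:ref:refs/heads/main")
print(json.dumps(policy, indent=2))
```

The key design point is the `Condition` block: without pinning the audience and subject claims, any token from the same issuer could assume the role.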
AI Engineering: Skill Stack, Agents, LLMOps, and How to Ship AI Products
2026-01-26 · 11:30
Shipping real AI products is now one of the most in-demand engineering skills, but most teams still get stuck turning prototypes into something that actually works. In this podcast, AI engineer and bestselling author Paul Iusztin breaks down the full AI engineering skill stack:
We'll also go beyond the code. Paul will share how he structures his work, teaching, writing, and professional growth, and how he uses AI tools to stay focused, productive, and consistent. Join us live if you want a straightforward look at the technical and personal side of modern AI engineering. About the Speaker: Paul Iusztin is an AI engineer committed to helping developers create fully functional, production-grade AI products. He is the author of the bestselling "LLM Engineer's Handbook," leads the Agentic AI Engineering course, and is a founding AI engineer at a startup based in San Francisco. He also writes Decoding AI Magazine, where he assists engineers in moving beyond the proof-of-concept stage to build more effective AI systems. With over ten years of experience, Paul teaches comprehensive AI engineering, covering everything from data gathering to deployment, monitoring, and evaluation. He emphasizes robust software practices, infrastructure, and principles that are reliable in a world increasingly influenced by AI coding tools. Join our Slack: https://datatalks.club/slack.html
Linux Repair Café
2026-01-24 · 13:00
Give your still-good laptop a second life instead of ditching it because Windows 11 says it is no good... Or maybe you heard Linux is cool. If you already have a backup of your data, that is great. If instead you expect us to make the backup, please say so and indicate how full your hard drive currently is, so we can help you.
Weekly Meeten en Drinken
2026-01-21 · 18:00
Ahoy, fellow Appsterdammers! If you feel like it's time to start slowly getting used to being social again, maybe you'd like to warm up with your fellow makers, nerds, and other technology enthusiasts, who might better understand your specific interests and social style. Join us in the cozy cafe where Appsterdam was born, at an earlier than usual time, to enjoy their legendary terrace and its ongoing attempts to comply with an ever-shifting landscape of rules — something many App Makers will understand! Yes, even after ten long years and a pandemic, this is still the place for you to meet your fellow Appsterdammers, get to know members of the community, meet a potential business partner, bring your app to share with others, exchange ideas with like-minded people... No talks are ever scheduled and everyone is always welcome to join. Be sure to ask the waitress/waiter about their seasonal beers and grab a snack to share with your friends. All Appsterdammers get 10% off the entire bill, so order whatever you want to eat/drink. Then sit back and enjoy the company of your peers. The day is always Wednesday. The place is always Café Bax. The time, for now, is 19:00. Please note: It's a very casual event so people often don't RSVP or show up exactly on time. But don't fear, you can always find Appsterdammers at this event! Not sure you're in the right place? Just ask the staff! |
How to Reduce LLM Hallucinations with Wikidata: Hands-On Fact-Checking Using MCP

LLMs are powerful, but they still hallucinate facts, especially when asked about entities, relationships, or claims that require up-to-date or structured knowledge. In this hands-on workshop, we'll explore how to use Wikidata as a grounding and fact-checking layer for LLMs to reduce hallucinations and make AI systems more reliable.

We'll start with a short introduction to Wikidata and then set up the Wikidata MCP so an LLM can retrieve and verify facts rather than relying solely on its internal memory. This already provides a practical way to ground LLM outputs in verifiable data. From there, we'll go beyond LLM-only approaches and build a small experimental fact-checking pipeline. The system combines semantic retrieval, LLM-based reranking, and natural language inference (NLI) to validate claims against evidence in a more controlled and interpretable way. This workshop focuses on evidence-driven verification pipelines that make an LLM's reasoning steps explicit and easier to inspect, debug, and improve.

What we'll cover:

What you'll leave with: By the end of the workshop, you'll be able to:

About the speaker: Philippe Saadé is the AI/ML project manager at Wikimedia Deutschland. His current work focuses on making Wikidata accessible to AI applications with projects like the Wikidata vector database and the Wikidata Model Context Protocol. Join our Slack: https://datatalks.club/slack.html This event is sponsored by Wikimedia
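The retrieve-rerank-verify pipeline described in the workshop can be sketched in a few lines. This is a toy illustration, not the workshop's code: the word-overlap retrieval, pass-through reranker, and keyword "NLI" below are trivial stand-ins for a real embedding model, LLM reranker, and NLI classifier, and the evidence sentences are made up.

```python
# Toy evidence-driven fact-checking pipeline: retrieve candidate evidence,
# rerank it, then apply an entailment-style check. Each stage is a stub
# standing in for a real model, so the whole thing runs offline.

EVIDENCE = [
    "Douglas Adams was born in Cambridge in 1952.",
    "Douglas Adams wrote The Hitchhiker's Guide to the Galaxy.",
    "Cambridge is a city in England.",
]

def retrieve(claim, corpus, k=2):
    # Stand-in for semantic retrieval: rank documents by word overlap.
    def overlap(doc):
        return len(set(claim.lower().split()) & set(doc.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def rerank(claim, docs):
    # Stand-in for an LLM reranker: here it simply keeps retrieval order.
    return docs

def nli_verdict(claim, evidence):
    # Stand-in for an NLI model: "supported" if every word of the claim
    # appears in a single piece of evidence, else "not enough info".
    words = set(claim.lower().rstrip(".").split())
    for doc in evidence:
        if words <= set(doc.lower().rstrip(".").split()):
            return "supported"
    return "not enough info"

def check(claim):
    evidence = rerank(claim, retrieve(claim, EVIDENCE))
    return nli_verdict(claim, evidence)

print(check("Douglas Adams was born in Cambridge in 1952."))  # prints: supported
print(check("Douglas Adams was born in Paris."))              # prints: not enough info
```

The value of making each stage explicit, as the workshop emphasizes, is that a wrong verdict can be traced to a specific step: bad retrieval, bad reranking, or a bad entailment call.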
Your Cloud Region Will Fail. Are You Ready?
2026-01-18 · 16:00
Cloud providers offer high availability, but regional failures still happen, and when they do, single-region architectures collapse. In this session, we will break down why regional outages are inevitable, what they really mean for production systems, and how to design Disaster Recovery that actually works when things go wrong. You’ll learn:
Who should attend: Developers, DevOps engineers, Software and Cloud Architects, Tech Leads, CTOs Language: English About the lecturer: Eran Greenbaum is a highly experienced technology professional with more than 15 years of experience across software development, DevOps practices, data architecture, and cloud engineering. Currently consulting as a freelancer, Eran’s primary focus is always on delivering practical, hands-on solutions driven by his genuine love for technology. |
The New Testament: The Letters of Paul, Part II (World History & Theology)
2026-01-17 · 17:00
Join a thought-provoking exploration of early Christianity through the letters of Paul in this live online class. You’ll discover how his writings shaped not only the foundation of Christian faith but also influenced Western philosophy, culture, and moral thought. This session provides an engaging introduction to New Testament studies, blending theology and history for learners who want to understand how faith shaped civilization. We’ll dive deep into the social, political, and spiritual context of Paul’s letters, exploring his impact on leadership, community, and ethics. You’ll learn how his words transformed early Christian thought and why his ideas still guide discussions on belief, purpose, and human connection today. Each session encourages reflection, open dialogue, and a deeper appreciation of how theology and philosophy intersect. You’ll also uncover how Paul’s writings became a bridge between ancient history and modern spirituality. From his letters to the Romans and Corinthians to his messages about grace, love, and perseverance, you’ll gain insight into how his teachings continue to inspire change and reflection in the modern world. If you’re interested in biblical studies, theology, or understanding how religion shapes culture and human behavior, this class offers a rich and inspiring learning experience. You’ll leave with a renewed sense of curiosity, historical understanding, and a deeper respect for how Paul’s vision continues to influence faith and society today. The class will take place here - https://www.passion-class.com/en/join-sample-passionclass Browse our PassionClasses here - https://www.passion-class.com/en |
Jan 15 - Best of NeurIPS (Day 2)
2026-01-15 · 17:00
Welcome to day two of the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year's conference, live-streamed from the authors to you.

Time and Location: Jan 15, 2026, 9:00-11:00 AM Pacific, online. Register for the Zoom!

Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training
Diffusion models have achieved impressive results across many generative tasks, yet the mechanisms that prevent memorization and enable generalization remain unclear. In this talk, I will focus on how training dynamics shape the transition from generalization to memorization. Our experiments and theory reveal two key timescales: an early time when high-quality generation emerges and a later one when memorization begins. Notably, the memorization timescale grows linearly with the size of the training set, while the generalization timescale stays constant, creating an increasingly wide window where models generalize well. These results highlight an implicit dynamical regularization that helps diffusion models avoid memorization even in highly overparameterized regimes.

About the speaker: Raphaël Urfin is a PhD student at École Normale Supérieure – PSL in Paris, supervised by Giulio Biroli (ENS) and Marc Mézard (Bocconi University). His work focuses on applying ideas and tools from statistical physics to better understand diffusion models and their generalization properties.

Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring
Global biodiversity is declining at an unprecedented rate, yet little is known about most species and how their populations are changing. Indeed, some 90% of Earth's species are estimated to be completely unknown. Machine learning has recently emerged as a promising tool to facilitate long-term, large-scale biodiversity monitoring, including algorithms for fine-grained classification of species from images.
However, such algorithms typically are not designed to detect examples from categories unseen during training (the problem of open-set recognition, OSR), limiting their applicability for highly diverse, poorly studied taxa such as insects. To address this gap, we introduce Open-Insect, a large-scale, fine-grained dataset to evaluate unknown-species detection across different geographic regions with varying difficulty. We benchmark 38 OSR algorithms across three categories (post-hoc, training-time regularization, and training with auxiliary data), finding that simple post-hoc approaches remain a strong baseline. We also demonstrate how to leverage auxiliary data to improve species discovery in regions with limited data. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery.

About the speaker: Yuyan Chen is a PhD student in Computer Science at McGill University and Mila - Quebec AI Institute, supervised by Prof. David Rolnick. Her research focuses on machine learning for biodiversity monitoring.

GuideFlow3D: Optimization-Guided Rectified Flow for Appearance Transfer
Transferring appearance to 3D assets using different representations of the appearance object, such as images or text, has garnered interest due to its wide range of applications in industries like gaming, augmented reality, and digital content creation. However, state-of-the-art methods still fail when the geometry between the input and appearance objects is significantly different. A straightforward approach is to directly apply a 3D generative model, but we show that this ultimately fails to produce appealing results. Instead, we propose a principled approach inspired by universal guidance. Given a pretrained rectified flow model conditioned on an image or text, our training-free method interacts with the sampling process by periodically adding guidance.
This guidance can be modeled as a differentiable loss function, and we experiment with two different types of guidance, including part-aware losses for appearance and self-similarity. Our experiments show that our approach successfully transfers texture and geometric details to the input 3D asset, outperforming baselines both qualitatively and quantitatively. We also show that traditional metrics are not suitable for evaluating the task because, in the absence of ground-truth data, they cannot focus on local details or compare dissimilar inputs. We therefore evaluate appearance-transfer quality with a GPT-based system that objectively ranks outputs, ensuring robust, human-like assessment, as further confirmed by our user study. Beyond the showcased scenarios, our method is general and could be extended to different types of diffusion models and guidance functions.

About the speaker: Sayan Deb Sarkar is a 2nd-year PhD student at Stanford University in the Gradient Spaces Group, advised by Prof. Iro Armeni, part of the Stanford Vision Lab (SVL). His research interests are multimodal 3D scene understanding and interactive editing. Last summer, he interned with the Microsoft Spatial AI Lab, hosted by Prof. Marc Pollefeys, working on efficient video understanding in spatial context. Before starting his PhD, he was a CS master's student at ETH Zürich in the Computer Vision and Geometry Group (CVG), working on aligning real-world 3D environments from multi-modal data. In the past, he has been a Research Intern at Qualcomm XR Labs, a Computer Vision Engineer at Mercedes-Benz R&D, and a Research Engineer at ICG, TU Graz. Website: https://sayands.github.io/

HouseLayout3D: A Benchmark and Baseline Method for 3D Layout Estimation in the Wild
Current 3D layout estimation models are primarily trained on synthetic datasets containing simple single-room or single-floor environments.
As a consequence, they cannot natively handle large multi-floor buildings and require scenes to be split into individual floors before processing, which removes global spatial context that is essential for reasoning about structures such as staircases that connect multiple levels. In this work, we introduce HouseLayout3D, a real-world benchmark designed to support progress toward full building-scale layout estimation, including multiple floors and architecturally intricate spaces. We also present MultiFloor3D, a simple training-free baseline that leverages recent scene-understanding methods and already outperforms existing 3D layout estimation models on both our benchmark and prior datasets, highlighting the need for further research in this direction.

About the speaker: Valentin Bieri is a Machine Learning Engineer and Researcher specializing in the intersection of 3D Computer Vision and Natural Language Processing. Building on his applied research in SLAM and Vision-Language Models at ETH Zurich, he now develops AI agents for manufacturing at EthonAI.
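For readers unfamiliar with the "simple post-hoc approaches" the Open-Insect talk finds to be strong OSR baselines, the classic example is maximum-softmax-probability thresholding: keep the classifier as-is and flag an input as an unknown species when its top class probability is low. A minimal sketch, with illustrative logits and threshold, not values from the paper:

```python
import math

# Post-hoc open-set baseline (maximum softmax probability): turn a
# classifier's logits into probabilities and report "unknown" when the
# top probability falls below a threshold. Numbers are illustrative.

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_open_set(logits, threshold=0.5):
    """Return (class index, confidence), or ("unknown", confidence)
    when the model is not confident enough in any known class."""
    probs = softmax(logits)
    conf = max(probs)
    if conf < threshold:
        return "unknown", conf
    return probs.index(conf), conf

print(predict_open_set([4.0, 0.5, 0.1]))  # peaked logits -> known class 0
print(predict_open_set([1.0, 0.9, 0.8]))  # near-uniform -> "unknown"
```

Because it needs no retraining, this kind of baseline is the natural first comparison point for the training-time and auxiliary-data OSR methods the benchmark also covers.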
Jan 15 - Best of NeurIPS (Day 2)
|
|
Jan 15 - Best of NeurIPS (Day 2)
2026-01-15 · 17:00
Welcome to day two of the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you. Time and Location Jan 15, 2026 9:00-11:00 AM Pacific Online. Register for the Zoom! Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training Diffusion models have achieved impressive results across many generative tasks, yet the mechanisms that prevent memorization and enable generalization remain unclear. In this talk, I will focus on how training dynamics shape the transition from generalization to memorization. Our experiments and theory reveal two key timescales: an early time when high-quality generation emerges and a later one when memorization begins. Notably, the memorization timescale grows linearly with the size of the training set, while the generalization timescale stays constant, creating an increasingly wide window where models generalize well. These results highlight an implicit dynamical regularization that helps diffusion models avoid memorization even in highly overparameterized regimes. About the Author Raphaël Urfin is a PhD student at École Normale Supérieure – PSL in Paris, supervised by Giulio Biroli (ENS) and Marc Mézard (Bocconi University). His work focuses on applying ideas and tools of statistical physics to better understand diffusion models and their generalization properties. Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring Global biodiversity is declining at an unprecedented rate, yet little information is known about most species and how their populations are changing. Indeed, some 90% Earth’s species are estimated to be completely unknown. Machine learning has recently emerged as a promising tool to facilitate long-term, large-scale biodiversity monitoring, including algorithms for fine-grained classification of species from images. 
However, such algorithms typically are not designed to detect examples from categories unseen during training – the problem of open-set recognition (OSR) – limiting their applicability for highly diverse, poorly studied taxa such as insects. To address this gap, we introduce Open-Insect, a large-scale, fine-grained dataset to evaluate unknown species detection across different geographic regions with varying difficulty. We benchmark 38 OSR algorithms across three categories: post-hoc, training-time regularization, and training with auxiliary data, finding that simple post-hoc approaches remain a strong baseline. We also demonstrate how to leverage auxiliary data to improve species discovery in regions with limited data. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery. About the Speaker Yuyan Chen is a PhD student in Computer Science at McGill University and Mila - Quebec AI Institute, supervised by Prof. David Rolnick. My research focuses on machine learning for biodiversity monitoring. GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer Transferring appearance to 3D assets using different representations of the appearance object - such as images or text - has garnered interest due to its wide range of applications in industries like gaming, augmented reality, and digital content creation. However, state-of-the-art methods still fail when the geometry between the input and appearance objects is significantly different. A straightforward approach is to directly apply a 3D generative model, but we show that this ultimately fails to produce appealing results. Instead, we propose a principled approach inspired by universal guidance. Given a pretrained rectified flow model conditioned on image or text, our training-free method interacts with the sampling process by periodically adding guidance. 
This guidance can be modeled as a differentiable loss function, and we experiment with two different types of guidance including part-aware losses for appearance and self-similarity. Our experiments show that our approach successfully transfers texture and geometric details to the input 3D asset, outperforming baselines both qualitatively and quantitatively. We also show that traditional metrics are not suitable for evaluating the task due to their inability of focusing on local details and comparing dissimilar inputs, in absence of ground truth data. We thus evaluate appearance transfer quality with a GPT-based system objectively ranking outputs, ensuring robust and human-like assessment, as further confirmed by our user study. Beyond showcased scenarios, our method is general and could be extended to different types of diffusion models and guidance functions. About the Speaker Sayan Deb Sarkar is a 2nd-year PhD student at Stanford University in the Gradient Spaces Group, advised by Prof. Iro Armeni, part of the Stanford Vision Lab (SVL). His research interests are on multimodal 3D scene understanding and interactive editing. Past summer, he interned with the Microsoft Spatial AI Lab, hosted by Prof. Marc Pollefeys, working on efficient video understanding in spatial context. Before starting PhD, he was a CS master student at ETH Zürich, in the Computer Vision and Geometry Group (CVG), working on aligning real-world 3D environments from multi-modal data. In the past, he has been a Research Intern at Qualcomm XR labs, Computer Vision Engineer at Mercedes Benz R & D and Research Engineer at ICG, TU Graz. Website: https://sayands.github.io/ HouseLayout3D: A Benchmark and Baseline Method for 3D Layout Estimation in the Wild Current 3D layout estimation models are primarily trained on synthetic datasets containing simple single room or single floor environments. 
As a consequence, they cannot natively handle large multi floor buildings and require scenes to be split into individual floors before processing, which removes global spatial context that is essential for reasoning about structures such as staircases that connect multiple levels. In this work, we introduce HouseLayout3D, a real world benchmark designed to support progress toward full building scale layout estimation, including multiple floors and architecturally intricate spaces. We also present MultiFloor3D, a simple training free baseline that leverages recent scene understanding methods and already outperforms existing 3D layout estimation models on both our benchmark and prior datasets, highlighting the need for further research in this direction. About the Speaker Valentin Bieri is a Machine Learning Engineer and Researcher specializing in the intersection of 3D Computer Vision and Natural Language Processing. Building on his applied research in SLAM and Vision-Language Models at ETH Zurich, he now develops AI agents for manufacturing at EthonAI. |
Jan 15 - Best of NeurIPS (Day 2)
|
|
Jan 15 - Best of NeurIPS (Day 2)
2026-01-15 · 17:00
Welcome to day two of the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year's conference, live-streamed from the authors to you. Time and Location: Jan 15, 2026, 9:00-11:00 AM Pacific, online. Register for the Zoom!
Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training
Diffusion models have achieved impressive results across many generative tasks, yet the mechanisms that prevent memorization and enable generalization remain unclear. In this talk, I will focus on how training dynamics shape the transition from generalization to memorization. Our experiments and theory reveal two key timescales: an early one when high-quality generation emerges and a later one when memorization begins. Notably, the memorization timescale grows linearly with the size of the training set, while the generalization timescale stays constant, creating an increasingly wide window in which models generalize well. These results highlight an implicit dynamical regularization that helps diffusion models avoid memorization even in highly overparameterized regimes.
About the Speaker: Raphaël Urfin is a PhD student at École Normale Supérieure – PSL in Paris, supervised by Giulio Biroli (ENS) and Marc Mézard (Bocconi University). His work focuses on applying ideas and tools from statistical physics to better understand diffusion models and their generalization properties.
Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring
Global biodiversity is declining at an unprecedented rate, yet little is known about most species and how their populations are changing. Indeed, some 90% of Earth's species are estimated to be completely unknown. Machine learning has recently emerged as a promising tool for long-term, large-scale biodiversity monitoring, including algorithms for fine-grained classification of species from images. However, such algorithms are typically not designed to detect examples from categories unseen during training, the problem of open-set recognition (OSR), which limits their applicability to highly diverse, poorly studied taxa such as insects. To address this gap, we introduce Open-Insect, a large-scale, fine-grained dataset for evaluating unknown-species detection across geographic regions of varying difficulty. We benchmark 38 OSR algorithms across three categories: post-hoc methods, training-time regularization, and training with auxiliary data, finding that simple post-hoc approaches remain a strong baseline. We also demonstrate how to leverage auxiliary data to improve species discovery in regions with limited data. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery.
About the Speaker: Yuyan Chen is a PhD student in Computer Science at McGill University and Mila - Quebec AI Institute, supervised by Prof. David Rolnick. Her research focuses on machine learning for biodiversity monitoring.
GuideFlow3D: Optimization-Guided Rectified Flow for Appearance Transfer
Transferring appearance to 3D assets from different representations of the appearance object, such as images or text, has garnered interest due to its wide range of applications in industries like gaming, augmented reality, and digital content creation. However, state-of-the-art methods still fail when the geometry of the input and appearance objects differs significantly. A straightforward approach is to apply a 3D generative model directly, but we show that this ultimately fails to produce appealing results. Instead, we propose a principled approach inspired by universal guidance. Given a pretrained rectified flow model conditioned on an image or text, our training-free method interacts with the sampling process by periodically adding guidance. This guidance can be modeled as a differentiable loss function, and we experiment with two types of guidance, including part-aware losses for appearance and self-similarity. Our experiments show that our approach successfully transfers texture and geometric details to the input 3D asset, outperforming baselines both qualitatively and quantitatively. We also show that traditional metrics are not suitable for evaluating this task because they cannot focus on local details or compare dissimilar inputs in the absence of ground-truth data. We therefore evaluate appearance-transfer quality with a GPT-based system that objectively ranks outputs, ensuring robust, human-like assessment, as further confirmed by our user study. Beyond the showcased scenarios, our method is general and could be extended to other types of diffusion models and guidance functions.
About the Speaker: Sayan Deb Sarkar is a 2nd-year PhD student at Stanford University in the Gradient Spaces Group, advised by Prof. Iro Armeni, part of the Stanford Vision Lab (SVL). His research interests are in multimodal 3D scene understanding and interactive editing. Last summer, he interned with the Microsoft Spatial AI Lab, hosted by Prof. Marc Pollefeys, working on efficient video understanding in spatial contexts. Before starting his PhD, he was a CS master's student at ETH Zürich in the Computer Vision and Geometry Group (CVG), working on aligning real-world 3D environments from multimodal data. He has previously been a Research Intern at Qualcomm XR Labs, a Computer Vision Engineer at Mercedes-Benz R&D, and a Research Engineer at ICG, TU Graz. Website: https://sayands.github.io/
HouseLayout3D: A Benchmark and Baseline Method for 3D Layout Estimation in the Wild
Current 3D layout estimation models are primarily trained on synthetic datasets containing simple single-room or single-floor environments. As a consequence, they cannot natively handle large multi-floor buildings and require scenes to be split into individual floors before processing, which removes the global spatial context essential for reasoning about structures such as staircases that connect multiple levels. In this work, we introduce HouseLayout3D, a real-world benchmark designed to support progress toward full building-scale layout estimation, including multiple floors and architecturally intricate spaces. We also present MultiFloor3D, a simple training-free baseline that leverages recent scene-understanding methods and already outperforms existing 3D layout estimation models on both our benchmark and prior datasets, highlighting the need for further research in this direction.
About the Speaker: Valentin Bieri is a Machine Learning Engineer and Researcher specializing in the intersection of 3D computer vision and natural language processing. Building on his applied research in SLAM and vision-language models at ETH Zurich, he now develops AI agents for manufacturing at EthonAI. |
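To make the Open-Insect abstract's point about post-hoc baselines concrete: the simplest post-hoc OSR score is maximum softmax probability (MSP), where a low top-class confidence flags a possibly unknown species. This is a minimal illustrative sketch, not the Open-Insect code; all names and the threshold are hypothetical.

```python
import numpy as np

def softmax(logits):
    # subtract the row max for numerical stability
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_score(logits):
    """Maximum softmax probability: higher means more likely a known class."""
    return softmax(logits).max(axis=-1)

def flag_unknown(logits, threshold=0.5):
    # post-hoc rule: below-threshold confidence -> candidate novel species
    return msp_score(logits) < threshold

# toy logits for three images over four known species
logits = np.array([
    [6.0, 0.1, 0.2, 0.1],   # confidently a known species
    [1.0, 0.9, 1.1, 1.0],   # near-uniform logits -> suspicious
    [4.0, 3.9, 0.1, 0.0],   # two-way tie, but still moderately confident
])
print(flag_unknown(logits))
```

Because the rule needs no retraining, it can be bolted onto any existing fine-grained classifier, which is part of why such baselines remain hard to beat.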
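The GuideFlow3D abstract describes interacting with a pretrained sampler by periodically adding guidance from a differentiable loss. The toy sketch below shows only that general mechanism in 2D, not the actual method: an Euler integration of a stand-in rectified-flow velocity field, with a gradient nudge from a toy loss every few steps. The velocity field, loss, and all parameter names are my assumptions.

```python
import numpy as np

def velocity(x, t, mu):
    # stand-in for a pretrained rectified-flow velocity field:
    # straight-line transport of the sample toward the data mean `mu`
    return (mu - x) / max(1.0 - t, 1e-3)

def guidance_grad(x, g):
    # gradient of a toy differentiable guidance loss ||x - g||^2
    return 2.0 * (x - g)

def sample(mu, g, steps=100, every=10, scale=0.05, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(2)           # start from noise
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * velocity(x, t, mu)  # ordinary Euler step
        if i % every == 0:               # periodic, training-free guidance
            x = x - scale * guidance_grad(x, g)
    return x

mu = np.array([1.0, 1.0])   # what the "pretrained" model generates
g  = np.array([2.0, 0.0])   # guidance target (appearance constraint)
print(sample(mu, g))
```

With `scale=0.0` the loop reduces to plain sampling; the guidance term biases intermediate states without any retraining, which is the appeal of universal-guidance-style approaches.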
Jan 15 - Best of NeurIPS (Day 2)
|
|
Vibe Coding for Agile Coaches: When AI Can Do More Than Just Chat
2026-01-15 · 17:00
Dear Agile Co-Learning Community, some of you may know us from the regular co-learning sessions. The group has been quiet for a while, but the topic of vibe coding has gripped us so strongly that we want to share it with you. Perhaps the start of something new? Many of you already use ChatGPT or other AI tools in everyday work: for texts, summaries, ideas. But have you ever experienced what it is like when the AI does not just answer but builds for you? Vibe coding means you describe what you need in natural language, and the AI produces working code. No programming skills required. You "vibe" with the AI until the result fits. What does this have to do with agile coaching? More than you might think. Imagine:
Not to replace you, but to give you more time for what matters: working with people. The Workshop: Together with my experienced colleague Sebastian, I would like to show you what is possible with vibe coding in a free online workshop. We will bring concrete use cases, try things out together, and are curious about your ideas. All in the co-learning spirit: no lecturing, just an exchange among equals. Who is it for? Anyone who is curious, whether you are an experienced agile coach, a Scrum Master, a product person, or simply interested. No programming knowledge needed. The meetup takes place via Zoom and is free of charge. Register as usual right here. We look forward to having you there, and to this group becoming a bit livelier again. Benjamin and Sebastian |
Vibe Coding for Agile Coaches: When AI Can Do More Than Just Chat
|
|
Jan 15 - Best of NeurIPS (Day 2)
2026-01-15 · 17:00
Welcome to day two of the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you. Time and Location Jan 15, 2026 9:00-11:00 AM Pacific Online. Register for the Zoom! Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training Diffusion models have achieved impressive results across many generative tasks, yet the mechanisms that prevent memorization and enable generalization remain unclear. In this talk, I will focus on how training dynamics shape the transition from generalization to memorization. Our experiments and theory reveal two key timescales: an early time when high-quality generation emerges and a later one when memorization begins. Notably, the memorization timescale grows linearly with the size of the training set, while the generalization timescale stays constant, creating an increasingly wide window where models generalize well. These results highlight an implicit dynamical regularization that helps diffusion models avoid memorization even in highly overparameterized regimes. About the Author Raphaël Urfin is a PhD student at École Normale Supérieure – PSL in Paris, supervised by Giulio Biroli (ENS) and Marc Mézard (Bocconi University). His work focuses on applying ideas and tools of statistical physics to better understand diffusion models and their generalization properties. Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring Global biodiversity is declining at an unprecedented rate, yet little information is known about most species and how their populations are changing. Indeed, some 90% Earth’s species are estimated to be completely unknown. Machine learning has recently emerged as a promising tool to facilitate long-term, large-scale biodiversity monitoring, including algorithms for fine-grained classification of species from images. 
However, such algorithms typically are not designed to detect examples from categories unseen during training – the problem of open-set recognition (OSR) – limiting their applicability for highly diverse, poorly studied taxa such as insects. To address this gap, we introduce Open-Insect, a large-scale, fine-grained dataset to evaluate unknown species detection across different geographic regions with varying difficulty. We benchmark 38 OSR algorithms across three categories: post-hoc, training-time regularization, and training with auxiliary data, finding that simple post-hoc approaches remain a strong baseline. We also demonstrate how to leverage auxiliary data to improve species discovery in regions with limited data. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery. About the Speaker Yuyan Chen is a PhD student in Computer Science at McGill University and Mila - Quebec AI Institute, supervised by Prof. David Rolnick. My research focuses on machine learning for biodiversity monitoring. GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer Transferring appearance to 3D assets using different representations of the appearance object - such as images or text - has garnered interest due to its wide range of applications in industries like gaming, augmented reality, and digital content creation. However, state-of-the-art methods still fail when the geometry between the input and appearance objects is significantly different. A straightforward approach is to directly apply a 3D generative model, but we show that this ultimately fails to produce appealing results. Instead, we propose a principled approach inspired by universal guidance. Given a pretrained rectified flow model conditioned on image or text, our training-free method interacts with the sampling process by periodically adding guidance. 
This guidance can be modeled as a differentiable loss function, and we experiment with two different types of guidance including part-aware losses for appearance and self-similarity. Our experiments show that our approach successfully transfers texture and geometric details to the input 3D asset, outperforming baselines both qualitatively and quantitatively. We also show that traditional metrics are not suitable for evaluating the task due to their inability of focusing on local details and comparing dissimilar inputs, in absence of ground truth data. We thus evaluate appearance transfer quality with a GPT-based system objectively ranking outputs, ensuring robust and human-like assessment, as further confirmed by our user study. Beyond showcased scenarios, our method is general and could be extended to different types of diffusion models and guidance functions. About the Speaker Sayan Deb Sarkar is a 2nd-year PhD student at Stanford University in the Gradient Spaces Group, advised by Prof. Iro Armeni, part of the Stanford Vision Lab (SVL). His research interests are on multimodal 3D scene understanding and interactive editing. Past summer, he interned with the Microsoft Spatial AI Lab, hosted by Prof. Marc Pollefeys, working on efficient video understanding in spatial context. Before starting PhD, he was a CS master student at ETH Zürich, in the Computer Vision and Geometry Group (CVG), working on aligning real-world 3D environments from multi-modal data. In the past, he has been a Research Intern at Qualcomm XR labs, Computer Vision Engineer at Mercedes Benz R & D and Research Engineer at ICG, TU Graz. Website: https://sayands.github.io/ HouseLayout3D: A Benchmark and Baseline Method for 3D Layout Estimation in the Wild Current 3D layout estimation models are primarily trained on synthetic datasets containing simple single room or single floor environments. 
As a consequence, they cannot natively handle large multi floor buildings and require scenes to be split into individual floors before processing, which removes global spatial context that is essential for reasoning about structures such as staircases that connect multiple levels. In this work, we introduce HouseLayout3D, a real world benchmark designed to support progress toward full building scale layout estimation, including multiple floors and architecturally intricate spaces. We also present MultiFloor3D, a simple training free baseline that leverages recent scene understanding methods and already outperforms existing 3D layout estimation models on both our benchmark and prior datasets, highlighting the need for further research in this direction. About the Speaker Valentin Bieri is a Machine Learning Engineer and Researcher specializing in the intersection of 3D Computer Vision and Natural Language Processing. Building on his applied research in SLAM and Vision-Language Models at ETH Zurich, he now develops AI agents for manufacturing at EthonAI. |
Jan 15 - Best of NeurIPS (Day 2)
|
|
Jan 15 - Best of NeurIPS (Day 2)
2026-01-15 · 17:00
Welcome to day two of the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you. Time and Location Jan 15, 2026 9:00-11:00 AM Pacific Online. Register for the Zoom! Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training Diffusion models have achieved impressive results across many generative tasks, yet the mechanisms that prevent memorization and enable generalization remain unclear. In this talk, I will focus on how training dynamics shape the transition from generalization to memorization. Our experiments and theory reveal two key timescales: an early time when high-quality generation emerges and a later one when memorization begins. Notably, the memorization timescale grows linearly with the size of the training set, while the generalization timescale stays constant, creating an increasingly wide window where models generalize well. These results highlight an implicit dynamical regularization that helps diffusion models avoid memorization even in highly overparameterized regimes. About the Author Raphaël Urfin is a PhD student at École Normale Supérieure – PSL in Paris, supervised by Giulio Biroli (ENS) and Marc Mézard (Bocconi University). His work focuses on applying ideas and tools of statistical physics to better understand diffusion models and their generalization properties. Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring Global biodiversity is declining at an unprecedented rate, yet little information is known about most species and how their populations are changing. Indeed, some 90% Earth’s species are estimated to be completely unknown. Machine learning has recently emerged as a promising tool to facilitate long-term, large-scale biodiversity monitoring, including algorithms for fine-grained classification of species from images. 
However, such algorithms typically are not designed to detect examples from categories unseen during training – the problem of open-set recognition (OSR) – limiting their applicability for highly diverse, poorly studied taxa such as insects. To address this gap, we introduce Open-Insect, a large-scale, fine-grained dataset to evaluate unknown species detection across different geographic regions with varying difficulty. We benchmark 38 OSR algorithms across three categories: post-hoc, training-time regularization, and training with auxiliary data, finding that simple post-hoc approaches remain a strong baseline. We also demonstrate how to leverage auxiliary data to improve species discovery in regions with limited data. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery. About the Speaker Yuyan Chen is a PhD student in Computer Science at McGill University and Mila - Quebec AI Institute, supervised by Prof. David Rolnick. My research focuses on machine learning for biodiversity monitoring. GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer Transferring appearance to 3D assets using different representations of the appearance object - such as images or text - has garnered interest due to its wide range of applications in industries like gaming, augmented reality, and digital content creation. However, state-of-the-art methods still fail when the geometry between the input and appearance objects is significantly different. A straightforward approach is to directly apply a 3D generative model, but we show that this ultimately fails to produce appealing results. Instead, we propose a principled approach inspired by universal guidance. Given a pretrained rectified flow model conditioned on image or text, our training-free method interacts with the sampling process by periodically adding guidance. 
This guidance can be modeled as a differentiable loss function, and we experiment with two different types of guidance including part-aware losses for appearance and self-similarity. Our experiments show that our approach successfully transfers texture and geometric details to the input 3D asset, outperforming baselines both qualitatively and quantitatively. We also show that traditional metrics are not suitable for evaluating the task due to their inability of focusing on local details and comparing dissimilar inputs, in absence of ground truth data. We thus evaluate appearance transfer quality with a GPT-based system objectively ranking outputs, ensuring robust and human-like assessment, as further confirmed by our user study. Beyond showcased scenarios, our method is general and could be extended to different types of diffusion models and guidance functions. About the Speaker Sayan Deb Sarkar is a 2nd-year PhD student at Stanford University in the Gradient Spaces Group, advised by Prof. Iro Armeni, part of the Stanford Vision Lab (SVL). His research interests are on multimodal 3D scene understanding and interactive editing. Past summer, he interned with the Microsoft Spatial AI Lab, hosted by Prof. Marc Pollefeys, working on efficient video understanding in spatial context. Before starting PhD, he was a CS master student at ETH Zürich, in the Computer Vision and Geometry Group (CVG), working on aligning real-world 3D environments from multi-modal data. In the past, he has been a Research Intern at Qualcomm XR labs, Computer Vision Engineer at Mercedes Benz R & D and Research Engineer at ICG, TU Graz. Website: https://sayands.github.io/ HouseLayout3D: A Benchmark and Baseline Method for 3D Layout Estimation in the Wild Current 3D layout estimation models are primarily trained on synthetic datasets containing simple single room or single floor environments. 
As a consequence, they cannot natively handle large multi floor buildings and require scenes to be split into individual floors before processing, which removes global spatial context that is essential for reasoning about structures such as staircases that connect multiple levels. In this work, we introduce HouseLayout3D, a real world benchmark designed to support progress toward full building scale layout estimation, including multiple floors and architecturally intricate spaces. We also present MultiFloor3D, a simple training free baseline that leverages recent scene understanding methods and already outperforms existing 3D layout estimation models on both our benchmark and prior datasets, highlighting the need for further research in this direction. About the Speaker Valentin Bieri is a Machine Learning Engineer and Researcher specializing in the intersection of 3D Computer Vision and Natural Language Processing. Building on his applied research in SLAM and Vision-Language Models at ETH Zurich, he now develops AI agents for manufacturing at EthonAI. |
Jan 15 - Best of NeurIPS (Day 2)
|
|
Jan 15 - Best of NeurIPS (Day 2)
2026-01-15 · 17:00
Welcome to day two of the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you. Time and Location Jan 15, 2026 9:00-11:00 AM Pacific Online. Register for the Zoom! Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training Diffusion models have achieved impressive results across many generative tasks, yet the mechanisms that prevent memorization and enable generalization remain unclear. In this talk, I will focus on how training dynamics shape the transition from generalization to memorization. Our experiments and theory reveal two key timescales: an early time when high-quality generation emerges and a later one when memorization begins. Notably, the memorization timescale grows linearly with the size of the training set, while the generalization timescale stays constant, creating an increasingly wide window where models generalize well. These results highlight an implicit dynamical regularization that helps diffusion models avoid memorization even in highly overparameterized regimes. About the Author Raphaël Urfin is a PhD student at École Normale Supérieure – PSL in Paris, supervised by Giulio Biroli (ENS) and Marc Mézard (Bocconi University). His work focuses on applying ideas and tools of statistical physics to better understand diffusion models and their generalization properties. Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring Global biodiversity is declining at an unprecedented rate, yet little information is known about most species and how their populations are changing. Indeed, some 90% Earth’s species are estimated to be completely unknown. Machine learning has recently emerged as a promising tool to facilitate long-term, large-scale biodiversity monitoring, including algorithms for fine-grained classification of species from images. 
However, such algorithms typically are not designed to detect examples from categories unseen during training – the problem of open-set recognition (OSR) – limiting their applicability for highly diverse, poorly studied taxa such as insects. To address this gap, we introduce Open-Insect, a large-scale, fine-grained dataset to evaluate unknown species detection across different geographic regions with varying difficulty. We benchmark 38 OSR algorithms across three categories: post-hoc, training-time regularization, and training with auxiliary data, finding that simple post-hoc approaches remain a strong baseline. We also demonstrate how to leverage auxiliary data to improve species discovery in regions with limited data. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery. About the Speaker Yuyan Chen is a PhD student in Computer Science at McGill University and Mila - Quebec AI Institute, supervised by Prof. David Rolnick. My research focuses on machine learning for biodiversity monitoring. GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer Transferring appearance to 3D assets using different representations of the appearance object - such as images or text - has garnered interest due to its wide range of applications in industries like gaming, augmented reality, and digital content creation. However, state-of-the-art methods still fail when the geometry between the input and appearance objects is significantly different. A straightforward approach is to directly apply a 3D generative model, but we show that this ultimately fails to produce appealing results. Instead, we propose a principled approach inspired by universal guidance. Given a pretrained rectified flow model conditioned on image or text, our training-free method interacts with the sampling process by periodically adding guidance. 
Jan 15 - Best of NeurIPS (Day 2)
2026-01-15 · 17:00
Welcome to day two of the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you. Time and Location: Jan 15, 2026, 9:00–11:00 AM Pacific, online. Register for the Zoom! Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training Diffusion models have achieved impressive results across many generative tasks, yet the mechanisms that prevent memorization and enable generalization remain unclear. In this talk, I will focus on how training dynamics shape the transition from generalization to memorization. Our experiments and theory reveal two key timescales: an early one at which high-quality generation emerges and a later one at which memorization begins. Notably, the memorization timescale grows linearly with the size of the training set, while the generalization timescale stays constant, creating an increasingly wide window in which models generalize well. These results highlight an implicit dynamical regularization that helps diffusion models avoid memorization even in highly overparameterized regimes. About the Speaker Raphaël Urfin is a PhD student at École Normale Supérieure – PSL in Paris, supervised by Giulio Biroli (ENS) and Marc Mézard (Bocconi University). His work focuses on applying ideas and tools from statistical physics to better understand diffusion models and their generalization properties. Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring Global biodiversity is declining at an unprecedented rate, yet little is known about most species and how their populations are changing. Indeed, some 90% of Earth’s species are estimated to be completely unknown. Machine learning has recently emerged as a promising tool to facilitate long-term, large-scale biodiversity monitoring, including algorithms for fine-grained classification of species from images. 
However, such algorithms are typically not designed to detect examples from categories unseen during training – the problem of open-set recognition (OSR) – limiting their applicability to highly diverse, poorly studied taxa such as insects. To address this gap, we introduce Open-Insect, a large-scale, fine-grained dataset for evaluating unknown-species detection across different geographic regions of varying difficulty. We benchmark 38 OSR algorithms across three categories: post-hoc, training-time regularization, and training with auxiliary data, finding that simple post-hoc approaches remain a strong baseline. We also demonstrate how to leverage auxiliary data to improve species discovery in regions with limited data. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery. About the Speaker Yuyan Chen is a PhD student in Computer Science at McGill University and Mila – Quebec AI Institute, supervised by Prof. David Rolnick. Her research focuses on machine learning for biodiversity monitoring. GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer Transferring appearance to 3D assets using different representations of the appearance object – such as images or text – has garnered interest due to its wide range of applications in industries like gaming, augmented reality, and digital content creation. However, state-of-the-art methods still fail when the geometry of the input and appearance objects differs significantly. A straightforward approach is to directly apply a 3D generative model, but we show that this ultimately fails to produce appealing results. Instead, we propose a principled approach inspired by universal guidance. Given a pretrained rectified flow model conditioned on an image or text, our training-free method interacts with the sampling process by periodically adding guidance. 
This guidance can be modeled as a differentiable loss function, and we experiment with two types of guidance, including part-aware losses for appearance and self-similarity. Our experiments show that our approach successfully transfers texture and geometric details to the input 3D asset, outperforming baselines both qualitatively and quantitatively. We also show that traditional metrics are not suitable for evaluating this task, due to their inability to focus on local details and to compare dissimilar inputs in the absence of ground-truth data. We therefore evaluate appearance-transfer quality with a GPT-based system that objectively ranks outputs, ensuring robust and human-like assessment, as further confirmed by our user study. Beyond the showcased scenarios, our method is general and could be extended to different types of diffusion models and guidance functions. About the Speaker Sayan Deb Sarkar is a 2nd-year PhD student at Stanford University in the Gradient Spaces Group, advised by Prof. Iro Armeni, part of the Stanford Vision Lab (SVL). His research interests are in multimodal 3D scene understanding and interactive editing. Last summer, he interned with the Microsoft Spatial AI Lab, hosted by Prof. Marc Pollefeys, working on efficient video understanding in spatial context. Before starting his PhD, he was a CS master's student at ETH Zürich, in the Computer Vision and Geometry Group (CVG), working on aligning real-world 3D environments from multi-modal data. In the past, he has been a Research Intern at Qualcomm XR Labs, a Computer Vision Engineer at Mercedes-Benz R&D, and a Research Engineer at ICG, TU Graz. Website: https://sayands.github.io/ HouseLayout3D: A Benchmark and Baseline Method for 3D Layout Estimation in the Wild Current 3D layout estimation models are primarily trained on synthetic datasets containing simple single-room or single-floor environments. 
As a consequence, they cannot natively handle large multi-floor buildings and require scenes to be split into individual floors before processing, which removes global spatial context that is essential for reasoning about structures such as staircases connecting multiple levels. In this work, we introduce HouseLayout3D, a real-world benchmark designed to support progress toward full building-scale layout estimation, including multiple floors and architecturally intricate spaces. We also present MultiFloor3D, a simple training-free baseline that leverages recent scene understanding methods and already outperforms existing 3D layout estimation models on both our benchmark and prior datasets, highlighting the need for further research in this direction. About the Speaker Valentin Bieri is a Machine Learning Engineer and Researcher specializing in the intersection of 3D Computer Vision and Natural Language Processing. Building on his applied research in SLAM and Vision-Language Models at ETH Zurich, he now develops AI agents for manufacturing at EthonAI. |
Jan 15 - Best of NeurIPS (Day 2)
|
|
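The Open-Insect abstract above notes that simple post-hoc approaches remain a strong open-set baseline. As a rough illustration only (this is not code from the paper, and the threshold value is an arbitrary assumption), here is a minimal NumPy sketch of one standard post-hoc method: maximum-softmax-probability (MSP) thresholding, which flags low-confidence predictions as unknown species.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_open_set_score(logits):
    # Maximum softmax probability: higher means "more likely a known class".
    return softmax(logits).max(axis=-1)

def predict_open_set(logits, threshold=0.6):
    # Below the threshold, flag the example as an unknown species (-1);
    # otherwise return the argmax class as usual.
    scores = msp_open_set_score(logits)
    preds = logits.argmax(axis=-1)
    return np.where(scores >= threshold, preds, -1)

# Toy logits: first row is confident, second is near-uniform (likely unknown).
logits = np.array([[8.0, 0.5, 0.2], [1.0, 1.1, 0.9]])
print(predict_open_set(logits))  # → [ 0 -1]
```

Training-time regularization and auxiliary-data methods change the model itself; the appeal of post-hoc scoring like this is that it needs no retraining.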
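The training-free guidance idea in the GuideFlow3D abstract, periodically steering the sampling trajectory with the gradient of a differentiable loss, can be sketched on a toy 2D flow. Everything below is an illustrative assumption (a hand-written velocity field, a quadratic guidance loss, Euler integration, an arbitrary guidance schedule), not the paper's actual model or losses.

```python
import numpy as np

def velocity(x, t):
    # Toy stand-in for a pretrained rectified-flow velocity field:
    # it simply drifts samples toward the origin.
    return -x

def guidance_grad(x, target):
    # Gradient of a toy differentiable guidance loss L(x) = 0.5 * ||x - target||^2.
    return x - target

def guided_sample(x0, target, steps=100, guide_every=5, guide_scale=0.5):
    # Euler-integrate the flow; every `guide_every` steps, nudge the sample
    # down the guidance-loss gradient (the "periodically adding guidance" idea).
    x, dt = x0.astype(float), 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity(x, i * dt)
        if i % guide_every == 0:
            x = x - guide_scale * dt * guidance_grad(x, target)
    return x

x0, target = np.array([3.0, -2.0]), np.array([1.0, 1.0])
x = guided_sample(x0, target)
```

The result lands between the flow's own attractor and the guidance target, which is the intended behavior: the guidance steers sampling without retraining the flow model.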
Applied AI: Navigating Legacy Systems and Building Agentic Workflows
2026-01-15 · 16:45
For our first meetup of 2026, we're bringing you two deeply technical stories from the front lines of applied AI, together with AI Native Netherlands. We'll hear how the ANWB navigates the challenges of imperfect data in a legacy organization, and then dive into a practical guide for building production-grade AI agentic workflows with Elastic. We’ll cover:
Speakers 1: Yke Rusticus & David Brummer (ANWB) Yke is a data engineer at ANWB with a background in astronomy and artificial intelligence. In industry, he learned that AI models and algorithms often do not get past the experimentation phase, leading him to specialise in MLOps to bridge the gap between experimentation and production. As a professional in this field, Yke has developed ML platforms and use cases across different cloud providers, and is passionate about sharing his knowledge through tutorials and training sessions. David is a self-proclaimed “not your typical Data Scientist” who loves analogue photography, vegan food, dogs, and holds an unofficial PhD in thrifting and sourcing second-hand pearls. With a background in growth hacking and experience in the digital marketing trenches of a startup, a scale-up, and a digital agency, he now brings together lean startup thinking, marketing know-how, and sales pitches, blending it all with a passion for creativity and tech at the ANWB. As a bridge between business and data, David focuses on building AI solutions that don’t just work, but actually get used. Talk: How AI is helping you back on the road We learn at school what AI can do when the data is perfect. We learn at conferences what AI can do when the environment is perfect. In this talk, you'll learn what AI can do when neither is perfect. This story is about the process of overcoming these challenges in an organisation that has been around since the invention of the bike. We'll balance the technical aspect of these solutions with the human aspect throughout the talk. Because in the end, it's not actually AI helping you back on the road, it's people. Speaker 2: Hans Heerooms (Elastic) Hans Heerooms is a Senior Solutions Architect at Elastic. He has worked in various roles, but always with one objective: helping organisations get the most out of their data with the least amount of effort. 
His current role at Elastic is all about supporting Elastic’s customers as they evolve from data-driven decisions to AI-guided workflows. Talk: Building Production-Grade AI Agentic Workflows with Elastic This talk shows how Elastic Agent Builder can help you build and implement agentic workflows. It addresses the complexity of traditional development by integrating all necessary components—LLM orchestration, vector database, tracing, and security—directly into the Elasticsearch Search AI Platform. You'll see how to build custom agents, declare and assign tools, and start conversations with your data. Agenda: 17:45 — Arrival, food & drinks 18:30 — Talk #1 \| Yke & David (ANWB) 19:15 — Short break 19:30 — Talk #2 \| Hans Heerooms (Elastic) 20:15 — Open conversation, networking & more drinks 21:00 — Wrapping up Please note that the main door will close at 18:00. You will still be able to enter our office, but we might ask you to wait a little while we come down to open the door for you. What to bring: Just curiosity and questions. If you're working on MLOps, applied AI, or building agentic workflows, we’d love to hear your thoughts. Who this is for: Data scientists, AI/ML engineers, data engineers, MLOps specialists, SREs, architects, and engineering leaders focused on building and using real-world AI solutions. Where to find us: Elastic's office in Amsterdam, Keizersgracht 281, 1016 ED Amsterdam |
Applied AI: Navigating Legacy Systems and Building Agentic Workflows
|
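The agentic-workflow pattern the Elastic talk covers, declaring tools, letting a model choose one, and feeding the result back, can be sketched in a framework-agnostic way. To be clear: none of the names below are Elastic Agent Builder's actual API, and the "model" here is a hard-coded stub standing in for an LLM call; this is only the shape of the loop.

```python
from typing import Callable, Dict

# Registry of tools the agent is allowed to call.
TOOLS: Dict[str, Callable[[str], str]] = {}

def tool(name: str):
    # Decorator that declares a function as a named tool.
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("search_docs")
def search_docs(query: str) -> str:
    # Stand-in for a vector-database search (e.g. over an Elasticsearch index).
    return f"top hit for '{query}'"

def run_agent(user_message: str) -> str:
    # Stubbed "model": a real agent would ask an LLM which tool to invoke
    # and with what arguments, then loop until the model emits a final answer.
    tool_name, tool_arg = "search_docs", user_message
    observation = TOOLS[tool_name](tool_arg)
    return f"Answer based on: {observation}"

print(run_agent("agent builder"))  # → Answer based on: top hit for 'agent builder'
```

In a production system the registry, orchestration, tracing, and security around this loop are exactly the pieces the talk describes the platform providing for you.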