talk-data.com


Activities & events

Jan 14 - Best of NeurIPS 2026-01-14 · 17:00

Welcome to the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined the conference. Live streaming from the authors to you.

Jan 14, 2026, 9 AM Pacific, Online. Register for the Zoom!

EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding

Operating rooms (ORs) demand precise coordination among surgeons, nurses, and equipment in a fast-paced, occlusion-heavy environment, necessitating advanced perception models to enhance safety and efficiency. Existing datasets either provide partial egocentric views or sparse exocentric multi-view context, but do not explore the comprehensive combination of both. We introduce EgoExOR, the first OR dataset and accompanying benchmark to fuse first-person and third-person perspectives. Spanning 94 minutes (84,553 frames at 15 FPS) of two emulated spine procedures, Ultrasound-Guided Needle Insertion and Minimally Invasive Spine Surgery, EgoExOR integrates egocentric data (RGB, gaze, hand tracking, audio) from wearable glasses, exocentric RGB and depth from RGB-D cameras, and ultrasound imagery. Its detailed scene graph annotations, covering 36 entities and 22 relations (568,235 triplets), enable robust modeling of clinical interactions, supporting tasks like action recognition and human-centric perception. We evaluate the surgical scene graph generation performance of two adapted state-of-the-art models and offer a new baseline that explicitly leverages EgoExOR's multimodal and multi-perspective signals. This new dataset and benchmark set a new foundation for OR perception, offering a rich, multimodal resource for next-generation clinical perception.
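
To make the scene-graph annotation format concrete, here is a minimal Python sketch of how (subject, relation, object) triplets like those described above might be represented and queried. The entity and relation names are illustrative assumptions, not EgoExOR's actual label set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triplet:
    subject: str   # e.g. "head_surgeon" (illustrative name, not the dataset's label)
    relation: str  # e.g. "holding"
    obj: str       # e.g. "ultrasound_probe"

# One frame's annotations: a set of triplets describing who is doing what with which equipment.
frame_graph = {
    Triplet("head_surgeon", "holding", "ultrasound_probe"),
    Triplet("head_surgeon", "scanning", "patient"),
    Triplet("assistant", "preparing", "needle"),
}

def relations_of(graph: set, subject: str) -> list:
    """Return every triplet whose subject matches, e.g. all actions of the head surgeon."""
    return [t for t in graph if t.subject == subject]

print(relations_of(frame_graph, "head_surgeon"))
```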

About the Speaker

Ege Özsoy is a final-year PhD student researching multimodal computer vision and vision–language models for surgical scene understanding, focusing on semantic scene graphs, multimodality, and ego-exocentric modeling in operating rooms.

SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation

Few-shot segmentation requires recognizing novel object categories from only a few annotated examples, demanding both accurate mask generation and strong visual correspondence. While Segment Anything 2 (SAM2) provides powerful prompt-based segmentation and built-in feature matching, its representations are entangled with tracking-specific cues that limit higher-level semantic generalization. We show that SAM2 nonetheless encodes rich latent semantic structure despite its class-agnostic training. To leverage this, we introduce SANSA, a lightweight framework that makes this structure explicit and adapts SAM2 for few-shot segmentation with minimal modifications. SANSA achieves state-of-the-art generalization performance, outperforms generalist in-context methods, supports flexible prompting, and remains significantly faster and smaller than prior approaches.
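
As a rough illustration of the feature-correspondence idea behind few-shot segmentation (a generic prototype-matching sketch, not SANSA's actual method), the snippet below averages masked support features into a class prototype and scores query pixels by cosine similarity. The backbone features and threshold are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def predict_mask(support_feats, support_mask, query_feats, threshold=0.7):
    """
    support_feats: (C, H, W) features of the annotated support image
    support_mask:  (H, W) binary mask of the novel object in the support image
    query_feats:   (C, H, W) features of the query image
    Returns an (H, W) binary mask for the query image.
    """
    # Masked average pooling: one prototype vector for the novel class.
    mask = support_mask.unsqueeze(0)                                   # (1, H, W)
    prototype = (support_feats * mask).sum(dim=(1, 2)) / mask.sum().clamp(min=1)

    # Cosine similarity between the prototype and every query location.
    q = F.normalize(query_feats, dim=0)                                # (C, H, W)
    p = F.normalize(prototype, dim=0).view(-1, 1, 1)                   # (C, 1, 1)
    similarity = (q * p).sum(dim=0)                                    # (H, W)
    return (similarity > threshold).float()

# Toy usage with random tensors standing in for features from a real backbone (e.g. SAM2's encoder).
C, H, W = 64, 32, 32
pred = predict_mask(torch.randn(C, H, W), torch.randint(0, 2, (H, W)).float(), torch.randn(C, H, W))
print(pred.shape)  # torch.Size([32, 32])
```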

About the Speaker

Claudia Cuttano is a PhD student in the VANDAL Lab at Politecnico di Torino and is currently conducting a research visit at TU Darmstadt with Prof. Stefan Roth in the Visual Inference Lab. Her work centers on semantic segmentation, particularly on multi-modal scene understanding and leveraging foundation models for pixel-level vision tasks.

Nested Learning: The Illusion of Deep Learning Architectures

We present Nested Learning (NL), a new learning paradigm for continual learning that views machine learning models and their training process as a set of nested and/or parallel optimization problems, each with its own context flow, update frequency, and learning algorithm. Based on NL, we design a new architecture, called Hope, that is capable of continual learning and of modifying itself when needed.
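
As a loose illustration of optimization problems nested at different update frequencies (a toy sketch only, not the Hope architecture), the snippet below updates a "fast" parameter set every step and a "slow" set only every k steps, letting the slow level accumulate gradients between its updates.

```python
import torch

torch.manual_seed(0)
fast = torch.zeros(4, requires_grad=True)   # fast level: updated every step
slow = torch.zeros(4, requires_grad=True)   # slow level: updated every k steps

opt_fast = torch.optim.SGD([fast], lr=0.1)
opt_slow = torch.optim.SGD([slow], lr=0.01)
k = 10

for step in range(100):
    x = torch.randn(4)                      # incoming "context flow"
    loss = ((fast + slow - x) ** 2).mean()  # both levels contribute to the same prediction
    loss.backward()

    opt_fast.step()                         # fast level adapts to the current context
    opt_fast.zero_grad()

    if (step + 1) % k == 0:                 # slow level consolidates at a lower frequency,
        opt_slow.step()                     # using gradients accumulated since its last update
        opt_slow.zero_grad()

print(fast.detach(), slow.detach())
```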

About the Speaker

Ali Behrouz is a Ph.D. student in the Computer Science Department at Cornell University and a research intern at Google Research. His research spans deep learning architectures, continual learning, and neuroscience, and has appeared at conferences including NeurIPS, ICML, KDD, WWW, CHIL, and VLDB. His work has been recognized with two Best Paper awards, a Best Paper Honorable Mention award, a Best Paper Award candidacy, and oral and spotlight presentations.

Are VLM Explanations Faithful? A Counterfactual Testing Approach

VLMs sound convincing—but are their explanations actually true? This talk introduces Explanation-Driven Counterfactual Testing (EDCT), a simple and model-agnostic method that evaluates whether VLM explanations align with the evidence models truly use. By perturbing the very features a model claims to rely on, EDCT exposes mismatches between stated reasoning and real decision pathways. I will show surprising failure cases across state-of-the-art VLMs and highlight how EDCT can guide more trustworthy explanation methods.
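
To give a feel for the counterfactual-testing idea, here is a minimal sketch under assumed interfaces (the two helper functions are hypothetical stand-ins, not any specific VLM API, and this is not the EDCT implementation itself): mask the image region the explanation cites and check whether the answer survives.

```python
def vlm_answer_and_explanation(image, question):
    """Hypothetical: returns (answer, cited_region) from a vision-language model."""
    raise NotImplementedError

def mask_region(image, region):
    """Hypothetical: returns a copy of the image with the cited region occluded."""
    raise NotImplementedError

def explanation_is_faithful(image, question) -> bool:
    answer, cited_region = vlm_answer_and_explanation(image, question)

    # Counterfactual: remove exactly the evidence the explanation claims to rely on.
    perturbed = mask_region(image, cited_region)
    new_answer, _ = vlm_answer_and_explanation(perturbed, question)

    # If the cited evidence really drove the decision, the answer should change.
    return new_answer != answer
```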

About the Speaker

Santosh Vasa is a Machine Learning Engineer at Mercedes-Benz R&D North America, working on multimodal perception and VLM safety for autonomous driving. He co-authored the EDCT framework and focuses on explainability, counterfactual testing, and trustworthy AI.

Jan 14 - Best of NeurIPS

Pre-Registration is REQUIRED. RSVP here - https://hubs.li/Q03Y6Txy0

Speaker: Sheamus McGovern, Founder of ODSC AI | Venture Partner/Head of AI at Cortical Ventures

The age of AI is here, and it's reshaping nearly every professional role. Are you a mid-career professional in data, analytics, software, or management wondering how your job will evolve, or how to proactively position yourself for the next phase of your career?

This session is designed to cut through the hype and provide clear, actionable insights on adapting your professional path in an AI-enhanced workplace.

You will gain a strategic overview of the skills, mindsets, and concrete steps needed to remain relevant and advance your career over the next 1–3 years.

What You Will Learn:

  • **The New Role Landscape:** Understand how AI is specifically reshaping roles in data, software, and technical leadership, and identify emerging, AI-centric job categories.
  • **The 2025–2027 Skill Priority Checklist:** Gain clarity on the essential technical and non-technical skills you need to prioritize for advancement.
  • **Your Actionable Next Steps:** Learn the key components of building a personal roadmap to reposition your career for success in an AI-driven environment.
  • **What Leaders Value:** Get a sneak peek into the insights hiring managers and senior practitioners value most in AI-aligned portfolios and career narratives.

Don't just react to change: define your future. Reserve your spot now for the essential guidance you need to secure your professional edge.

Useful Links

WEBINAR "AI Career Transition Strategy"


Free Live Webinar - Can Multimodal AI Redefine Productivity in 2026?

AI is advancing at record speed and the next major shift is Multimodal AI, where systems can process text, images, audio, and more in a unified way. This breakthrough is reshaping how teams work, innovate, and make smarter decisions.

Join us for an insightful 60-minute live session to discover how multimodal AI is transforming productivity, accelerating workflows, and creating new advantages for forward-thinking organizations.

Date: Thu, Dec 18, 2025 | Time: 12 PM EST | 60 minutes. Save Your Seat - Register Now

In this session, you’ll learn:

  • How multimodal AI differs from traditional AI models.
  • Real business use cases boosting operational efficiency.
  • How multimodal intelligence is reshaping productivity across industries.
  • Key skills and tools your team needs to thrive in 2026 and beyond.

Whether you're a business leader, IT professional, or AI enthusiast, this webinar will give you the insights needed to stay ahead of this rapidly evolving AI revolution.

Reserve your free spot now - limited seats available!

Can Multimodal AI Redefine Productivity in 2026?


Google SRE NYC proudly announces our last Google SRE NYC Tech Talk for 2025.

This event is co-sponsored by sentry.io. Thank you Sentry for your partnership!

Let's bid farewell to 2025 with three amazing interactive short talks on Site Reliability and DevOps topics! As always, the event will include an opportunity to mingle with the speakers and attendees over light snacks and beverages after the talks.

The Meetup will take place on Tuesday, the 16th of December 2025, at 6:00 PM at our Chelsea Market office in NYC. Doors open at 5:30 PM. Please RSVP only if you're able to attend in person; there will be no live streaming.

When RSVP'ing to this event, please enter your full name exactly as it appears on your government issued ID. You will be required to present your ID at check in.

Agenda:

Paul Jaffre - Senior Developer Experience Engineer, sentry.io
One Trace to Rule Them All: Unifying Sentry Errors with OpenTelemetry Tracing

SREs face the challenge of operating reliable observability infrastructure while avoiding vendor lock-in from proprietary APM (Application Performance Monitoring) solutions. OpenTelemetry has become the standard for instrumenting applications, allowing teams to collect traces, metrics, and logs. But raw telemetry data isn't enough: SREs need tools to visualize, debug, and respond to production incidents quickly. Sentry now supports OTLP, enabling teams to send OpenTelemetry data directly to Sentry for analysis. This talk covers how Sentry's OTLP support works in practice: connecting frontend and backend traces across services, correlating logs with distributed traces, and using tools to identify slow queries and performance bottlenecks. We'll discuss the practical benefits for SREs, like faster incident resolution, better cross-team debugging, and the flexibility to change observability backends without re-instrumenting code.

Paul's background spans engineering, product management, UX design, and open source. He has a soft spot for dev tools and loses sleep over making things easy to understand and use. Paul has a dynamic professional background, from strategy to stability. His time at Krossover Intelligence established a strong foundation by blending product management with hands-on development, and he later focused on core reliability at MakerBot, where he implemented automated end-to-end testing and drove performance improvements. He then extended this expertise in stability and scale at Cypress.io, where he served as a Developer Experience Engineer, focusing on improving workflow, contribution, and usability for their widely adopted open-source community.
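
As a rough sketch of the setup the talk describes, here is how an application might export OpenTelemetry traces over OTLP using the standard Python SDK. The endpoint URL and auth header are placeholders (consult Sentry's documentation for the real values); only the OTel SDK calls themselves are standard.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Placeholder endpoint/headers: point these at whichever OTLP-compatible backend you use.
exporter = OTLPSpanExporter(
    endpoint="https://example-otlp-endpoint/v1/traces",
    headers={"x-auth": "YOUR_TOKEN"},
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")

# Spans created here are batched and shipped to the configured OTLP backend, which is
# what lets teams switch observability backends without re-instrumenting their code.
with tracer.start_as_current_span("process-order"):
    pass  # application work goes here
```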

Thiara Ortiz - Cloud Gaming SRE Manager, Netflix
Managing Black Box Systems

SREs often face ambiguity when managing black box systems (LLMs, games, poorly understood dependencies). We will discuss how Netflix monitors service health as black boxes, using multiple measurement techniques to understand system behavior, aligning with the need for robust observability tools. These strategies are crucial for system reliability and user experience. By proactively identifying and resolving issues, we ensure a smoother playback experience and maintain user trust, even as the platform continues to evolve and gain maturity. The principles shared in this talk can be extended to other applications, such as AI reliability in data quality and model deployments.

Thiara has worked at some of the largest internet companies in the world, Meta and Netflix. During her time at Meta, Thiara found a passion for distributed systems and bringing new hardware into production. Always curious to explore new solutions to complex problems, Thiara developed Fleet Scanner, internally known as Lemonaid, to perform memory, compute, and storage benchmarks on every Meta server in production. This service runs on over 5 million servers and continues to be used at Meta. Since leaving Meta, Thiara has worked at Netflix as a Senior CDN Reliability Engineer and now as Cloud Gaming SRE Manager. When incidents occur and Netflix's systems do not behave as expected, Thiara can be found engaging the necessary teams to remediate these issues.

Andrew Espira - Platform and Site Reliability Engineer, Founding Engineer at Kustode
ML-Powered Predictive SRE: Using Behavioral Signals to Prevent Cluster Inefficiencies Before They Impact Production

SREs managing ML clusters often discover resource inefficiencies and queue bottlenecks only after they've impacted production services. This talk presents a machine learning approach to predict these issues before they occur, transforming SRE from reactive firefighting to proactive system optimization. We demonstrate how to build predictive models using production cluster traces that identify two critical failure modes: (1) GPU under-utilization relative to requested resources, and (2) abnormal queue wait times that indicate impending service degradation. SRE practitioners will learn how to extract early warning indicators from standard cluster logs, build ML models that provide actionable confidence scores for operational decisions, and take practical steps to integrate predictive analytics into existing SRE toolchains to achieve a 50%+ reduction in resource waste and queue-related incidents. This talk bridges the gap between traditional SRE observability and modern predictive analytics, showing how teams can evolve from reactive monitoring to intelligent, forward-looking reliability engineering.

Andrew has over 8 years of experience architecting and maintaining large-scale distributed systems. He is the Founding Engineer of Kustode (kustode.com), where he develops cutting-edge reliability and observability solutions for modern infrastructure in the insurance and healthcare space. Currently pursuing graduate studies in Data Science at Saint Peter's University, he specializes in the intersection of reliability engineering and artificial intelligence. His research focuses on applying machine learning to operational challenges, with publications in peer-reviewed venues including ScienceDirect. He's passionate about making complex systems more predictable and maintainable through data-driven approaches. When not optimizing cluster performance or building the next generation of observability tools, Andrew enjoys contributing to open-source projects and mentoring early-career engineers in the SRE community.
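
As a loose sketch of the kind of predictive model the talk describes (the feature names, synthetic labels, and thresholds below are assumptions, not the speaker's actual pipeline), one could train a classifier on historical cluster-trace features and surface a confidence score that operators or automation can act on:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy historical traces: [requested_gpus, cluster_utilization, pending_jobs, avg_job_minutes]
X = rng.random((500, 4)) * [8, 1.0, 50, 120]
# Synthetic label: 1 if the job later waited "abnormally" long in the queue.
y = ((X[:, 1] > 0.8) & (X[:, 2] > 30)).astype(int)

model = GradientBoostingClassifier().fit(X, y)

# At decision time, each pending job gets a risk score with an operational threshold.
new_job = np.array([[4, 0.9, 42, 60]])
risk = model.predict_proba(new_job)[0, 1]
if risk > 0.7:  # placeholder threshold
    print(f"High risk of queue delay ({risk:.0%}); consider preemptive rebalancing.")
else:
    print(f"Queue risk looks low ({risk:.0%}).")
```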

Our Tech Talks series is for professional development and networking: no recruiters, sales, or press, please! Google is committed to providing a harassment-free and inclusive conference experience for everyone, and all participants must follow our Event Community Guidelines. The event will be photographed and video recorded.

Event space is limited! A reservation is required to attend. Reserve your spot today and share the event details with your SRE/DevOps friends 🙂

Google NY Site Reliability Engineering (SRE) Tech Talks, 16 Dec 2025
AI Community Wrap Up Party! 2025-12-16 · 18:00

December’s always a wild month, with company parties, endless socials, and too many events to count. So instead of competing for dates, we’re joining forces to throw one big, community-wide celebration to wrap up the year in style.

SIGN UP using this link - https://luma.com/2025wrapparty?tk=flS0rp

What to expect

A laid-back evening of great people, good drinks, and end-of-year energy. We'll have food, drinks (yes, possibly a cocktail bar 🍸), and a few surprises: think lighthearted awards, games, and a chance to meet folks from across London's AI and tech scene.

Whether you’ve been building, learning, demoing, or just cheering from the sidelines this year, this one’s for you. Join us to celebrate what the community’s built together and toast to what’s next.


RSVP now and join us for the final send-off of 2025’s community calendar. Because honestly, one great party beats ten separate ones! Spots are limited.

Huge thank you to all our community partners:

  • AI Native Dev
  • AI Demo Days
  • Hugging Face Meetup
  • LLM London
  • AI For The Rest of Us
  • JavaScript London
AI Community Wrap Up Party!

Join us at PyData Huddersfield for an exciting evening focused on the future of intelligent software systems, where AI not only supports decision making but actively monitors, reasons, and heals itself in real time.

Modern organisations move fast. Code deploys multiple times a day. Pipelines break unexpectedly. Requirements shift constantly. To stay ahead, we must build systems that can understand business goals and recover from failure autonomously.

In this session two expert speakers will explore how AI is transforming the entire software lifecycle from defining the problem correctly to building pipelines that fix issues before humans even notice.

Talks and Speakers

Building Self-Healing Software Pipelines, by Okwuchi Nneka Uzoigwe - Software Engineer

Discover how machine-learning-powered monitoring and intelligent agents can detect errors, diagnose root causes, and automatically trigger recovery actions to keep services running smoothly.
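As a hedged illustration of the detect-diagnose-recover loop the talk describes, the sketch below watches a pipeline's recent failure rates and triggers a placeholder recovery action when they spike; the metric source, anomaly threshold, and recovery hook are all assumptions.

```python
# Sketch of a self-healing loop: detect -> diagnose -> recover.
import statistics
import subprocess
import time


def recent_failure_rates() -> list[float]:
    """Placeholder: fetch per-run failure rates from your pipeline's metrics store."""
    return [0.02, 0.01, 0.03, 0.25]  # illustrative values


def diagnose(rates: list[float]) -> str:
    """Toy root-cause heuristic; real systems would inspect logs or traces."""
    return "sudden failure spike" if rates[-1] > 3 * statistics.mean(rates[:-1]) else "unknown"


def recover() -> None:
    """Placeholder recovery action, e.g. re-running the last pipeline stage."""
    subprocess.run(["echo", "retrying failed stage"], check=True)


if __name__ == "__main__":
    while True:
        rates = recent_failure_rates()
        if rates[-1] > 0.1:  # anomaly threshold is an assumption
            print("Detected:", diagnose(rates))
            recover()
        time.sleep(60)  # polling interval is an assumption
```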

From Requirements to Reasoning: Translating Business Goals into AI Solutions, by Chinazor Prisca Amajuoyi - AI Business Analyst

Learn how to convert high-level business objectives into data-driven AI systems that deliver meaningful, measurable impact.

Event Details: Friday 12th December 2025, 5.00pm to 6.00pm BST, Meltham, HD9, West Yorkshire. Networking and Q&A included. Register on Meetup to secure your spot.

Who Should Attend

  • Engineers and DevOps practitioners
  • Data scientists and machine learning engineers
  • Business analysts and product managers
  • Anyone curious about practical applications of AI in production

Whether you are scaling a production system or exploring AI-driven automation for the first time, this session will show you what is possible today and what is coming next.

PyData Huddersfield December Edition 2025: Turning Ideas into Working AI

Join our virtual meetup to hear talks from experts on cutting-edge topics across Visual AI for Physical AI use cases.

Date, Time and Location

Dec 11, 2025 9:00-11:00 AM Pacific Online. Register for the Zoom!

From Data to Open-World Autonomous Driving

Data is key for advances in machine learning, including mobile applications like robots and autonomous cars. To ensure reliable operation, the scenarios that occur in deployment must be reflected in the underlying dataset. Since open-world environments can contain unknown scenarios and novel objects, active learning from online data collection and handling of unknowns are required. In this talk we discuss different approaches to addressing these real-world requirements.

About the Speaker

Sebastian Schmidt is a PhD student in the Data Analytics and Machine Learning group at TU Munich and part of an industrial PhD program with the BMW research group. His work mainly focuses on open-world active learning and perception for autonomous vehicles.

From Raw Sensor Data to Reliable Datasets: Physical AI in Practice

Modern mobility systems rely on massive, high-quality multimodal datasets — yet real-world data is messy. Misaligned sensors, inconsistent metadata, and uneven scenario coverage can slow development and lead to costly model failures. The Physical AI Workbench, built in collaboration between Voxel51 and NVIDIA, provides an automated and scalable pipeline for auditing, reconstructing, and enriching autonomous driving datasets.

In this talk, we’ll show how FiftyOne serves as the central interface for inspecting and validating sensor alignment, scene structure, and scenario diversity, while NVIDIA Neural Reconstruction (NuRec) enables physics-aware reconstruction directly from real-world captures. We’ll highlight how these capabilities support automated dataset quality checks, reduce manual review overhead, and streamline the creation of richer datasets for model training and evaluation.

Attendees will gain insight into how Physical AI workflows help mobility teams scale, improve dataset reliability, and accelerate iteration from data capture to model deployment — without rewriting their infrastructure.
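For readers unfamiliar with FiftyOne, a minimal sketch of using it as an inspection interface might look like the following; the dataset path and format are placeholders, and the automated quality checks described in the talk are not reproduced here.

```python
# Minimal FiftyOne usage sketch: load samples and open the app for inspection.
import fiftyone as fo

# Placeholder: an image directory standing in for real multimodal captures.
dataset = fo.Dataset.from_dir(
    dataset_dir="/path/to/frames",
    dataset_type=fo.types.ImageDirectory,
    name="av-audit-demo",
)
dataset.compute_metadata()        # basic per-sample metadata (size, dimensions)
session = fo.launch_app(dataset)  # interactive inspection in the browser
session.wait()
```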

About the Speaker

Daniel Gural leads technical partnerships at Voxel51, where he’s building the Physical AI Workbench, a platform that connects real-world sensor data with realistic simulation to help engineers better understand, validate, and improve their perception systems. He has a background in developer relations and computer vision engineering.

Building Smarter AV Simulation with Neural Reconstruction and World Models

This talk explores how neural reconstruction and world models are coming together to create richer, more dynamic simulation for scalable autonomous vehicle development. We’ll look at the latest releases in 3D Gaussian splatting techniques and world reasoning and generation, as well as discuss how these technologies are advancing the deployment of autonomous driving stacks that can generalize to any environment. We’ll also cover NVIDIA open models, frameworks, and data to help kickstart your own development pipelines.

About the Speaker

Katie Washabaugh is NVIDIA’s Product Marketing Manager for Autonomous Vehicle Simulation, focusing on virtual solutions for real-world mobility. A former journalist at publications such as Automotive News and MarketWatch, she joined the NVIDIA team in 2018 as Automotive Content Marketing Manager. Katie holds a B.A. in public policy from the University of Michigan and lives in Detroit.

Relevance of Classical Algorithms in Modern Autonomous Driving Architectures

While modern autonomous driving systems increasingly rely on machine learning and deep neural networks, classical algorithms continue to play a foundational role in ensuring reliability, interpretability, and real-time performance. Techniques such as Kalman filtering, A* path planning, PID control, and SLAM remain integral to perception, localization, and decision-making modules. Their deterministic nature and lower computational overhead make them especially valuable in safety-critical scenarios and resource-constrained environments. This talk explores the enduring relevance of classical algorithms, their integration with learning-based methods, and their evolving scope in the context of next-generation autonomous vehicle architectures.
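As a small, self-contained reminder of what such a classical component looks like, here is a textbook 1D Kalman filter sketch (our illustration, not the speaker's code); the noise parameters and simulated measurements are arbitrary assumptions.

```python
# Textbook 1D Kalman filter: fuse noisy position measurements into a
# smoothed estimate, the kind of deterministic, low-overhead component
# still common in localization stacks.
import numpy as np


def kalman_1d(measurements, process_var=1e-3, meas_var=0.5):
    x, p = 0.0, 1.0  # initial state estimate and variance
    estimates = []
    for z in measurements:
        # Predict: state unchanged, uncertainty grows by process noise.
        p += process_var
        # Update: blend prediction and measurement by the Kalman gain.
        k = p / (p + meas_var)
        x += k * (z - x)
        p *= (1 - k)
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = 5.0
    noisy = truth + rng.normal(0, 0.7, size=50)  # simulated sensor readings
    print(kalman_1d(noisy)[-1])  # converges toward the true value
```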

Prajwal Chinthoju is an Autonomous Driving Feature Development Engineer with a strong foundation in systems engineering, optimization, and intelligent mobility. He specializes in integrating classical algorithms with modern AI techniques to enhance perception, planning, and control in autonomous vehicle platforms.

Dec 11 - Visual AI for Physical AI Use Cases

We're super excited for our last meetup of 2025 - just before the holidays start, we're back in Amsterdam, and this time at AI House Amsterdam powered by Prosus, on Wednesday 10 December!

This edition will be extra interesting, since it will be about Profitability AI: Build it right. Make it fast. Keep it cheap.

AI projects don’t become business-critical overnight. They move through stages: from spark-of-an-idea prototypes to hardened, scalable systems that drive real revenue. In this meetup edition, we invite industry leaders to share the technical journeys their AI projects went through before becoming part of their core business. This evening is all about what it actually takes to build profitable AI: not just using the latest models, but creating systems that are efficient, scalable, operationally reliable, and deliver measurable value. You’ll hear how teams navigate the messy middle, from architecture choices to optimization strategies, to transform AI from a cool demo into a cost-effective production engine. Expect an honest look at real-world trade-offs, engineering challenges, and the solutions that made their AI both powerful and economical.

Excited as well?! We'd love to welcome you for an evening full of knowledge sharing, demos, great conversations, networking, and above all fun with the PyData community!

Agenda

  • 18:00 - 19:00: Walk-in with drinks & food
  • 19:00 - 19:45: Talk 1 - Scaling Personalized Push Notifications by Floris Fok
  • 19:45 - 20:00: Short break
  • 20:00 - 20:45: Talk 2 - LLM distillation explained: Make smarter, cheaper, and deployable AI for enterprises by Mashrur Haider
  • 20:45 - 21:30: Networking + drinks & bites

Talk 1 : Scaling Personalized Push Notifications by Floris Fok

This talk explores how we productionize personalized push notifications at scale, moving from proof-of-concept to serving 130 billion tokens per day to nearly half of Brazil's population. We'll share the journey from traditional CRM systems to personalization-powered notifications, covering the data processing pipeline, key architectural decisions, and operational challenges. Learn about the trade-offs we navigated between latency and personalization depth, how we achieved a cost per order under 10 cents, and practical insights into productionizing foundation models for commerce.

Floris Fok is a Senior AI Engineer at Prosus Group, specializing in Generative AI. He helped develop Europe's second foundational model, Climate GPT, and has over 4 years of NLP experience spanning technologies from BERT to DeepSeek. Floris played a role in the development of Toqan and has been utilizing it since its early days.

Talk 2 : LLM distillation explained: Make smarter, cheaper, and deployable AI for enterprises by Mashrur Haider

Running large LLMs in production is expensive, but often unnecessary. In this masterclass, Mashrur Haider breaks down how distillation, a popular post-training technique, can cut inference costs by up to 70% while maintaining enterprise-grade performance. You’ll learn how distillation compares to quantization and fine-tuning, backed by real benchmarks.

Key takeaways:

  • Distillation 101: how it works and why enterprises use it.
  • Benchmarks: cost savings without accuracy trade-offs.
  • Workflow: from data prep to deployment on Nebius Token Factory.
  • Scaling: running distilled models in production with compliance and reliability.
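To ground the "Distillation 101" takeaway, here is a minimal PyTorch-style sketch of the standard teacher-student loss (softened KL term plus hard-label cross-entropy); the temperature, mixing weight, and random tensors standing in for model outputs are illustrative assumptions, not Nebius-specific code.

```python
# Minimal knowledge-distillation loss sketch (teacher -> student).
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft targets from the teacher with hard ground-truth labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard


# Toy usage with random tensors standing in for real model outputs.
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```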

Mashrur Haider is a Tech PM at Nebius AI Studio with a deep healthcare background (BSc Genetics, Stony Brook; MSc Bioinformatics & ML, University of Amsterdam). He has done research at the Netherlands Cancer Institute, worked in Advanced R&D at Philips IGT Systems, and operated a VC-backed techbio startup. At Nebius Token Factory, he translates real customer needs into scalable, user-friendly products aimed at model customisation and dedicated inference.

Directions: The venue for this meetup is AI House Amsterdam, located at the Prosus Global Headquarters (Gustav Mahlerplein 5, 1082 MS Amsterdam). AI House Amsterdam is conveniently located next to Amsterdam Zuid train station (a 3-minute walk).

Profitability AI: Build it right. Make it fast. Keep it cheap.