talk-data.com
People (147 results)
Activities & events
ODSC AI East 2026 | The #1 AI Builders Conference
2026-04-28 · 13:00
This is a paid event and pre-registration is required. RSVP here: https://luma.com/fzypluc8. Use code COMMUNITYEAST2026 for an extra discount. Where the global AI community unites to build the future. Get ready to deepen your expertise, forge powerful connections, and stay at the absolute forefront of artificial intelligence. ODSC AI East is returning to Boston, MA at the Hynes Convention Center, April 28–30, for three days dedicated to practical, hands-on learning and community growth. You may also join the virtual conference. More details here: https://odsc.ai/east/
This is more than just a conference; it's the essential event for data science practitioners, AI builders, technical leaders, and anyone looking to pivot into an AI-driven career. With 300+ hours of content from 250+ expert speakers, you'll gain job-ready skills and strategic insights you can implement immediately.
Why You Need to Be Here: Learning and Value. At ODSC AI East, we prioritize tangible value and immersive learning, ensuring you walk away certified and skilled. Our expansive agenda is packed with cutting-edge workshops and deep-dive tutorials across the most in-demand domains:
Technical Tracks: Data Engineering | Physical AI | AI for BioPharma & Health | LLMs, GenAI & RAG | Agentic AI & Workflow Automation | Keynotes & Industry Leadership | Data Science, Machine Learning & MLOps | AI Engineering & AIOps
Non-Technical Tracks: AI Strategy | AI Risk & Governance | Agentic AI for Enterprise | AI Products & Innovation | AI & Future of Work | Executive Track | AI Founder Track
Beyond the Talks: Our Expanded Community Events. The true value of ODSC AI East is the opportunity to connect and collaborate. Your pass unlocks a rich ecosystem of co-located events and unique networking opportunities designed for every career level.
Join us in celebrating our community's pursuit of knowledge, inclusivity, and fairness as we work together to move the world of data science forward. Ready to build better AI? Find your perfect pass and secure your spot at ODSC AI East 2026 today!
Event: ODSC AI East 2026 | The #1 AI Builders Conference
Aniket Roy
– PhD student in Computer Science
@ Johns Hopkins University
We tackle the challenge of jointly personalizing content and style from a few examples. A promising approach is to train separate Low-Rank Adapters (LoRA) and merge them effectively, preserving both content and style. Existing methods, such as ZipLoRA, treat content and style as independent entities, merging them by learning masks in LoRA's output dimensions. However, content and style are intertwined, not independent. To address this, we propose DuoLoRA, a content-style personalization framework featuring three key components: (i) rank-dimension mask learning, (ii) effective merging via layer priors, and (iii) Constyle loss, which leverages cycle-consistency in the merging process. First, we introduce ZipRank, which performs content-style merging within the rank dimension, offering adaptive rank flexibility and significantly reducing the number of learnable parameters. Additionally, we incorporate SDXL layer priors to apply implicit rank constraints informed by each layer's content-style bias and adaptive merger initialization, enhancing the integration of content and style. To further refine the merging process, we introduce Constyle loss, which leverages the cycle-consistency between content and style. Our experimental results demonstrate that DuoLoRA outperforms state-of-the-art content-style merging methods across multiple benchmarks.
Event: Nov 24 - Best of ICCV (Day 4)
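For readers unfamiliar with rank-dimension LoRA merging, here is a minimal NumPy sketch of blending two LoRA factor pairs rank-by-rank with a sigmoid mask, roughly the kind of operation the abstract's ZipRank component works on. This is not the authors' implementation: the `merge_loras` helper, the shapes, and the sigmoid parameterization are assumptions for the example, and the paper additionally learns the mask with layer priors and the Constyle cycle-consistency loss.

```python
import numpy as np

def merge_loras(A_content, B_content, A_style, B_style, mask_logits):
    """Blend two LoRA adapters rank-by-rank (illustrative only).

    Each LoRA is a low-rank update W ~= B @ A with A: (rank, d_in) and
    B: (d_out, rank). A sigmoid over `mask_logits` (one logit per rank
    index) decides how much each rank component takes from the content
    adapter versus the style adapter.
    """
    m = 1.0 / (1.0 + np.exp(-mask_logits))           # (rank,), values in [0, 1]
    delta_content = (B_content * m) @ A_content       # content components, masked per rank
    delta_style = (B_style * (1.0 - m)) @ A_style     # style components, complementary mask
    return delta_content + delta_style                # merged low-rank update, (d_out, d_in)

# Toy usage: random rank-4 adapters for a 16x32 weight; zero logits give a 50/50 blend.
rng = np.random.default_rng(0)
rank, d_in, d_out = 4, 32, 16
A_c, A_s = rng.normal(size=(rank, d_in)), rng.normal(size=(rank, d_in))
B_c, B_s = rng.normal(size=(d_out, rank)), rng.normal(size=(d_out, rank))
delta_W = merge_loras(A_c, B_c, A_s, B_s, mask_logits=np.zeros(rank))
print(delta_W.shape)  # (16, 32)
```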
Rethinking Few Shot CLIP Benchmarks: A Critical Analysis in the Inductive Setting
2025-11-24 · 17:00
Alexey Kravets
– PhD student in AI
@ University of Bath
CLIP is a foundational model with transferable classification performance in the few-shot setting. Several methods have shown improved performance of CLIP using few-shot examples. However, so far, all these techniques have been benchmarked using standard few-shot datasets. We argue that this mode of evaluation does not provide a true indication of the inductive generalization ability using few-shot examples. As most datasets have been seen by the CLIP model, the resultant setting can be termed as partially transductive. To solve this, we propose a pipeline that uses an unlearning technique to obtain true inductive baselines. In this new inductive setting, the methods show a significant drop in performance (-55% on average among 13 baselines with multiple datasets). We validate the unlearning technique using oracle baselines. An improved few-shot classification technique is proposed that consistently obtains state-of-the-art performance over 13 other recent baseline methods on a comprehensive analysis with 5880 experiments - varying the datasets, differing number of few-shot examples, unlearning setting, and with different seeds. Thus, we identify the issue with the evaluation of CLIP-based few-shot classification, provide a solution using unlearning, propose new benchmarks, and provide an improved method.
Event: Nov 24 - Best of ICCV (Day 4)
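For context on the few-shot CLIP setting being benchmarked, the sketch below implements a nearest-class-mean classifier over precomputed, L2-normalized CLIP image embeddings, one of the simplest baselines of this kind. It illustrates the evaluation setup only, not the proposed method or the unlearning pipeline; the random embeddings and shapes are placeholders.

```python
import numpy as np

def few_shot_accuracy(support_feats, support_labels, query_feats, query_labels):
    """Nearest-class-mean few-shot classification over precomputed CLIP
    image embeddings (a simple baseline of the kind benchmarked here).

    support_feats: (n_support, d) L2-normalized embeddings of the few-shot examples
    query_feats:   (n_query, d)   L2-normalized embeddings of the test images
    """
    classes = np.unique(support_labels)
    # Class prototype = mean of the few-shot embeddings for that class.
    prototypes = np.stack([support_feats[support_labels == c].mean(axis=0) for c in classes])
    prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)
    # Cosine similarity (dot product of normalized vectors), argmax over classes.
    preds = classes[np.argmax(query_feats @ prototypes.T, axis=1)]
    return float((preds == query_labels).mean())

# Toy example: 3 classes, 5 shots each, 30 queries, 512-d embeddings.
rng = np.random.default_rng(0)
d, shots, n_query = 512, 5, 30
support_labels = np.repeat(np.arange(3), shots)
support_feats = rng.normal(size=(len(support_labels), d))
support_feats /= np.linalg.norm(support_feats, axis=1, keepdims=True)
query_labels = rng.integers(0, 3, size=n_query)
query_feats = rng.normal(size=(n_query, d))
query_feats /= np.linalg.norm(query_feats, axis=1, keepdims=True)
print(few_shot_accuracy(support_feats, support_labels, query_feats, query_labels))
```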
UnMix-NeRF: Spectral Unmixing Meets Neural Radiance Fields
2025-11-24 · 17:00
Fabian Perez
– computer science student
@ Universidad Industrial de Santander (UIS)
Neural Radiance Field (NeRF)-based segmentation methods focus on object semantics and rely solely on RGB data, lacking intrinsic material properties. We introduce UnMix-NeRF, a framework that integrates spectral unmixing into NeRF, enabling joint hyperspectral novel view synthesis and unsupervised material segmentation. Our method models spectral reflectance via diffuse and specular components, where a learned dictionary of global endmembers represents pure material signatures, and per-point abundances capture their distribution. For material segmentation, we use spectral signature predictions along learned endmembers, allowing unsupervised material clustering. Additionally, UnMix-NeRF enables scene editing by modifying learned endmember dictionaries for flexible material-based appearance manipulation. Extensive experiments validate our approach, demonstrating superior spectral reconstruction and material segmentation compared to existing methods.
Event: Nov 24 - Best of ICCV (Day 4)
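The linear mixing model behind the abstract, a global endmember dictionary combined with per-point abundances, can be illustrated in a few lines. This is only a toy sketch of the mixing and of argmax-based material assignment, not the UnMix-NeRF implementation; the shapes, the softmax parameterizations, and the random data are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_points, n_endmembers, n_bands = 1000, 4, 32

# Global dictionary: M endmembers, each a B-band spectrum (toy positive, normalized signatures).
endmembers = softmax(rng.normal(size=(n_endmembers, n_bands)))

# Per-point abundances (non-negative, sum to one) from unconstrained logits.
abundances = softmax(rng.normal(size=(n_points, n_endmembers)))   # (n_points, n_endmembers)

# Linear mixing: each point's spectrum is an abundance-weighted combination
# of the endmember signatures (the diffuse term in the abstract).
spectra = abundances @ endmembers                                  # (n_points, n_bands)

# Unsupervised "material segmentation": assign each point to its dominant endmember.
material_id = abundances.argmax(axis=1)
print(spectra.shape, np.bincount(material_id))
```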
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
2025-11-24 · 17:00
Shijie Zhou
– final-year PhD candidate
@ UCLA
Are Vision-Language Models Ready for Physical AI? Humans easily understand how objects move, rotate, and shift, while current AI models that connect vision and language still make mistakes in what seem like simple situations: deciding “left” versus “right” when something is moving, recognizing how perspective changes, or keeping track of motion over time. To reveal these kinds of limitations, we created VLM4D, a testing suite made up of real-world and synthetic videos, each paired with questions about motion, rotation, perspective, and continuity. When we put modern vision-language models through these challenges, they performed far below human levels, especially when visual cues must be combined or the sequence of events must be maintained. But there is hope: new methods such as reconstructing visual features in 4D and fine-tuning focused on space and time show noticeable improvement, bringing us closer to AI that truly understands a dynamic physical world.
Event: Nov 24 - Best of ICCV (Day 4)
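As a rough sketch of the evaluation protocol described above (clips paired with questions, scored against ground-truth answers), the snippet below uses a stubbed `answer_question` function in place of a real vision-language model. The record format, file names, and the stub are hypothetical, not the VLM4D release.

```python
from dataclasses import dataclass

@dataclass
class Item:
    video_path: str   # real-world or synthetic clip
    question: str     # e.g. about motion, rotation, perspective, or continuity
    choices: list     # multiple-choice options
    answer: str       # ground-truth option

def answer_question(video_path, question, choices):
    """Stub standing in for a vision-language model; a real evaluation would call the model here."""
    return choices[0]

def evaluate(items):
    correct = sum(answer_question(i.video_path, i.question, i.choices) == i.answer for i in items)
    return correct / len(items)

items = [
    Item("clip_001.mp4", "Is the car turning left or right?", ["left", "right"], "right"),
    Item("clip_002.mp4", "Does the camera orbit the object clockwise?", ["yes", "no"], "yes"),
]
print(f"accuracy = {evaluate(items):.2f}")
```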
Forecasting Continuous Non-Conservative Dynamical Systems in SO(3)
2025-11-24 · 17:00
Lennart Bastian
– PhD candidate at TU Munich's CAMP lab
@ TU Munich, CAMP Lab
Tracking and forecasting the rotation of objects is fundamental in computer vision and robotics, yet SO(3) extrapolation remains challenging as (1) sensor observations can be noisy and sparse, (2) motion patterns can be governed by complex dynamics, and (3) application settings can demand long-term forecasting. This work proposes modeling continuous-time rotational object dynamics on SO(3) using Neural Controlled Differential Equations guided by Savitzky-Golay paths. Unlike existing methods that rely on simplified motion assumptions, our method learns a general latent dynamical system of the underlying object trajectory while respecting the geometric structure of rotations. Experimental results on real-world data demonstrate compelling forecasting capabilities compared to existing approaches.
Event: Nov 24 - Best of ICCV (Day 4)
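To make the "geometric structure of rotations" point concrete, the sketch below integrates a hand-coded angular-velocity model on SO(3) via the exponential map, so every forecast step remains a valid rotation. It is not the Neural Controlled Differential Equation / Savitzky-Golay method from the talk; the `rollout` interface and the toy dynamics are assumptions.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix (an so(3) element)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def expm_so3(w):
    """Rodrigues' formula: exponential map from so(3) to SO(3)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def rollout(R0, angular_velocity, n_steps, dt=0.05):
    """Forecast a rotation trajectory by integrating body-frame angular
    velocities on the manifold, so each step stays a valid rotation."""
    R, traj = R0, [R0]
    for t in range(n_steps):
        w = angular_velocity(t * dt)        # here: a fixed toy dynamics model
        R = R @ expm_so3(w * dt)
        traj.append(R)
    return traj

# Toy dynamics: a spin whose axis slowly tilts over time.
traj = rollout(np.eye(3), lambda t: np.array([0.1, 1.0 + 0.2 * np.sin(t), 0.0]), n_steps=100)
print(np.allclose(traj[-1] @ traj[-1].T, np.eye(3), atol=1e-6))  # forecast is still orthonormal
```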