talk-data.com

People (1 result)

Activities & events

Tarek A. Atwan – author

Perform time series analysis and forecasting confidently with this Python code bank and reference manual. Purchase of the print or Kindle book includes a free PDF eBook.

Key Features

• Explore up-to-date forecasting and anomaly detection techniques using statistical, machine learning, and deep learning algorithms
• Learn different techniques for evaluating, diagnosing, and optimizing your models
• Work with a variety of complex data with trends, multiple seasonal patterns, and irregularities

Book Description

To use time series data to your advantage, you need to be well-versed in data preparation, analysis, and forecasting. This fully updated second edition includes chapters on probabilistic models and signal processing techniques, as well as new content on transformers. Additionally, you will leverage popular libraries and their latest releases, covering pandas, Polars, sktime, statsmodels, statsforecast, Darts, and Prophet for time series, with new and relevant examples. You'll start by ingesting time series data from various sources and formats, and learn strategies for handling missing data, dealing with time zones and custom business days, and detecting anomalies using intuitive statistical methods. Further, you'll explore forecasting using classical statistical models (Holt-Winters, SARIMA, and VAR). Learn practical techniques for handling non-stationary data, using power transforms, ACF and PACF plots, and decomposing time series data with multiple seasonal patterns. Then we will move into more advanced topics such as building ML and DL models using TensorFlow and PyTorch, and explore probabilistic modeling techniques. In this part, you'll also learn how to evaluate, compare, and optimize models, making sure that you finish this book well-versed in wrangling data with Python.

What you will learn

• Understand what makes time series data different from other data
• Apply imputation and interpolation strategies to handle missing data
• Implement an array of models for univariate and multivariate time series
• Plot interactive time series visualizations using hvPlot
• Explore state-space models and the unobserved components model (UCM)
• Detect anomalies using statistical and machine learning methods
• Forecast complex time series with multiple seasonal patterns
• Use conformal prediction for constructing prediction intervals for time series

Who this book is for

This book is for data analysts, business analysts, data scientists, data engineers, and Python developers who want practical Python recipes for time series analysis and forecasting techniques. Fundamental knowledge of Python programming is a prerequisite. Prior experience working with time series data to solve business problems will also help you to better utilize and apply the different recipes in this book.
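As a flavor of the kind of recipe described above, here is a minimal, self-contained sketch (not taken from the book) showing two of the listed tasks with pandas and statsmodels: interpolating missing values and producing a classical Holt-Winters forecast.

```python
# Minimal sketch (not from the book): impute missing values and fit a
# Holt-Winters model on a toy monthly series, two of the tasks listed above.
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Toy monthly series with trend, yearly seasonality, and a few gaps
idx = pd.date_range("2020-01-01", periods=48, freq="MS")
y = pd.Series(100 + 0.5 * np.arange(48) + 10 * np.sin(2 * np.pi * np.arange(48) / 12),
              index=idx)
y.iloc[[5, 17, 30]] = np.nan           # simulate missing observations

y = y.interpolate(method="time")       # one imputation strategy among several

# Classical forecast with additive trend and seasonality (Holt-Winters)
fit = ExponentialSmoothing(y, trend="add", seasonal="add", seasonal_periods=12).fit()
print(fit.forecast(6))                 # forecast the next six months
```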

data data-science data-science-tasks statistics time-series AI/ML Pandas Polars Python PyTorch TensorFlow
O'Reilly Data Science Books

The time series machine learning community has begun adopting foundational models for forecasting and anomaly detection. These models, such as TimeGPT, MOMENT, Moirai, and Chronos, offer zero-shot learning and promise to accelerate the development of AI use cases.

In this talk, we'll explore two popular foundational models, TimeGPT and MOMENT, for Time Series Anomaly Detection (TSAD). We'll specifically focus on the Novelty Detection flavor of TSAD, where we only have access to nominal (normal) data and the goal is to detect deviations from this norm.

TimeGPT and MOMENT take fundamentally different approaches to novelty detection.

• TimeGPT uses a forecasting-based method, tracking observed data against its forecasted confidence intervals. An anomaly is flagged when an observation falls sufficiently outside these intervals.

• MOMENT, an open-source model, uses a reconstruction-based approach. The model first encodes nominal data, then characterizes the reconstruction errors. During inference, it compares the test data's reconstruction error to these characterized values to identify anomalies.
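For intuition about the two rules above, here is a toy sketch in plain NumPy. It is not the TimeGPT or MOMENT API: the point forecast, the prediction interval, and the reconstruction errors are stand-ins for whatever a foundation model would actually return.

```python
# Toy illustration of the two novelty-detection rules described above.
# NOT the TimeGPT or MOMENT API: forecasts, intervals, and reconstruction
# errors are placeholders for real model outputs.
import numpy as np

rng = np.random.default_rng(0)
y_obs = rng.normal(0.0, 1.0, size=200)
y_obs[120] = 6.0                        # injected anomaly

# 1) Forecasting-based rule (TimeGPT-style): flag observations that fall
#    outside the forecasted prediction interval.
y_hat = np.zeros_like(y_obs)            # stand-in point forecast
lo, hi = y_hat - 3.0, y_hat + 3.0       # stand-in 99% prediction interval
forecast_flags = (y_obs < lo) | (y_obs > hi)

# 2) Reconstruction-based rule (MOMENT-style): characterize reconstruction
#    errors on nominal data, then flag test errors above that threshold.
nominal_err = np.abs(rng.normal(0.0, 1.0, size=5000))   # stand-in nominal errors
threshold = np.quantile(nominal_err, 0.999)
test_err = np.abs(y_obs - y_hat)                         # stand-in reconstruction error
recon_flags = test_err > threshold

print(np.flatnonzero(forecast_flags), np.flatnonzero(recon_flags))
```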

We'll detail these approaches using the UCR anomaly detection dataset. The talk will highlight potential pitfalls when using these models and compare them with traditional TSAD algorithms.

This talk is geared toward data scientists interested in the nuances of applying foundational models for TSAD. No prior knowledge of time series anomaly detection or foundational models is required.

AI/ML
PyData Boston 2025

In industries like energy and retail, forecasting often requires local models, since each time series has its own behavior. However, training and managing thousands of such models presents scalability and operational challenges. This talk shows how we scaled local models on Databricks by leveraging the Pandas API on Spark, and shares practical lessons on storage, reuse, and scaling to make this approach efficient when it's truly needed.
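As a rough sketch of the per-series pattern, grouped pandas execution on Spark looks roughly like the following. This is illustrative only, not the speakers' actual pipeline: the column names, paths, and the toy linear-trend "model" are assumptions.

```python
# Illustrative sketch only: one tiny local model per series via grouped pandas
# execution on Spark. Column names (series_id, ds, y), paths, and the toy
# linear-trend "model" are assumptions, not the speakers' pipeline.
import numpy as np
import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sdf = spark.read.parquet("/path/to/series")        # columns: series_id, ds, y

def forecast_one(pdf: pd.DataFrame) -> pd.DataFrame:
    # Runs on a worker for a single series; plain pandas/NumPy inside.
    pdf = pdf.sort_values("ds")
    t = np.arange(len(pdf))
    slope, intercept = np.polyfit(t, pdf["y"].to_numpy(), 1)
    future_t = np.arange(len(pdf), len(pdf) + 7)    # 7 steps ahead
    return pd.DataFrame({
        "series_id": pdf["series_id"].iloc[0],
        "step": np.arange(1, 8),
        "yhat": intercept + slope * future_t,
    })

forecasts = (sdf.groupBy("series_id")
                .applyInPandas(forecast_one, schema="series_id string, step long, yhat double"))
forecasts.write.mode("overwrite").parquet("/path/to/forecasts")
```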

API Databricks Pandas Spark
PyData Eindhoven 2025

Multimodal deep learning models continue improving rapidly, but creating real-world applications that effectively leverage multiple data types remains challenging. This hands-on tutorial covers model selection, embedding storage, fine-tuning, and production deployment through two practical examples: a historical manuscript search system and flood forecasting with satellite imagery and time series data.
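At its core, the search side of a system like the manuscript example reduces to nearest-neighbor lookup over stored embeddings. The sketch below is a generic illustration, not the tutorial's code; random vectors stand in for real multimodal embeddings.

```python
# Generic embedding-search sketch (not the tutorial's code): random vectors
# stand in for real image/text embeddings produced by a multimodal model.
import numpy as np

rng = np.random.default_rng(42)
doc_embeddings = rng.normal(size=(1000, 512))    # e.g. one vector per manuscript page
doc_embeddings /= np.linalg.norm(doc_embeddings, axis=1, keepdims=True)

def search(query_embedding: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k most similar stored embeddings."""
    q = query_embedding / np.linalg.norm(query_embedding)
    scores = doc_embeddings @ q                  # cosine similarity on unit vectors
    return np.argsort(scores)[::-1][:k]

print(search(rng.normal(size=512)))
```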

LLM
PyData Boston 2025
Regina M. Baker – author

Explore this indispensable and comprehensive guide to time series analysis for students and practitioners in a wide variety of disciplines.

Applied Time Series Analysis for the Social Sciences: Specification, Estimation, and Inference delivers an accessible guide to time series analysis that includes both theory and practice. The coverage spans developments from ARIMA intervention models and generalized least squares to the London School of Economics (LSE) approach and vector autoregression. Designed to break difficult concepts into manageable pieces while offering plenty of examples and exercises, the author demonstrates the use of lag operator algebra throughout to provide a better understanding of dynamic specification and the connections between model specifications that appear to be more different than they are. The book is ideal for those with minimal mathematical experience, is intended to follow a course in multiple regression, and includes exercises designed to build general skills such as mathematical expectation calculations to derive means and variances.

Readers will also benefit from the inclusion of:

• A focus on social science applications and a mix of theory and detailed examples provided throughout
• An accompanying website with data sets and examples in Stata, SAS, and R
• A simplified unit root testing strategy based on recent developments
• An examination of various uses and interpretations of lagged dependent variables and the common pitfalls students and researchers face in this area
• An introduction to LSE methodology such as the COMFAC critique, general-to-specific modeling, and the use of forecasting to evaluate and test models

Perfect for students and professional researchers in the political sciences, public policy, sociology, and economics, Applied Time Series Analysis for the Social Sciences: Specification, Estimation, and Inference will also earn a place in the libraries of postgraduate students and researchers in public health, public administration and policy, and education.
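For readers who work in Python rather than Stata, SAS, or R, a unit root check of the kind discussed above looks like this with statsmodels. This is a generic illustration, not the book's code or its simplified testing strategy.

```python
# Generic illustration (not from the book): Augmented Dickey-Fuller unit root
# test on a simulated random walk versus a stationary AR(1) series.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
shocks = rng.normal(size=500)
random_walk = np.cumsum(shocks)                  # has a unit root: should NOT reject
stationary = np.empty(500)
stationary[0] = 0.0
for t in range(1, 500):                          # AR(1) with phi = 0.5: should reject
    stationary[t] = 0.5 * stationary[t - 1] + shocks[t]

for name, series in [("random walk", random_walk), ("AR(1)", stationary)]:
    stat, pvalue, *_ = adfuller(series)
    print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")
```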

data data-science data-science-tasks statistics time-series SAS
O'Reilly Data Science Books
Nov 20 - Best of ICCV (Day 2) 2025-11-20 · 17:00

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

Date, Time and Location

Nov 20, 2025 · 9 AM Pacific · Online. Register for the Zoom!

SGBD: Sharpness-Aware Mirror Gradient with BLIP-Based Denoising for Robust Multimodal Product Recommendation

The growing integration of computer vision and machine learning into the retail industry—both online and in physical stores—has driven the adoption of multimodal recommender systems to help users navigate increasingly complex product landscapes. These systems leverage diverse data sources, such as product images, textual descriptions, and user-generated content, to better model user preferences and item characteristics. While the fusion of multimodal data helps address issues like data sparsity and cold-start problems, it also introduces challenges such as information inconsistency, noise, and increased training instability.

In this paper, we analyze these robustness issues through the lens of flat local minima and propose a strategy that incorporates BLIP—a Vision-Language Model with strong denoising capabilities—to mitigate noise in multimodal inputs. Our method, Sharpness-Aware Mirror Gradient with BLIP-Based Denoising (SGBD), is a concise yet effective training strategy that implicitly enhances robustness during optimization. Extensive theoretical and empirical evaluations demonstrate its effectiveness across various multimodal recommendation benchmarks. SGBD offers a scalable solution for improving recommendation performance in real-world retail environments, where noisy, high-dimensional, and fast-evolving product data is the norm, making it a promising paradigm for training robust multimodal recommender systems in the retail industry.

About the Speaker

Kathy Wu holds a Ph.D. in Applied Mathematics and dual M.S. degrees in Computer Science and Quantitative Finance from the University of Southern California (USC), Los Angeles, CA, USA. At USC, she served as a course lecturer, teaching ML Foundations and ML for Business Applications in the science and business schools. Her academic research spans high-dimensional statistics, deep learning, and causal inference.

Kathy brings industry experience from Meta, LinkedIn, and Morgan Stanley in the Bay Area and New York City, where she focused on AI methodologies and real-world applications. She is currently an Applied Scientist at Amazon, within the Global Store organization, leading projects in e-commerce recommendation systems, search engines, multimodal vision-language models (VLMs), and LLM/GenAI in retail.

Her work has been published in top-tier conferences including ICCV, CVPR, ICLR, SIGIR, and WACV. At ICCV 2025, she won the Best Paper Award in Retail Vision.

Spatial Mental Modeling from Limited Views

Can VLMs imagine the unobservable space from just a few views, like humans do? Humans form spatial mental models, internal representations of "unseen space", to reason about layout, perspective, and motion. On our proposed MINDCUBE benchmark, we systematically observe a critical gap in VLMs' ability to build robust spatial mental models by representing positions (cognitive mapping), orientations (perspective-taking), and dynamics (mental simulation of "what-if" movements). We then explore three approaches to help VLMs approximate spatial mental models: unseen intermediate views, natural language reasoning chains, and cognitive maps.

The most significant improvement comes from "map-then-reason", which jointly trains the model to first abstract a cognitive map and then reason upon it. By training models to construct and reason over these internal maps, we boosted accuracy from 37.8% to 60.8% (+23.0%). Adding reinforcement learning pushed performance even further, to 70.7% (+32.9%). Our key insight is that this scaffolding of spatial mental models, actively constructing and utilizing internal structured spatial representations with flexible reasoning processes, significantly improves understanding of "unobservable space".

We aim to understand why geometric concepts remain challenging for VLMs and to outline promising research directions toward fostering more robust spatial intelligence.

About the Speaker

Manling Li is an Assistant Professor at Northwestern University and an Amazon Scholar. She was a postdoc at Stanford University and obtained her PhD in Computer Science from the University of Illinois Urbana-Champaign in 2023. She works at the intersection of language, vision, and robotics, and her work has been recognized by the MIT Technology Review 35 Under 35, the ACL Inaugural Dissertation Award Honorable Mention, the ACL'24 Outstanding Paper Award, the ACL'20 Best Demo Paper Award, the NAACL'21 Best Demo Paper Award, the Microsoft Research PhD Fellowship, and EECS Rising Stars.

Forecasting and Visualizing Air Pollution via Sky Images and VLM-Guided Generative Models

Air pollution monitoring is traditionally limited by costly sensors and sparse data coverage. Our research introduces a vision-language model framework that predicts air quality directly from real-world sky images and also simulates skies under varying pollution levels to enhance interpretability and robustness. We further develop visualization techniques to make predictions more understandable for policymakers and the public. This talk will present our methodology, key findings, and implications for sustainable urban environments.

About the Speaker

Mohammad Saleh Vahdatpour is a PhD candidate in Computer Science at Georgia State University specializing in deep learning, vision–language models, and sustainable AI systems. His research bridges generative AI, environmental monitoring, and motion perception, focusing on scalable and energy-efficient models that connect scientific innovation with real-world impact.

Sari Sandbox: A Virtual Retail Store Environment for Embodied AI Agents

We present Sari Sandbox, a high-fidelity, photorealistic 3D retail store simulation for benchmarking embodied agents against human performance in shopping tasks. Addressing a gap in retail-specific sim environments for embodied agent training, Sari Sandbox features over 250 interactive grocery items across three store configurations, controlled via an API. It supports both virtual reality (VR) for human interaction and a vision language model (VLM)-powered embodied agent.

We also introduce SariBench, a dataset of annotated human demonstrations across varied task difficulties. Our sandbox enables embodied agents to navigate, inspect, and manipulate retail items, providing baselines against human performance. We conclude with benchmarks, performance analysis, and recommendations for enhancing realism and scalability.

About the Speakers

Emmanuel G. Maminta is a fourth-year Artificial Intelligence Ph.D. student at the Ubiquitous Computing Laboratory (UCL) at the University of the Philippines Diliman, advised by Prof. Rowel O. Atienza.

Janika Deborah B. Gajo is an undergraduate student pursuing a Bachelor of Science in Computer Engineering at the University of the Philippines Diliman.

Nov 20 - Best of ICCV (Day 2)
Nov 20 - Best of ICCV (Day 2) 2025-11-20 · 17:00

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

Date, Time and Location

Nov 20, 2025 9 AM Pacific Online. Register for the Zoom!

SGBD: Sharpness-Aware Mirror Gradient with BLIP-Based Denoising for Robust Multimodal Product Recommendation

The growing integration of computer vision and machine learning into the retail industry—both online and in physical stores—has driven the adoption of multimodal recommender systems to help users navigate increasingly complex product landscapes. These systems leverage diverse data sources, such as product images, textual descriptions, and user-generated content, to better model user preferences and item characteristics. While the fusion of multimodal data helps address issues like data sparsity and cold-start problems, it also introduces challenges such as information inconsistency, noise, and increased training instability.

In this paper, we analyze these robustness issues through the lens of flat local minima and propose a strategy that incorporates BLIP—a Vision-Language Model with strong denoising capabilities—to mitigate noise in multimodal inputs. Our method, Sharpness-Aware Mirror Gradient with BLIP-Based Denoising (SGBD), is a concise yet effective training strategy that implicitly enhances robustness during optimization. Extensive theoretical and empirical evaluations demonstrate its effectiveness across various multimodal recommendation benchmarks. SGBD offers a scalable solution for improving recommendation performance in real-world retail environments, where noisy, high-dimensional, and fast-evolving product data is the norm, making it a promising paradigm for training robust multi-modal recommender systems in retail industry.

About the Speaker

Kathy Wu holds a Ph.D. in Applied Mathematics and dual M.S. degrees in Computer Science and Quantitative Finance from the University of Southern California (USC), Los Angeles, CA, USA. At USC, she served as a course lecturer, offering ML Foundations and ML for Business Applications in the science school and business school. Her academic research spans high-dimensional statistics, deep learning, and causal inference, etc.

Kathy brings industry experience from Meta, LinkedIn, and Morgan Stanley in the Bay Area and New York City, US, where she focused on AI methodologies and real-world applications. She is currently an Applied Scientist at Amazon, within the Global Store organization, leading projects in E-Commerce Recommendation Systems, Search Engines, Multi-Modal Vision-Language Models (VLMs), and LLM/GenAI in retails.

Her work has been published in top-tier conferences including ICCV, CVPR, ICLR, SIGIR, WACV, etc. At ICCV 2025, she won the Best Paper Award in Retail Vision.

Spatial Mental Modeling from Limited Views

Can VLMs imagine the unobservable space from just a few views, like humans do? Humans form spatial mental models, as internal representations of "unseen space" to reason about layout, perspective, and motion. On our proposed MINDCUBE, we see critical gap systematically on VLMs building robust spatial mental models through representing positions (cognitive mapping), orientations (perspective-taking), and dynamics (mental simulation for ''what-if'' movements). We then explore three approaches to help VLMs approximate spatial mental models, including unseen intermediate views, natural language reasoning chains, and cognitive maps.

The significant improvement comes from ''map-then-reason'' that jointly trains the model to first abstract a cognitive map and then reason upon it. By training models to construct and reason over these internal maps, we boosted accuracy from 37.8% to 60.8% (+23.0%). Adding reinforcement learning pushed performance even further to 70.7% (+32.9%). Our key insight is that such scaffolding of spatial mental models, actively constructing and utilizing internal structured spatial representations with flexible reasoning processes, significantly improves understanding of "unobservable space".

We aim to understand why geometric concepts remain challenging for VLMs and outlining promising research directions towards fostering more robust spatial intelligence.

About the Speaker

Manling Li is an Assistant Professor at Northwestern University and Amazon Scholar. She was a postdoc at Stanford University, and obtained the PhD degree in Computer Science at University of Illinois Urbana-Champaign in 2023. She works on the intersection of language, vision, and robotics, recognized by the MIT TR 35 Under 35, ACL Inaugural Dissertation Award Honorable Mention, ACL’24 Outstanding Paper Award, ACL'20 Best Demo Paper Award, and NAACL'21 Best Demo Paper Award, Microsoft Research PhD Fellowship, EE CS Rising Star, etc.

Forecasting and Visualizing Air Pollution via Sky Images and VLM-Guided Generative Models

Air pollution monitoring is traditionally limited by costly sensors and sparse data coverage. Our research introduces a vision-language model framework that predicts air quality directly from real-world sky images and also simulates skies under varying pollution levels to enhance interpretability and robustness. We further develop visualization techniques to make predictions more understandable for policymakers and the public. This talk will present our methodology, key findings, and implications for sustainable urban environments.

About the Speaker

Mohammad Saleh Vahdatpour is a PhD candidate in Computer Science at Georgia State University specializing in deep learning, vision–language models, and sustainable AI systems. His research bridges generative AI, environmental monitoring, and motion perception, focusing on scalable and energy-efficient models that connect scientific innovation with real-world impact.

Sari Sandbox: A Virtual Retail Store Environment for Embodied AI Agents

We present Sari Sandbox, a high-fidelity, photorealistic 3D retail store simulation for benchmarking embodied agents against human performance in shopping tasks. Addressing a gap in retail-specific sim environments for embodied agent training, Sari Sandbox features over 250 interactive grocery items across three store configurations, controlled via an API. It supports both virtual reality (VR) for human interaction and a vision language model (VLM)-powered embodied agent.

We also introduce SariBench, a dataset of annotated human demonstrations across varied task difficulties. Our sandbox enables embodied agents to navigate, inspect, and manipulate retail items, providing baselines against human performance. We conclude with benchmarks, performance analysis, and recommendations for enhancing realism and scalability.

About the Speakers

Emmanuel G. Maminta is a fourth-year Artificial Intelligence Ph.D. student at the Ubiquitous Computing Laboratory (UCL) in the University of the Philippines Diliman, advised by Prof. Rowel O. Atienza.

Janika Deborah B.Gajo is an undergraduate student studying for a Bachelor of Science in Computer Engineering at the University of the Philippines, Diliman.

Nov 20 - Best of ICCV (Day 2)
Nov 20 - Best of ICCV (Day 2) 2025-11-20 · 17:00

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

Date, Time and Location

Nov 20, 2025 9 AM Pacific Online. Register for the Zoom!

SGBD: Sharpness-Aware Mirror Gradient with BLIP-Based Denoising for Robust Multimodal Product Recommendation

The growing integration of computer vision and machine learning into the retail industry—both online and in physical stores—has driven the adoption of multimodal recommender systems to help users navigate increasingly complex product landscapes. These systems leverage diverse data sources, such as product images, textual descriptions, and user-generated content, to better model user preferences and item characteristics. While the fusion of multimodal data helps address issues like data sparsity and cold-start problems, it also introduces challenges such as information inconsistency, noise, and increased training instability.

In this paper, we analyze these robustness issues through the lens of flat local minima and propose a strategy that incorporates BLIP—a Vision-Language Model with strong denoising capabilities—to mitigate noise in multimodal inputs. Our method, Sharpness-Aware Mirror Gradient with BLIP-Based Denoising (SGBD), is a concise yet effective training strategy that implicitly enhances robustness during optimization. Extensive theoretical and empirical evaluations demonstrate its effectiveness across various multimodal recommendation benchmarks. SGBD offers a scalable solution for improving recommendation performance in real-world retail environments, where noisy, high-dimensional, and fast-evolving product data is the norm, making it a promising paradigm for training robust multi-modal recommender systems in retail industry.

About the Speaker

Kathy Wu holds a Ph.D. in Applied Mathematics and dual M.S. degrees in Computer Science and Quantitative Finance from the University of Southern California (USC), Los Angeles, CA, USA. At USC, she served as a course lecturer, offering ML Foundations and ML for Business Applications in the science school and business school. Her academic research spans high-dimensional statistics, deep learning, and causal inference, etc.

Kathy brings industry experience from Meta, LinkedIn, and Morgan Stanley in the Bay Area and New York City, US, where she focused on AI methodologies and real-world applications. She is currently an Applied Scientist at Amazon, within the Global Store organization, leading projects in E-Commerce Recommendation Systems, Search Engines, Multi-Modal Vision-Language Models (VLMs), and LLM/GenAI in retails.

Her work has been published in top-tier conferences including ICCV, CVPR, ICLR, SIGIR, WACV, etc. At ICCV 2025, she won the Best Paper Award in Retail Vision.

Spatial Mental Modeling from Limited Views

Can VLMs imagine the unobservable space from just a few views, like humans do? Humans form spatial mental models, as internal representations of "unseen space" to reason about layout, perspective, and motion. On our proposed MINDCUBE, we see critical gap systematically on VLMs building robust spatial mental models through representing positions (cognitive mapping), orientations (perspective-taking), and dynamics (mental simulation for ''what-if'' movements). We then explore three approaches to help VLMs approximate spatial mental models, including unseen intermediate views, natural language reasoning chains, and cognitive maps.

The significant improvement comes from ''map-then-reason'' that jointly trains the model to first abstract a cognitive map and then reason upon it. By training models to construct and reason over these internal maps, we boosted accuracy from 37.8% to 60.8% (+23.0%). Adding reinforcement learning pushed performance even further to 70.7% (+32.9%). Our key insight is that such scaffolding of spatial mental models, actively constructing and utilizing internal structured spatial representations with flexible reasoning processes, significantly improves understanding of "unobservable space".

We aim to understand why geometric concepts remain challenging for VLMs and outlining promising research directions towards fostering more robust spatial intelligence.

About the Speaker

Manling Li is an Assistant Professor at Northwestern University and Amazon Scholar. She was a postdoc at Stanford University, and obtained the PhD degree in Computer Science at University of Illinois Urbana-Champaign in 2023. She works on the intersection of language, vision, and robotics, recognized by the MIT TR 35 Under 35, ACL Inaugural Dissertation Award Honorable Mention, ACL’24 Outstanding Paper Award, ACL'20 Best Demo Paper Award, and NAACL'21 Best Demo Paper Award, Microsoft Research PhD Fellowship, EE CS Rising Star, etc.

Forecasting and Visualizing Air Pollution via Sky Images and VLM-Guided Generative Models

Air pollution monitoring is traditionally limited by costly sensors and sparse data coverage. Our research introduces a vision-language model framework that predicts air quality directly from real-world sky images and also simulates skies under varying pollution levels to enhance interpretability and robustness. We further develop visualization techniques to make predictions more understandable for policymakers and the public. This talk will present our methodology, key findings, and implications for sustainable urban environments.

About the Speaker

Mohammad Saleh Vahdatpour is a PhD candidate in Computer Science at Georgia State University specializing in deep learning, vision–language models, and sustainable AI systems. His research bridges generative AI, environmental monitoring, and motion perception, focusing on scalable and energy-efficient models that connect scientific innovation with real-world impact.

Sari Sandbox: A Virtual Retail Store Environment for Embodied AI Agents

We present Sari Sandbox, a high-fidelity, photorealistic 3D retail store simulation for benchmarking embodied agents against human performance in shopping tasks. Addressing a gap in retail-specific sim environments for embodied agent training, Sari Sandbox features over 250 interactive grocery items across three store configurations, controlled via an API. It supports both virtual reality (VR) for human interaction and a vision language model (VLM)-powered embodied agent.

We also introduce SariBench, a dataset of annotated human demonstrations across varied task difficulties. Our sandbox enables embodied agents to navigate, inspect, and manipulate retail items, providing baselines against human performance. We conclude with benchmarks, performance analysis, and recommendations for enhancing realism and scalability.

About the Speakers

Emmanuel G. Maminta is a fourth-year Artificial Intelligence Ph.D. student at the Ubiquitous Computing Laboratory (UCL) in the University of the Philippines Diliman, advised by Prof. Rowel O. Atienza.

Janika Deborah B.Gajo is an undergraduate student studying for a Bachelor of Science in Computer Engineering at the University of the Philippines, Diliman.

Nov 20 - Best of ICCV (Day 2)
Nov 20 - Best of ICCV (Day 2) 2025-11-20 · 17:00

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

Date, Time and Location

Nov 20, 2025 9 AM Pacific Online. Register for the Zoom!

SGBD: Sharpness-Aware Mirror Gradient with BLIP-Based Denoising for Robust Multimodal Product Recommendation

The growing integration of computer vision and machine learning into the retail industry—both online and in physical stores—has driven the adoption of multimodal recommender systems to help users navigate increasingly complex product landscapes. These systems leverage diverse data sources, such as product images, textual descriptions, and user-generated content, to better model user preferences and item characteristics. While the fusion of multimodal data helps address issues like data sparsity and cold-start problems, it also introduces challenges such as information inconsistency, noise, and increased training instability.

In this paper, we analyze these robustness issues through the lens of flat local minima and propose a strategy that incorporates BLIP—a Vision-Language Model with strong denoising capabilities—to mitigate noise in multimodal inputs. Our method, Sharpness-Aware Mirror Gradient with BLIP-Based Denoising (SGBD), is a concise yet effective training strategy that implicitly enhances robustness during optimization. Extensive theoretical and empirical evaluations demonstrate its effectiveness across various multimodal recommendation benchmarks. SGBD offers a scalable solution for improving recommendation performance in real-world retail environments, where noisy, high-dimensional, and fast-evolving product data is the norm, making it a promising paradigm for training robust multi-modal recommender systems in retail industry.

About the Speaker

Kathy Wu holds a Ph.D. in Applied Mathematics and dual M.S. degrees in Computer Science and Quantitative Finance from the University of Southern California (USC), Los Angeles, CA, USA. At USC, she served as a course lecturer, offering ML Foundations and ML for Business Applications in the science school and business school. Her academic research spans high-dimensional statistics, deep learning, and causal inference, etc.

Kathy brings industry experience from Meta, LinkedIn, and Morgan Stanley in the Bay Area and New York City, US, where she focused on AI methodologies and real-world applications. She is currently an Applied Scientist at Amazon, within the Global Store organization, leading projects in E-Commerce Recommendation Systems, Search Engines, Multi-Modal Vision-Language Models (VLMs), and LLM/GenAI in retails.

Her work has been published in top-tier conferences including ICCV, CVPR, ICLR, SIGIR, WACV, etc. At ICCV 2025, she won the Best Paper Award in Retail Vision.

Spatial Mental Modeling from Limited Views

Can VLMs imagine the unobservable space from just a few views, like humans do? Humans form spatial mental models, as internal representations of "unseen space" to reason about layout, perspective, and motion. On our proposed MINDCUBE, we see critical gap systematically on VLMs building robust spatial mental models through representing positions (cognitive mapping), orientations (perspective-taking), and dynamics (mental simulation for ''what-if'' movements). We then explore three approaches to help VLMs approximate spatial mental models, including unseen intermediate views, natural language reasoning chains, and cognitive maps.

The significant improvement comes from ''map-then-reason'' that jointly trains the model to first abstract a cognitive map and then reason upon it. By training models to construct and reason over these internal maps, we boosted accuracy from 37.8% to 60.8% (+23.0%). Adding reinforcement learning pushed performance even further to 70.7% (+32.9%). Our key insight is that such scaffolding of spatial mental models, actively constructing and utilizing internal structured spatial representations with flexible reasoning processes, significantly improves understanding of "unobservable space".

We aim to understand why geometric concepts remain challenging for VLMs and outlining promising research directions towards fostering more robust spatial intelligence.

About the Speaker

Manling Li is an Assistant Professor at Northwestern University and Amazon Scholar. She was a postdoc at Stanford University, and obtained the PhD degree in Computer Science at University of Illinois Urbana-Champaign in 2023. She works on the intersection of language, vision, and robotics, recognized by the MIT TR 35 Under 35, ACL Inaugural Dissertation Award Honorable Mention, ACL’24 Outstanding Paper Award, ACL'20 Best Demo Paper Award, and NAACL'21 Best Demo Paper Award, Microsoft Research PhD Fellowship, EE CS Rising Star, etc.

Forecasting and Visualizing Air Pollution via Sky Images and VLM-Guided Generative Models

Air pollution monitoring is traditionally limited by costly sensors and sparse data coverage. Our research introduces a vision-language model framework that predicts air quality directly from real-world sky images and also simulates skies under varying pollution levels to enhance interpretability and robustness. We further develop visualization techniques to make predictions more understandable for policymakers and the public. This talk will present our methodology, key findings, and implications for sustainable urban environments.

About the Speaker

Mohammad Saleh Vahdatpour is a PhD candidate in Computer Science at Georgia State University specializing in deep learning, vision–language models, and sustainable AI systems. His research bridges generative AI, environmental monitoring, and motion perception, focusing on scalable and energy-efficient models that connect scientific innovation with real-world impact.

Sari Sandbox: A Virtual Retail Store Environment for Embodied AI Agents

We present Sari Sandbox, a high-fidelity, photorealistic 3D retail store simulation for benchmarking embodied agents against human performance in shopping tasks. Addressing a gap in retail-specific sim environments for embodied agent training, Sari Sandbox features over 250 interactive grocery items across three store configurations, controlled via an API. It supports both virtual reality (VR) for human interaction and a vision language model (VLM)-powered embodied agent.

We also introduce SariBench, a dataset of annotated human demonstrations across varied task difficulties. Our sandbox enables embodied agents to navigate, inspect, and manipulate retail items, providing baselines against human performance. We conclude with benchmarks, performance analysis, and recommendations for enhancing realism and scalability.

About the Speakers

Emmanuel G. Maminta is a fourth-year Artificial Intelligence Ph.D. student at the Ubiquitous Computing Laboratory (UCL) in the University of the Philippines Diliman, advised by Prof. Rowel O. Atienza.

Janika Deborah B.Gajo is an undergraduate student studying for a Bachelor of Science in Computer Engineering at the University of the Philippines, Diliman.

Nov 20 - Best of ICCV (Day 2)
Nov 20 - Best of ICCV (Day 2) 2025-11-20 · 17:00

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

Date, Time and Location

Nov 20, 2025 9 AM Pacific Online. Register for the Zoom!

SGBD: Sharpness-Aware Mirror Gradient with BLIP-Based Denoising for Robust Multimodal Product Recommendation

The growing integration of computer vision and machine learning into the retail industry—both online and in physical stores—has driven the adoption of multimodal recommender systems to help users navigate increasingly complex product landscapes. These systems leverage diverse data sources, such as product images, textual descriptions, and user-generated content, to better model user preferences and item characteristics. While the fusion of multimodal data helps address issues like data sparsity and cold-start problems, it also introduces challenges such as information inconsistency, noise, and increased training instability.

In this paper, we analyze these robustness issues through the lens of flat local minima and propose a strategy that incorporates BLIP—a Vision-Language Model with strong denoising capabilities—to mitigate noise in multimodal inputs. Our method, Sharpness-Aware Mirror Gradient with BLIP-Based Denoising (SGBD), is a concise yet effective training strategy that implicitly enhances robustness during optimization. Extensive theoretical and empirical evaluations demonstrate its effectiveness across various multimodal recommendation benchmarks. SGBD offers a scalable solution for improving recommendation performance in real-world retail environments, where noisy, high-dimensional, and fast-evolving product data is the norm, making it a promising paradigm for training robust multi-modal recommender systems in retail industry.

About the Speaker

Kathy Wu holds a Ph.D. in Applied Mathematics and dual M.S. degrees in Computer Science and Quantitative Finance from the University of Southern California (USC), Los Angeles, CA, USA. At USC, she served as a course lecturer, offering ML Foundations and ML for Business Applications in the science school and business school. Her academic research spans high-dimensional statistics, deep learning, and causal inference, etc.

Kathy brings industry experience from Meta, LinkedIn, and Morgan Stanley in the Bay Area and New York City, US, where she focused on AI methodologies and real-world applications. She is currently an Applied Scientist at Amazon, within the Global Store organization, leading projects in E-Commerce Recommendation Systems, Search Engines, Multi-Modal Vision-Language Models (VLMs), and LLM/GenAI in retails.

Her work has been published in top-tier conferences including ICCV, CVPR, ICLR, SIGIR, WACV, etc. At ICCV 2025, she won the Best Paper Award in Retail Vision.

Spatial Mental Modeling from Limited Views

Can VLMs imagine the unobservable space from just a few views, like humans do? Humans form spatial mental models, as internal representations of "unseen space" to reason about layout, perspective, and motion. On our proposed MINDCUBE, we see critical gap systematically on VLMs building robust spatial mental models through representing positions (cognitive mapping), orientations (perspective-taking), and dynamics (mental simulation for ''what-if'' movements). We then explore three approaches to help VLMs approximate spatial mental models, including unseen intermediate views, natural language reasoning chains, and cognitive maps.

The significant improvement comes from ''map-then-reason'' that jointly trains the model to first abstract a cognitive map and then reason upon it. By training models to construct and reason over these internal maps, we boosted accuracy from 37.8% to 60.8% (+23.0%). Adding reinforcement learning pushed performance even further to 70.7% (+32.9%). Our key insight is that such scaffolding of spatial mental models, actively constructing and utilizing internal structured spatial representations with flexible reasoning processes, significantly improves understanding of "unobservable space".

We aim to understand why geometric concepts remain challenging for VLMs and outlining promising research directions towards fostering more robust spatial intelligence.

About the Speaker

Manling Li is an Assistant Professor at Northwestern University and Amazon Scholar. She was a postdoc at Stanford University, and obtained the PhD degree in Computer Science at University of Illinois Urbana-Champaign in 2023. She works on the intersection of language, vision, and robotics, recognized by the MIT TR 35 Under 35, ACL Inaugural Dissertation Award Honorable Mention, ACL’24 Outstanding Paper Award, ACL'20 Best Demo Paper Award, and NAACL'21 Best Demo Paper Award, Microsoft Research PhD Fellowship, EE CS Rising Star, etc.

Forecasting and Visualizing Air Pollution via Sky Images and VLM-Guided Generative Models

Air pollution monitoring is traditionally limited by costly sensors and sparse data coverage. Our research introduces a vision-language model framework that predicts air quality directly from real-world sky images and also simulates skies under varying pollution levels to enhance interpretability and robustness. We further develop visualization techniques to make predictions more understandable for policymakers and the public. This talk will present our methodology, key findings, and implications for sustainable urban environments.

About the Speaker

Mohammad Saleh Vahdatpour is a PhD candidate in Computer Science at Georgia State University specializing in deep learning, vision–language models, and sustainable AI systems. His research bridges generative AI, environmental monitoring, and motion perception, focusing on scalable and energy-efficient models that connect scientific innovation with real-world impact.

Sari Sandbox: A Virtual Retail Store Environment for Embodied AI Agents

We present Sari Sandbox, a high-fidelity, photorealistic 3D retail store simulation for benchmarking embodied agents against human performance in shopping tasks. Addressing a gap in retail-specific sim environments for embodied agent training, Sari Sandbox features over 250 interactive grocery items across three store configurations, controlled via an API. It supports both virtual reality (VR) for human interaction and a vision language model (VLM)-powered embodied agent.

We also introduce SariBench, a dataset of annotated human demonstrations across varied task difficulties. Our sandbox enables embodied agents to navigate, inspect, and manipulate retail items, providing baselines against human performance. We conclude with benchmarks, performance analysis, and recommendations for enhancing realism and scalability.

About the Speakers

Emmanuel G. Maminta is a fourth-year Artificial Intelligence Ph.D. student at the Ubiquitous Computing Laboratory (UCL) in the University of the Philippines Diliman, advised by Prof. Rowel O. Atienza.

Janika Deborah B.Gajo is an undergraduate student studying for a Bachelor of Science in Computer Engineering at the University of the Philippines, Diliman.

Nov 20 - Best of ICCV (Day 2)
Nov 20 - Best of ICCV (Day 2) 2025-11-20 · 17:00

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

Date, Time and Location

Nov 20, 2025 9 AM Pacific Online. Register for the Zoom!

SGBD: Sharpness-Aware Mirror Gradient with BLIP-Based Denoising for Robust Multimodal Product Recommendation

The growing integration of computer vision and machine learning into the retail industry—both online and in physical stores—has driven the adoption of multimodal recommender systems to help users navigate increasingly complex product landscapes. These systems leverage diverse data sources, such as product images, textual descriptions, and user-generated content, to better model user preferences and item characteristics. While the fusion of multimodal data helps address issues like data sparsity and cold-start problems, it also introduces challenges such as information inconsistency, noise, and increased training instability.

In this paper, we analyze these robustness issues through the lens of flat local minima and propose a strategy that incorporates BLIP—a Vision-Language Model with strong denoising capabilities—to mitigate noise in multimodal inputs. Our method, Sharpness-Aware Mirror Gradient with BLIP-Based Denoising (SGBD), is a concise yet effective training strategy that implicitly enhances robustness during optimization. Extensive theoretical and empirical evaluations demonstrate its effectiveness across various multimodal recommendation benchmarks. SGBD offers a scalable solution for improving recommendation performance in real-world retail environments, where noisy, high-dimensional, and fast-evolving product data is the norm, making it a promising paradigm for training robust multi-modal recommender systems in retail industry.

About the Speaker

Kathy Wu holds a Ph.D. in Applied Mathematics and dual M.S. degrees in Computer Science and Quantitative Finance from the University of Southern California (USC), Los Angeles, CA, USA. At USC, she served as a course lecturer, offering ML Foundations and ML for Business Applications in the science school and business school. Her academic research spans high-dimensional statistics, deep learning, and causal inference, etc.

Kathy brings industry experience from Meta, LinkedIn, and Morgan Stanley in the Bay Area and New York City, US, where she focused on AI methodologies and real-world applications. She is currently an Applied Scientist at Amazon, within the Global Store organization, leading projects in E-Commerce Recommendation Systems, Search Engines, Multi-Modal Vision-Language Models (VLMs), and LLM/GenAI in retails.

Her work has been published in top-tier conferences including ICCV, CVPR, ICLR, SIGIR, WACV, etc. At ICCV 2025, she won the Best Paper Award in Retail Vision.

Spatial Mental Modeling from Limited Views

Can VLMs imagine the unobservable space from just a few views, like humans do? Humans form spatial mental models, as internal representations of "unseen space" to reason about layout, perspective, and motion. On our proposed MINDCUBE, we see critical gap systematically on VLMs building robust spatial mental models through representing positions (cognitive mapping), orientations (perspective-taking), and dynamics (mental simulation for ''what-if'' movements). We then explore three approaches to help VLMs approximate spatial mental models, including unseen intermediate views, natural language reasoning chains, and cognitive maps.

The significant improvement comes from ''map-then-reason'' that jointly trains the model to first abstract a cognitive map and then reason upon it. By training models to construct and reason over these internal maps, we boosted accuracy from 37.8% to 60.8% (+23.0%). Adding reinforcement learning pushed performance even further to 70.7% (+32.9%). Our key insight is that such scaffolding of spatial mental models, actively constructing and utilizing internal structured spatial representations with flexible reasoning processes, significantly improves understanding of "unobservable space".

We aim to understand why geometric concepts remain challenging for VLMs and outlining promising research directions towards fostering more robust spatial intelligence.

About the Speaker

Manling Li is an Assistant Professor at Northwestern University and Amazon Scholar. She was a postdoc at Stanford University, and obtained the PhD degree in Computer Science at University of Illinois Urbana-Champaign in 2023. She works on the intersection of language, vision, and robotics, recognized by the MIT TR 35 Under 35, ACL Inaugural Dissertation Award Honorable Mention, ACL’24 Outstanding Paper Award, ACL'20 Best Demo Paper Award, and NAACL'21 Best Demo Paper Award, Microsoft Research PhD Fellowship, EE CS Rising Star, etc.

Forecasting and Visualizing Air Pollution via Sky Images and VLM-Guided Generative Models

Air pollution monitoring is traditionally limited by costly sensors and sparse data coverage. Our research introduces a vision-language model framework that predicts air quality directly from real-world sky images and also simulates skies under varying pollution levels to enhance interpretability and robustness. We further develop visualization techniques to make predictions more understandable for policymakers and the public. This talk will present our methodology, key findings, and implications for sustainable urban environments.

About the Speaker

Mohammad Saleh Vahdatpour is a PhD candidate in Computer Science at Georgia State University specializing in deep learning, vision–language models, and sustainable AI systems. His research bridges generative AI, environmental monitoring, and motion perception, focusing on scalable and energy-efficient models that connect scientific innovation with real-world impact.

Sari Sandbox: A Virtual Retail Store Environment for Embodied AI Agents

We present Sari Sandbox, a high-fidelity, photorealistic 3D retail store simulation for benchmarking embodied agents against human performance in shopping tasks. Addressing a gap in retail-specific sim environments for embodied agent training, Sari Sandbox features over 250 interactive grocery items across three store configurations, controlled via an API. It supports both virtual reality (VR) for human interaction and a vision language model (VLM)-powered embodied agent.

We also introduce SariBench, a dataset of annotated human demonstrations across varied task difficulties. Our sandbox enables embodied agents to navigate, inspect, and manipulate retail items, providing baselines against human performance. We conclude with benchmarks, performance analysis, and recommendations for enhancing realism and scalability.

About the Speakers

Emmanuel G. Maminta is a fourth-year Artificial Intelligence Ph.D. student at the Ubiquitous Computing Laboratory (UCL) in the University of the Philippines Diliman, advised by Prof. Rowel O. Atienza.

Janika Deborah B.Gajo is an undergraduate student studying for a Bachelor of Science in Computer Engineering at the University of the Philippines, Diliman.

Nov 20 - Best of ICCV (Day 2)
Nov 20 - Best of ICCV (Day 2) 2025-11-20 · 17:00

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

Date, Time and Location

Nov 20, 2025 9 AM Pacific Online. Register for the Zoom!

SGBD: Sharpness-Aware Mirror Gradient with BLIP-Based Denoising for Robust Multimodal Product Recommendation

The growing integration of computer vision and machine learning into the retail industry—both online and in physical stores—has driven the adoption of multimodal recommender systems to help users navigate increasingly complex product landscapes. These systems leverage diverse data sources, such as product images, textual descriptions, and user-generated content, to better model user preferences and item characteristics. While the fusion of multimodal data helps address issues like data sparsity and cold-start problems, it also introduces challenges such as information inconsistency, noise, and increased training instability.

In this paper, we analyze these robustness issues through the lens of flat local minima and propose a strategy that incorporates BLIP—a Vision-Language Model with strong denoising capabilities—to mitigate noise in multimodal inputs. Our method, Sharpness-Aware Mirror Gradient with BLIP-Based Denoising (SGBD), is a concise yet effective training strategy that implicitly enhances robustness during optimization. Extensive theoretical and empirical evaluations demonstrate its effectiveness across various multimodal recommendation benchmarks. SGBD offers a scalable solution for improving recommendation performance in real-world retail environments, where noisy, high-dimensional, and fast-evolving product data is the norm, making it a promising paradigm for training robust multi-modal recommender systems in retail industry.

About the Speaker

Kathy Wu holds a Ph.D. in Applied Mathematics and dual M.S. degrees in Computer Science and Quantitative Finance from the University of Southern California (USC), Los Angeles, CA, USA. At USC, she served as a course lecturer, offering ML Foundations and ML for Business Applications in the science school and business school. Her academic research spans high-dimensional statistics, deep learning, and causal inference, etc.

Kathy brings industry experience from Meta, LinkedIn, and Morgan Stanley in the Bay Area and New York City, US, where she focused on AI methodologies and real-world applications. She is currently an Applied Scientist at Amazon, within the Global Store organization, leading projects in E-Commerce Recommendation Systems, Search Engines, Multi-Modal Vision-Language Models (VLMs), and LLM/GenAI in retails.

Her work has been published in top-tier conferences including ICCV, CVPR, ICLR, SIGIR, WACV, etc. At ICCV 2025, she won the Best Paper Award in Retail Vision.

Spatial Mental Modeling from Limited Views

Can VLMs imagine the unobservable space from just a few views, like humans do? Humans form spatial mental models, as internal representations of "unseen space" to reason about layout, perspective, and motion. On our proposed MINDCUBE, we see critical gap systematically on VLMs building robust spatial mental models through representing positions (cognitive mapping), orientations (perspective-taking), and dynamics (mental simulation for ''what-if'' movements). We then explore three approaches to help VLMs approximate spatial mental models, including unseen intermediate views, natural language reasoning chains, and cognitive maps.

The significant improvement comes from ''map-then-reason'' that jointly trains the model to first abstract a cognitive map and then reason upon it. By training models to construct and reason over these internal maps, we boosted accuracy from 37.8% to 60.8% (+23.0%). Adding reinforcement learning pushed performance even further to 70.7% (+32.9%). Our key insight is that such scaffolding of spatial mental models, actively constructing and utilizing internal structured spatial representations with flexible reasoning processes, significantly improves understanding of "unobservable space".

We aim to understand why geometric concepts remain challenging for VLMs and outlining promising research directions towards fostering more robust spatial intelligence.

About the Speaker

Manling Li is an Assistant Professor at Northwestern University and Amazon Scholar. She was a postdoc at Stanford University, and obtained the PhD degree in Computer Science at University of Illinois Urbana-Champaign in 2023. She works on the intersection of language, vision, and robotics, recognized by the MIT TR 35 Under 35, ACL Inaugural Dissertation Award Honorable Mention, ACL’24 Outstanding Paper Award, ACL'20 Best Demo Paper Award, and NAACL'21 Best Demo Paper Award, Microsoft Research PhD Fellowship, EE CS Rising Star, etc.

Forecasting and Visualizing Air Pollution via Sky Images and VLM-Guided Generative Models

Air pollution monitoring is traditionally limited by costly sensors and sparse data coverage. Our research introduces a vision-language model framework that predicts air quality directly from real-world sky images and also simulates skies under varying pollution levels to enhance interpretability and robustness. We further develop visualization techniques to make predictions more understandable for policymakers and the public. This talk will present our methodology, key findings, and implications for sustainable urban environments.

About the Speaker

Mohammad Saleh Vahdatpour is a PhD candidate in Computer Science at Georgia State University specializing in deep learning, vision–language models, and sustainable AI systems. His research bridges generative AI, environmental monitoring, and motion perception, focusing on scalable and energy-efficient models that connect scientific innovation with real-world impact.

Sari Sandbox: A Virtual Retail Store Environment for Embodied AI Agents

We present Sari Sandbox, a high-fidelity, photorealistic 3D retail store simulation for benchmarking embodied agents against human performance in shopping tasks. Addressing a gap in retail-specific sim environments for embodied agent training, Sari Sandbox features over 250 interactive grocery items across three store configurations, controlled via an API. It supports both virtual reality (VR) for human interaction and a vision language model (VLM)-powered embodied agent.

We also introduce SariBench, a dataset of annotated human demonstrations across varied task difficulties. Our sandbox enables embodied agents to navigate, inspect, and manipulate retail items, providing baselines against human performance. We conclude with benchmarks, performance analysis, and recommendations for enhancing realism and scalability.

About the Speakers

Emmanuel G. Maminta is a fourth-year Artificial Intelligence Ph.D. student at the Ubiquitous Computing Laboratory (UCL) at the University of the Philippines Diliman, advised by Prof. Rowel O. Atienza.

Janika Deborah B. Gajo is an undergraduate student pursuing a Bachelor of Science in Computer Engineering at the University of the Philippines Diliman.

Nov 20 - Best of ICCV (Day 2)
Marco Peixeiro – author

Make accurate time series predictions with powerful pretrained foundation models! You don’t need to spend weeks—or even months—coding and training your own models for time series forecasting. Time Series Forecasting Using Foundation Models shows you how to make accurate predictions using flexible pretrained models. In Time Series Forecasting Using Foundation Models you will discover: The inner workings of large time models Zero-shot forecasting on custom datasets Fine-tuning foundation forecasting models Evaluating large time models Time Series Forecasting Using Foundation Models teaches you how to do efficient forecasting using powerful time series models that have already been pretrained on billions of data points. You’ll appreciate the hands-on examples that show you what you can accomplish with these amazing models. Along the way, you’ll learn how time series foundation models work, how to fine-tune them, and how to use them with your own data. About the Technology Time-series forecasting is the art of analyzing historical, time-stamped data to predict future outcomes. Foundational time series models like TimeGPT and Chronos, pre-trained on billions of data points, can now effectively augment or replace painstakingly-built custom time-series models. About the Book Time Series Forecasting Using Foundation Models explores the architecture of large time models and shows you how to use them to generate fast, accurate predictions. You’ll learn to fine-tune time models on your own data, execute zero-shot probabilistic forecasting, point forecasting, and more. You’ll even find out how to reprogram an LLM into a time series forecaster—all following examples that will run on an ordinary laptop. What's Inside How large time models work Zero-shot forecasting on custom datasets Fine-tuning and evaluating foundation models About the Reader For data scientists and machine learning engineers familiar with the basics of time series forecasting theory. Examples in Python. About the Author Marco Peixeiro builds cutting-edge open-source forecasting Python libraries at Nixtla. He is the author of Time Series Forecasting in Python. Quotes Clear and hands-on, featuring both theory and easy-to-follow examples. - Eryk Lewinson, Author of Python for Finance Cookbook Bridges the gap between classical forecasting methods and the new developments in the foundational models. A fantastic resource. - Juan Orduz, PyMC Labs A foundational guide to forecasting’s next chapter. - Tyler Blume, daybreak An immensely practical introduction to forecasting using foundation models. - Stephan Kolassa, SAP Switzerland
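To make the zero-shot workflow concrete, here is a minimal sketch of zero-shot point forecasting with the open-source Chronos family mentioned above; the package (chronos-forecasting), the model checkpoint, the CSV filename, and the "y" column are illustrative assumptions and may differ from the book's examples.

```python
import pandas as pd
import torch
from chronos import ChronosPipeline  # assumed: the open-source chronos-forecasting package

# Hypothetical univariate series; any numeric pandas column can serve as context.
df = pd.read_csv("my_series.csv")  # assumed to contain a numeric "y" column
context = torch.tensor(df["y"].values, dtype=torch.float32)

# Zero-shot: the pretrained model is used as-is, with no training on this dataset.
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small", device_map="cpu", torch_dtype=torch.float32
)
samples = pipeline.predict(context, prediction_length=12)  # [series, samples, horizon]
point_forecast = samples[0].median(dim=0).values           # per-step median as the point forecast
```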

data data-science data-science-tasks statistics time-series AI/ML LLM Python SAP
O'Reilly Data Science Books

This session will explore why and how Snowflake's unique capabilities are crucial to enable, accelerate, and implement industrial IoT use cases such as root cause analysis of asset failure, predictive maintenance, and quality management. The session will explain the use of specific time series capabilities (e.g., ASOF joins and the CORR and MATCH functions), built-in Cortex ML functions (such as anomaly detection and forecasting), and LLMs leveraging RAG to accelerate use cases for manufacturing customers.

AI/ML IoT LLM RAG Snowflake
Snowflake World Tour London

Forecasting time series can be messy: data is often missing, noisy, or full of structural changes like holidays, outliers, or evolving patterns. This talk shows how to build interpretable time series decomposition models using PyMC, a modern probabilistic programming library.

We’ll break time series into trend, seasonality, and noise components using engineered time features (e.g., Fourier and Radial Basis Functions). You’ll also learn how to model correlated series using hierarchical priors, letting multiple time series "learn from each other." As a case study, we’ll analyze Formula 1 lap time data to compare drivers and explore performance consistency using Bayesian posteriors.
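To give a flavor of what that looks like in code, here is a minimal PyMC sketch of a trend-plus-seasonality decomposition using Fourier time features; the synthetic data, priors, and Fourier order are illustrative assumptions, not the speaker's actual model.

```python
import numpy as np
import pymc as pm

# Synthetic monthly series (illustrative only): linear trend + yearly cycle + noise.
rng = np.random.default_rng(0)
n = 120
t = np.arange(n)
y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.5, n)

# Fourier features encode seasonality as smooth sine/cosine terms.
def fourier_features(t, period, order):
    k = np.arange(1, order + 1)
    angles = 2 * np.pi * np.outer(t, k) / period
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

X_season = fourier_features(t, period=12, order=3)

with pm.Model() as model:
    # Trend component: intercept plus a linear slope over normalized time.
    intercept = pm.Normal("intercept", 0, 5)
    slope = pm.Normal("slope", 0, 1)
    trend = intercept + slope * (t / n)

    # Seasonal component: linear combination of the Fourier features.
    beta = pm.Normal("beta", 0, 1, shape=X_season.shape[1])
    seasonality = pm.math.dot(X_season, beta)

    # Noise component and likelihood.
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("obs", mu=trend + seasonality, sigma=sigma, observed=y)

    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)
```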

This is a hands-on, code-first talk for data scientists, ML engineers, and researchers curious about Bayesian modeling (or Formula 1). Familiarity with Python and basic statistics is helpful, but no deep knowledge of Bayes is required.

AI/ML Python
PyData Amsterdam 2025
Matthew Watson – author, Francois Chollet – author

The bestselling book on Python deep learning, now covering generative AI, Keras 3, PyTorch, and JAX! Deep Learning with Python, Third Edition puts the power of deep learning in your hands. This new edition includes the latest Keras and TensorFlow features, generative AI models, and added coverage of PyTorch and JAX. Learn directly from the creator of Keras and step confidently into the world of deep learning with Python. In Deep Learning with Python, Third Edition you’ll discover: Deep learning from first principles The latest features of Keras 3 A primer on JAX, PyTorch, and TensorFlow Image classification and image segmentation Time series forecasting Large Language models Text classification and machine translation Text and image generation—build your own GPT and diffusion models! Scaling and tuning models With over 100,000 copies sold, Deep Learning with Python makes it possible for developers, data scientists, and machine learning enthusiasts to put deep learning into action. In this expanded and updated third edition, Keras creator François Chollet offers insights for both novice and experienced machine learning practitioners. You'll master state-of-the-art deep learning tools and techniques, from the latest features of Keras 3 to building AI models that can generate text and images. About the Technology In less than a decade, deep learning has changed the world—twice. First, Python-based libraries like Keras, TensorFlow, and PyTorch elevated neural networks from lab experiments to high-performance production systems deployed at scale. And now, through Large Language Models and other generative AI tools, deep learning is again transforming business and society. In this new edition, Keras creator François Chollet invites you into this amazing subject in the fluid, mentoring style of a true insider. About the Book Deep Learning with Python, Third Edition makes the concepts behind deep learning and generative AI understandable and approachable. This complete rewrite of the bestselling original includes fresh chapters on transformers, building your own GPT-like LLM, and generating images with diffusion models. Each chapter introduces practical projects and code examples that build your understanding of deep learning, layer by layer. What's Inside Hands-on, code-first learning Comprehensive, from basics to generative AI Intuitive and easy math explanations Examples in Keras, PyTorch, JAX, and TensorFlow About the Reader For readers with intermediate Python skills. No previous experience with machine learning or linear algebra required. About the Authors François Chollet is the co-founder of Ndea and the creator of Keras. Matthew Watson is a software engineer at Google working on Gemini and a core maintainer of Keras. Quotes Perfect for anyone interested in learning by doing from one of the industry greats. - Anthony Goldbloom, Founder of Kaggle A sharp, deeply practical guide that teaches you how to think from first principles to build models that actually work. - Santiago Valdarrama, Founder of ml.school The most up-to-date and complete guide to deep learning you’ll find today! - Aran Komatsuzaki, EleutherAI Masterfully conveys the true essence of neural networks. A rare case in recent years of outstanding technical writing. - Salvatore Sanfilippo, Creator of Redis

data ai-ml machine-learning deep-learning AI/ML GenAI Keras LLM Python PyTorch Redis TensorFlow
O'Reilly AI & ML Books
Modern Time Series Forecasting 2025-09-19 · 19:00
Vladimir-Vadim Iurcovschi – Senior Full-Stack Data Scientist

In this tutorial, we will explore a range of feature engineering techniques for time series forecasting using popular machine learning algorithms such as XGBoost, LightGBM, and CatBoost. We'll begin by transforming time series data into a tabular format and demonstrate how to create window and lag features, as well as features that capture seasonality and trends.
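As a taste of that tabular transformation, here is a minimal pandas sketch of lag, window, and calendar features; the column name, chosen lags, and window length are illustrative assumptions rather than the tutorial's exact recipe.

```python
import pandas as pd

# Hypothetical daily series with a datetime index and a single target column "y".
df = pd.DataFrame(
    {"y": range(100)},
    index=pd.date_range("2024-01-01", periods=100, freq="D"),
)

# Lag features: values of the target at previous time steps.
for lag in (1, 7, 14):
    df[f"lag_{lag}"] = df["y"].shift(lag)

# Window (rolling) features: statistics over a trailing window, shifted by one
# step so only past values are used (avoids leakage / look-ahead bias).
df["rolling_mean_7"] = df["y"].shift(1).rolling(7).mean()
df["rolling_std_7"] = df["y"].shift(1).rolling(7).std()

# Calendar features capturing seasonality, plus a simple trend proxy.
df["dayofweek"] = df.index.dayofweek
df["month"] = df.index.month
df["time_index"] = range(len(df))

df = df.dropna()  # drop rows that lack a full history of features
```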

We'll cover best practices for encoding categorical variables, decomposing time series, identifying outliers, and avoiding common pitfalls such as data leakage and look-ahead bias. Additionally, we’ll touch on more advanced topics like intermittency and hierarchical forecasting.

The session will also delve into cross-validation methods - specifically backtesting methods suited for time series data. We'll examine why traditional K-fold cross-validation is inappropriate for time-dependent datasets and highlight alternative approaches along with their trade-offs.
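For the backtesting idea, the sketch below uses scikit-learn's TimeSeriesSplit so that every fold trains only on observations that precede its test window; the array shapes and split settings are placeholders, not the session's actual setup.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Placeholder feature matrix X and target y, e.g. built from lag/window features.
X = np.random.rand(200, 5)
y = np.random.rand(200)

# Each fold trains only on the past, unlike shuffled K-fold, which would
# leak future information into training.
tscv = TimeSeriesSplit(n_splits=5, test_size=20)
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    print(f"fold {fold}: train up to {train_idx.max()}, test {test_idx.min()}-{test_idx.max()}")
    # fit the model on (X_train, y_train) and evaluate on (X_test, y_test)
```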

Finally, we’ll review best practices for evaluating model performance. This includes a comprehensive overview of error metrics, discussing their strengths, weaknesses, and the contexts in which each should be used.
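As a small illustration of the metrics discussion, here is a sketch computing three common error metrics; the numbers are made up, and which metric is appropriate depends on the scale of the series and on whether the actuals contain zeros.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical actuals and forecasts from a single backtest fold.
y_true = np.array([100.0, 120.0, 90.0, 110.0])
y_pred = np.array([105.0, 115.0, 95.0, 100.0])

mae = mean_absolute_error(y_true, y_pred)                 # scale-dependent, robust to large errors
rmse = np.sqrt(mean_squared_error(y_true, y_pred))        # scale-dependent, penalizes large errors
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100  # percentage error; breaks down near zero actuals
```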

xgboost lightgbm catboost
4 Virtual PyData Piraeus meetup: Modern Time Series Forecasting

Session Title: AI-Powered Insights: Boosting Data Performance Across Enterprise Fabric Stacks

Session Description: Artificial Intelligence (AI) is accelerating data performance across Microsoft-centric enterprise environments, transforming how organizations visualize, analyze, and act on data. This session delivers real-world insights on AI integration across four critical pillars: data visualization, AutoML, unified analytics platforms, and large language models (LLMs). Drawing from cross-sector case studies, we’ll examine how AI-augmented tools like natural language queries, smart dashboards, and anomaly detection enhance Microsoft Fabric experiences, enabling faster, more inclusive decision-making. AutoML platforms are streamlining model development, reducing time-to-value by over 50% in business scenarios ranging from sales forecasting to quality control. Integrated analytics solutions minimize data prep and accelerate insight delivery by up to 70%, with Fabric’s end-to-end tooling serving as a foundation for scalable automation. We also explore how LLMs elevate accessibility through conversational interfaces and intelligent code generation, improving exploration and productivity across roles. While AI unlocks tangible benefits, we’ll also address key limitations, including data governance and model oversight. Attendees will gain actionable guidance on evaluating and operationalizing AI capabilities within their existing Microsoft Fabric ecosystem to drive measurable improvements in data-driven outcomes.

Be Ready to Engage: Bring a notepad – this is a content-rich session! Submit your questions beforehand via Comments. Ask questions live via comments or join the stage with camera/mic access – just reach out to us in advance if you’d like to participate on screen.

Email: [email protected]

Microsoft Fabric Thursday Expert Series - 2025

Important: This is a paid conference; purchasing tickets on the event website is required for admission. Get 30% off with code AICAMP30. Purchase tickets here.

Description: The top industry event to discover the most cutting-edge advancements in AI & Machine Learning and their adoption in financial services to increase efficiency & solve challenges

WHY ATTEND Our events bring together the latest technology advancements as well as practical examples to apply AI to solve challenges in business and society. Our unique mix of academia and industry enables you to meet with AI pioneers at the forefront of research, as well as exploring real-world case studies to discover the business value of AI.

Extraordinary Speakers Discover advances in machine learning tools and techniques from the world's leading innovators across industry, academia and the financial sector. Speakers will share insights into recent breakthroughs in technical advancements and fintech applications including financial forecasting, trading & investment.

Discover Emerging Trends Learn about machine learning applications in the financial sector from algorithms to forecast financial data, to tools used in retail banking & pattern recognition in financial time series, to scaling predictive models, to wealth management, to using reinforcement learning for cross financial applications.

Expand Your Network A unique opportunity to interact with industry leaders, influential technologists, data scientists & founders leading the machine learning revolution. Learn from & connect with industry innovators sharing best practices to improve the development and application of AI in the financial sector.

Featured Speakers:

  • Pilar Roig, VP Real-Time Payments Data Product, MASTERCARD
  • Linda Su, SVP Data & Analytics, CITI BANK
  • David Cass, Managing Director, Chief Information Security Officer, GSR
  • Charles Phiri, PhD, CITP, Executive Director | SME AI/ML Innovation, J.P. MORGAN CHASE
  • Mukund Umalkar, Director of Digital Strategy and Innovation - GenAI, ING WB
  • Philip O’Shaughnessy, Head of Architecture, METRO BANK
  • Gavin Chan, Quant Researcher, J.P. MORGAN ASSET MANAGEMENT
  • Manuj Sarpal, Chief Technology Officer, GRANITESHARES ETFS
  • Arshad Ahmed, Head of Product, AI Platform, LLOYDS BANKING GROUP
  • Beatriz Santos, Senior Data Engineer, HSBC INVESTMENT RESEARCH
  • Ankur Agrawal, Director Technical Performance Management, AXA UK
  • Dara Sosulski, Managing Director, Head of Artificial Intelligence and Model Management, MSS, HSBC
AI in Finance Summit London (External RSVP)

Curious about time series forecasting but not sure where to start?

Join us for a beginner-friendly workshop exploring how linear regression can be used for time series forecasting.

This talk explores the fundamentals of time series forecasting using linear regression models. We’ll cover how to transform time series data into a supervised learning format and use feature engineering techniques to model the various components of the series, including trend, seasonality, outliers, and structural breaks.
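As a rough sketch of that framing (with made-up data, a linear trend column, and monthly dummy features standing in for the workshop's actual feature set):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical monthly series; names and values are illustrative only.
idx = pd.date_range("2018-01-01", periods=84, freq="MS")
y = pd.Series(np.arange(84) * 0.5 + 10 * np.sin(2 * np.pi * np.arange(84) / 12), index=idx)

# Frame the series as a regression problem: deterministic time features become X.
X = pd.DataFrame(index=idx)
X["trend"] = np.arange(len(idx))  # linear trend
month_dummies = pd.get_dummies(idx.month, prefix="month", drop_first=True)
month_dummies.index = idx
X = X.join(month_dummies)         # seasonal dummy features

# Hold out the last 12 months for validation (no shuffling of time order).
X_train, X_test = X.iloc[:-12], X.iloc[-12:]
y_train, y_test = y.iloc[:-12], y.iloc[-12:]

model = LinearRegression().fit(X_train, y_train)
forecast = pd.Series(model.predict(X_test), index=X_test.index)
```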

No prior time series experience is required—just a basic understanding of regression and a desire to learn.

🧠 What you’ll learn:

  • How to frame time series as a regression problem
  • Feature engineering techniques for time-aware data
  • Modeling trend, seasonality, outliers, and breaks
  • Practical tips for evaluation and validation

Whether you’re brushing up on regression or starting your forecasting journey, this workshop will provide hands-on insights in an accessible way.

Forecasting Time Series with Linear Regression: A Feature-Driven Approach