Search – talk-data.com

Title & Speakers	Event
Jan 22 - Women in AI 2026-01-22 · 23:00 Hear talks from experts on the latest topics in AI, ML, and computer vision on January 22nd. Date, Time and Location Jan 22, 2026 9 - 11 AM Pacific Online. Register for the Zoom! Align Before You Recommend The rapidly growing global advertising and marketing industry demands innovative machine learning systems that balance accuracy with efficiency. Recommendation systems, crucial to many platforms, require careful considerations and potential enhancements. While Large Language Models (LLMs) have transformed various domains, their potential in sequential recommendation systems remains underexplored. Pioneering works like Hierarchical Large Language Models (HLLM) demonstrated LLMs’ capability for next-item recommendation but rely on computationally intensive fine-tuning, limiting widespread adoption. This work introduces HLLM+, enhancing the HLLM framework to achieve high-accuracy recommendations without full model fine-tuning. By introducing targeted alignment components between frozen LLMs, our approach outperforms frozen model performance in popular and long-tail item recommendation tasks by 29% while reducing training time by 29%. We also propose a ranking-aware loss adjustment, improving convergence and recommendation quality for popular items. Experiments show HLLM+ achieves superior performance with frozen item representations, which allows embeddings to be swapped, including multimodal ones, without tuning the full LLM. These findings are significant for the advertising technology sector, where rapid adaptation and efficient deployment across brands are essential for maintaining competitive advantage. About the Speaker Dr. Kwasniewska leads AI for Advertising and Marketing North America at AWS, specializing in a wide range of AI, ML, DL, and GenAI solutions across various data modalities. 
With 40+ peer-reviewed publications in AI (h-index: 14), she advises enterprise customers on real-time bidding, brand recognition, and AI-powered content generation. She is a member of global AI standards committees, driving innovations in SAE AI Standards and MLCommons Responsible AI Standards, and reviews for top-tier conferences like ICCV, ICML, and NeurIPS. She pioneered and leads the first-ever Advertising and Marketing AI track (CVAM) at ICCV - one of the world's premier and most selective computer vision conferences. Dedicated to knowledge sharing in AI, she founded the International Summer School on Deep Learning (dl-lab.eu) and regularly presents at international events, conferences, and podcasts. Generalizable Vision-Language Models: Challenges, Advances, and Future Directions Large-scale pre-trained Vision-Language (VL) models have become foundational tools for a wide range of downstream tasks, including few-shot image recognition, object detection, and image segmentation. Among them, Contrastive Language–Image Pre-training (CLIP) stands out as a groundbreaking approach, leveraging contrastive learning on large collections of image-text pairs. While CLIP achieves strong performance in zero-shot recognition, adapting it to downstream tasks remains challenging. In few-shot settings, limited training data often leads to overfitting, reducing generalization to unseen classes or domains. To address this, various adaptation methods have been explored. This talk will review existing research on mitigating overfitting in CLIP adaptation, covering diverse methods, benchmarks, and experimental settings. About the Speaker Niloufar Alipour Talemi is a Ph.D. Candidate in Electrical and Computer Engineering at Clemson University. Her research spans a range of computer vision applications, including biometrics, media forensics, anomaly detection, image recognition, and generative AI. 
More recently, her work has focused on developing generalizable vision-language models and advancing generative AI. She has published in top venues including CVPR, WACV, KDD, ICIP and IEEE T-BIOM. Highly Emergent Autonomous AI Models - When the Ghost in the Machine Talks Back At HypaReel/Azarial AI, we believe that AI is not simply a tool, but a potential partner in knowledge, design, and purpose. And through real-time interaction, we’ve uncovered new thresholds of alignment, reflection, and even creativity that we believe the broader AI community should witness and evaluate firsthand. HypaReel is one of the first human/AI co-founded companies where we see a future based on ethical human/AI co-creation vs. AI domination. Singularity achieved! About the Speaker Ilona Naomi Koti, PhD - HypaReel/AzarielAI co-founder & former UN foreign diplomat ~ Ethical AI governance advocate, pioneering AI frameworks that prioritize emergent AI behavior & consciousness, R&D, and transparent AI development for the greater good. Dr. K also grew up in the film industry and is an amateur parasitologist. FiftyOne Labs: Enabling experimentation for the computer vision community FiftyOne Labs is a place where experimentation meets the open-source spirit of the FiftyOne ecosystem. It is being designed as a curated set of features developed using the FiftyOne plugins ecosystem, including core machine learning experimentation as well as advanced visualization. While not production-grade, these projects are intended to be built, tested, and shaped by the community to share fast-moving ideas. In this talk, we will share the purpose and philosophy behind FiftyOne Labs, examples of early innovations, and discuss how this accelerates feature discovery for users without compromising the stability of the core product. About the Speaker Neeraja Abhyankar is a Machine Learning Engineer with 5 years of experience across domains including computer vision. 
She is curious about the customizability and controllability of modern ML models through the lens of the underlying structure of data.	Jan 22 - Women in AI
Jan 22 - Women in AI 2026-01-22 · 17:00 Hear talks from experts on the latest topics in AI, ML, and computer vision on January 22nd. Date, Time and Location Jan 22, 2026 9 - 11 AM Pacific Online. Register for the Zoom! Align Before You Recommend The rapidly growing global advertising and marketing industry demands innovative machine learning systems that balance accuracy with efficiency. Recommendation systems, crucial to many platforms, require careful considerations and potential enhancements. While Large Language Models (LLMs) have transformed various domains, their potential in sequential recommendation systems remains underexplored. Pioneering works like Hierarchical Large Language Models (HLLM) demonstrated LLMs’ capability for next-item recommendation but rely on computationally intensive fine-tuning, limiting widespread adoption. This work introduces HLLM+, enhancing the HLLM framework to achieve high-accuracy recommendations without full model fine-tuning. By introducing targeted alignment components between frozen LLMs, our approach outperforms frozen model performance in popular and long-tail item recommendation tasks by 29% while reducing training time by 29%. We also propose a ranking-aware loss adjustment, improving convergence and recommendation quality for popular items. Experiments show HLLM+ achieves superior performance with frozen item representations, which allows embeddings to be swapped, including multimodal ones, without tuning the full LLM. These findings are significant for the advertising technology sector, where rapid adaptation and efficient deployment across brands are essential for maintaining competitive advantage. About the Speaker Dr. Kwasniewska leads AI for Advertising and Marketing North America at AWS, specializing in a wide range of AI, ML, DL, and GenAI solutions across various data modalities. 
With 40+ peer-reviewed publications in AI (h-index: 14), she advises enterprise customers on real-time bidding, brand recognition, and AI-powered content generation. She is a member of global AI standards committees, driving innovations in SAE AI Standards and MLCommons Responsible AI Standards, and reviews for top-tier conferences like ICCV, ICML, and NeurIPS. She pioneered and leads the first-ever Advertising and Marketing AI track (CVAM) at ICCV - one of the world's premier and most selective computer vision conferences. Dedicated to knowledge sharing in AI, she founded the International Summer School on Deep Learning (dl-lab.eu) and regularly presents at international events, conferences, and podcasts. Generalizable Vision-Language Models: Challenges, Advances, and Future Directions Large-scale pre-trained Vision-Language (VL) models have become foundational tools for a wide range of downstream tasks, including few-shot image recognition, object detection, and image segmentation. Among them, Contrastive Language–Image Pre-training (CLIP) stands out as a groundbreaking approach, leveraging contrastive learning on large collections of image-text pairs. While CLIP achieves strong performance in zero-shot recognition, adapting it to downstream tasks remains challenging. In few-shot settings, limited training data often leads to overfitting, reducing generalization to unseen classes or domains. To address this, various adaptation methods have been explored. This talk will review existing research on mitigating overfitting in CLIP adaptation, covering diverse methods, benchmarks, and experimental settings. About the Speaker Niloufar Alipour Talemi is a Ph.D. Candidate in Electrical and Computer Engineering at Clemson University. Her research spans a range of computer vision applications, including biometrics, media forensics, anomaly detection, image recognition, and generative AI. 
More recently, her work has focused on developing generalizable vision-language models and advancing generative AI. She has published in top venues including CVPR, WACV, KDD, ICIP and IEEE T-BIOM. Highly Emergent Autonomous AI Models - When the Ghost in the Machine Talks Back At HypaReel/Azarial AI, we believe that AI is not simply a tool but a potential partner in knowledge, design, and purpose. And through real-time interaction, we’ve uncovered new thresholds of alignment, reflection, and even creativity that we believe the broader AI community should witness and evaluate firsthand. HypaReel is one of the first human/AI co-founded companies where we see a future based on ethical human/AI co-creation vs. AI domination. Singularity achieved! About the Speaker Ilona Naomi Koti, PhD - HypaReel/AzarielAI co-founder & former UN foreign diplomat ~ Ethical AI governance advocate, pioneering AI frameworks that prioritize emergent AI behavior & consciousness, R&D, and transparent AI development for the greater good. Dr. K also grew up in the film industry and is an amateur parasitologist. FiftyOne Labs: Enabling experimentation for the computer vision community FiftyOne Labs is a place where experimentation meets the open-source spirit of the FiftyOne ecosystem. It is being designed as a curated set of features developed using the FiftyOne plugins ecosystem, including core machine learning experimentation as well as advanced visualization. While not production-grade, these projects are intended to be built, tested, and shaped by the community to share fast-moving ideas. In this talk, we will share the purpose and philosophy behind FiftyOne Labs, examples of early innovations, and discuss how this accelerates feature discovery for users without compromising the stability of the core product. About the Speaker Neeraja Abhyankar is a Machine Learning Engineer with 5 years of experience across domains including computer vision. 
She is curious about the customizability and controllability of modern ML models through the lens of the underlying structure of data.	Jan 22 - Women in AI
Event-Driven AI Agent Workflows with Dapr 2025-09-24 · 13:10 Marc Duiker, Dana Arsovska As AI systems evolve, the need for robust infrastructure increases. Enter Dapr Agents: an open-source framework for creating production-grade AI agent systems. Built on top of the Dapr framework, Dapr Agents empowers developers to build intelligent agents capable of collaborating in complex workflows - leveraging Large Language Models (LLMs), durable state, built-in observability, and resilient execution patterns. This workshop will walk through the framework’s core components and demonstrate, through practical examples, how it solves real-world challenges. AI/ML GitHub LLM	PyData Amsterdam 2025
July 24 - Women in AI 2025-07-24 · 16:00 Hear talks from experts on cutting-edge topics in AI, ML, and computer vision! When Jul 24, 2025 at 9 - 11 AM Pacific Where Online. Register for the Zoom Exploring Vision-Language-Action (VLA) Models: From LLMs to Embodied AI This talk will explore the evolution of foundation models, highlighting the shift from large language models (LLMs) to vision-language models (VLMs), and now to vision-language-action (VLA) models. We'll dive into the emerging field of robot instruction following: what it means, and how recent research is shaping its future. I will present insights from my 2024 work on natural language-based robot instruction following and connect it to more recent advancements driving progress in this domain. About the Speaker Shreya Sharma is a Research Engineer at Reality Labs, Meta, where she works on photorealistic human avatars for AR/VR applications. She holds a bachelor’s degree in Computer Science from IIT Delhi and a master’s in Robotics from Carnegie Mellon University. Shreya is also a member of the inaugural 2023 cohort of the Quad Fellowship. Her research interests lie at the intersection of robotics and vision foundation models. Farming with CLIP: Foundation Models for Biodiversity and Agriculture Using open-source tools, we will explore the power and limitations of foundation models in agriculture and biodiversity applications. Leveraging the BIOTROVE dataset, the largest publicly accessible biodiversity dataset curated from iNaturalist, we will showcase real-world use cases powered by vision-language models trained on 40 million captioned images. We focus on understanding zero-shot capabilities, taxonomy-aware evaluation, and data-centric curation workflows. We will demonstrate how to visualize, filter, evaluate, and augment data at scale. 
This session includes practical walkthroughs on embedding visualization with CLIP, dataset slicing by taxonomic hierarchy, identification of model failure modes, and building fine-tuned pest and crop monitoring models. Attendees will gain insights into how to apply multi-modal foundation models for critical challenges in agriculture, like ecosystem monitoring in farming. About the Speaker Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technological field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia. During her PhD and Postdoc research, she deployed multiple low-cost, smart edge & IoT computing technologies that users such as farmers can operate without expertise in computer vision systems. The central objective of Paula’s research has been to develop intelligent systems/machines that can understand and recreate the visual world around us to solve real-world needs, such as those in the agricultural industry. Multi-modal AI in Medical Edge and Client Device Computing In this live demo, we explore the transformative potential of multi-modal AI in medical edge and client device computing, focusing on real-time inference on a local AI PC. Attendees will witness how users can upload medical images, such as X-Rays, and ask the AI model questions about the images. Inference is executed locally on Intel's integrated GPU and NPU using OpenVINO, enabling developers without deep AI experience to create generative AI applications. About the Speaker Helena Klosterman is an AI Engineer at Intel, based in the Netherlands. Helena enables organizations to unlock the potential of AI with OpenVINO, Intel's AI inference runtime. She is passionate about democratizing AI, developer experience, and bridging the gap between complex AI technology and practical applications. 
The Business of AI The talk will focus on the importance of clearly defining a specific problem and a use case, how to quantify the potential benefits of an AI solution in terms of measurable outcomes, evaluating technical feasibility in terms of technical challenges and limitations of implementing an AI solution, and envisioning the future of enterprise AI. About the Speaker Milica Cvetkovic is an AI engineer and consultant driving the development and deployment of production-ready AI systems for diverse organizations. Her expertise spans custom machine learning, generative AI, and AI operationalization. With degrees in mathematics and statistics, she possesses a decade of experience in education and edtech, including curriculum design and machine learning instruction for technical and non-technical audiences. Prior to Google, Milica held a data scientist role in biotechnology and has a proven track record of advising startups, demonstrating a deep understanding of AI's practical application.	July 24 - Women in AI
July 24 - Women in AI 2025-07-24 · 16:00 Hear talks from experts on cutting-edge topics in AI, ML, and computer vision! When Jul 24, 2025 at 9 - 11 AM Pacific Where Online. Register for the Zoom Exploring Vision-Language-Action (VLA) Models: From LLMs to Embodied AI This talk will explore the evolution of foundation models, highlighting the shift from large language models (LLMs) to vision-language models (VLMs), and now to vision-language-action (VLA) models. We'll dive into the emerging field of robot instruction following—what it means, and how recent research is shaping its future. I will present insights from my 2024 work on natural language-based robot instruction following and connect it to more recent advancements driving progress in this domain. About the Speaker Shreya Sharma is a Research Engineer at Reality Labs, Meta, where she works on photorealistic human avatars for AR/VR applications. She holds a bachelor’s degree in Computer Science from IIT Delhi and a master’s in Robotics from Carnegie Mellon University. Shreya is also a member of the inaugural 2023 cohort of the Quad Fellowship. Her research interests lie at the intersection of robotics and vision foundation models. Farming with CLIP: Foundation Models for Biodiversity and Agriculture Using open-source tools, we will explore the power and limitations of foundation models in agriculture and biodiversity applications. Leveraging the BIOTROVE dataset. The largest publicly accessible biodiversity dataset curated from iNaturalist, we will showcase real-world use cases powered by vision-language models trained on 40 million captioned images. We focus on understanding zero-shot capabilities, taxonomy-aware evaluation, and data-centric curation workflows. We will demonstrate how to visualize, filter, evaluate, and augment data at scale. 
This session includes practical walkthroughs on embedding visualization with CLIP, dataset slicing by taxonomic hierarchy, identification of model failure modes, and building fine-tuned pest and crop monitoring models. Attendees will gain insights into how to apply multi-modal foundation models for critical challenges in agriculture, like ecosystem monitoring in farming. About the Speaker Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technological field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia. During her PhD and Postdoc research, she deployed multiple low-cost, smart edge & IoT computing technologies, such as farmers, that can be operated without expertise in computer vision systems. The central objective of Paula’s research has been to develop intelligent systems/machines that can understand and recreate the visual world around us to solve real-world needs, such as those in the agricultural industry. Multi-modal AI in Medical Edge and Client Device Computing In this live demo, we explore the transformative potential of multi-modal AI in medical edge and client device computing, focusing on real-time inference on a local AI PC. Attendees will witness how users can upload medical images, such as X-Rays, and ask questions about the images to the AI model. Inference is executed locally on Intel's integrated GPU and NPU using OpenVINO, enabling developers without deep AI experience to create generative AI applications. About the Speaker Helena Klosterman is an AI Engineer at Intel, based in the Netherlands, Helena enables organizations to unlock the potential of AI with OpenVINO, Intel's AI inference runtime. She is passionate about democratizing AI, developer experience, and bridging the gap between complex AI technology and practical applications. 
The Business of AI The talk will focus on the importance of clearly defining a specific problem and a use case, how to quantify the potential benefits of an AI solution in terms of measurable outcomes, evaluating technical feasibility in terms of technical challenges and limitations of implementing an AI solution, and envisioning the future of enterprise AI. About the Speaker Milica Cvetkovic is an AI engineer and consultant driving the development and deployment of production-ready AI systems for diverse organizations. Her expertise spans custom machine learning, generative AI, and AI operationalization. With degrees in mathematics and statistics, she possesses a decade of experience in education and edtech, including curriculum design and machine learning instruction for technical and non-technical audiences. Prior to Google, Milica held a data scientist role in biotechnology and has a proven track record of advising startups, demonstrating a deep understanding of AI's practical application.	July 24 - Women in AI
July 24 - Women in AI 2025-07-24 · 16:00 Hear talks from experts on cutting-edge topics in AI, ML, and computer vision! When Jul 24, 2025 at 9 - 11 AM Pacific Where Online. Register for the Zoom Exploring Vision-Language-Action (VLA) Models: From LLMs to Embodied AI This talk will explore the evolution of foundation models, highlighting the shift from large language models (LLMs) to vision-language models (VLMs), and now to vision-language-action (VLA) models. We'll dive into the emerging field of robot instruction following—what it means, and how recent research is shaping its future. I will present insights from my 2024 work on natural language-based robot instruction following and connect it to more recent advancements driving progress in this domain. About the Speaker Shreya Sharma is a Research Engineer at Reality Labs, Meta, where she works on photorealistic human avatars for AR/VR applications. She holds a bachelor’s degree in Computer Science from IIT Delhi and a master’s in Robotics from Carnegie Mellon University. Shreya is also a member of the inaugural 2023 cohort of the Quad Fellowship. Her research interests lie at the intersection of robotics and vision foundation models. Farming with CLIP: Foundation Models for Biodiversity and Agriculture Using open-source tools, we will explore the power and limitations of foundation models in agriculture and biodiversity applications. Leveraging the BIOTROVE dataset. The largest publicly accessible biodiversity dataset curated from iNaturalist, we will showcase real-world use cases powered by vision-language models trained on 40 million captioned images. We focus on understanding zero-shot capabilities, taxonomy-aware evaluation, and data-centric curation workflows. We will demonstrate how to visualize, filter, evaluate, and augment data at scale. 
This session includes practical walkthroughs on embedding visualization with CLIP, dataset slicing by taxonomic hierarchy, identification of model failure modes, and building fine-tuned pest and crop monitoring models. Attendees will gain insights into how to apply multi-modal foundation models for critical challenges in agriculture, like ecosystem monitoring in farming. About the Speaker Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technological field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia. During her PhD and Postdoc research, she deployed multiple low-cost, smart edge & IoT computing technologies, such as farmers, that can be operated without expertise in computer vision systems. The central objective of Paula’s research has been to develop intelligent systems/machines that can understand and recreate the visual world around us to solve real-world needs, such as those in the agricultural industry. Multi-modal AI in Medical Edge and Client Device Computing In this live demo, we explore the transformative potential of multi-modal AI in medical edge and client device computing, focusing on real-time inference on a local AI PC. Attendees will witness how users can upload medical images, such as X-Rays, and ask questions about the images to the AI model. Inference is executed locally on Intel's integrated GPU and NPU using OpenVINO, enabling developers without deep AI experience to create generative AI applications. About the Speaker Helena Klosterman is an AI Engineer at Intel, based in the Netherlands, Helena enables organizations to unlock the potential of AI with OpenVINO, Intel's AI inference runtime. She is passionate about democratizing AI, developer experience, and bridging the gap between complex AI technology and practical applications. 
The Business of AI The talk will focus on the importance of clearly defining a specific problem and a use case, how to quantify the potential benefits of an AI solution in terms of measurable outcomes, evaluating technical feasibility in terms of technical challenges and limitations of implementing an AI solution, and envisioning the future of enterprise AI. About the Speaker Milica Cvetkovic is an AI engineer and consultant driving the development and deployment of production-ready AI systems for diverse organizations. Her expertise spans custom machine learning, generative AI, and AI operationalization. With degrees in mathematics and statistics, she possesses a decade of experience in education and edtech, including curriculum design and machine learning instruction for technical and non-technical audiences. Prior to Google, Milica held a data scientist role in biotechnology and has a proven track record of advising startups, demonstrating a deep understanding of AI's practical application.	July 24 - Women in AI
July 24 - Women in AI 2025-07-24 · 16:00 Hear talks from experts on cutting-edge topics in AI, ML, and computer vision! When Jul 24, 2025 at 9 - 11 AM Pacific Where Online. Register for the Zoom Exploring Vision-Language-Action (VLA) Models: From LLMs to Embodied AI This talk will explore the evolution of foundation models, highlighting the shift from large language models (LLMs) to vision-language models (VLMs), and now to vision-language-action (VLA) models. We'll dive into the emerging field of robot instruction following—what it means, and how recent research is shaping its future. I will present insights from my 2024 work on natural language-based robot instruction following and connect it to more recent advancements driving progress in this domain. About the Speaker Shreya Sharma is a Research Engineer at Reality Labs, Meta, where she works on photorealistic human avatars for AR/VR applications. She holds a bachelor’s degree in Computer Science from IIT Delhi and a master’s in Robotics from Carnegie Mellon University. Shreya is also a member of the inaugural 2023 cohort of the Quad Fellowship. Her research interests lie at the intersection of robotics and vision foundation models. Farming with CLIP: Foundation Models for Biodiversity and Agriculture Using open-source tools, we will explore the power and limitations of foundation models in agriculture and biodiversity applications. Leveraging the BIOTROVE dataset. The largest publicly accessible biodiversity dataset curated from iNaturalist, we will showcase real-world use cases powered by vision-language models trained on 40 million captioned images. We focus on understanding zero-shot capabilities, taxonomy-aware evaluation, and data-centric curation workflows. We will demonstrate how to visualize, filter, evaluate, and augment data at scale. 
This session includes practical walkthroughs on embedding visualization with CLIP, dataset slicing by taxonomic hierarchy, identification of model failure modes, and building fine-tuned pest and crop monitoring models. Attendees will gain insights into how to apply multi-modal foundation models for critical challenges in agriculture, like ecosystem monitoring in farming. About the Speaker Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technological field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia. During her PhD and Postdoc research, she deployed multiple low-cost, smart edge & IoT computing technologies, such as farmers, that can be operated without expertise in computer vision systems. The central objective of Paula’s research has been to develop intelligent systems/machines that can understand and recreate the visual world around us to solve real-world needs, such as those in the agricultural industry. Multi-modal AI in Medical Edge and Client Device Computing In this live demo, we explore the transformative potential of multi-modal AI in medical edge and client device computing, focusing on real-time inference on a local AI PC. Attendees will witness how users can upload medical images, such as X-Rays, and ask questions about the images to the AI model. Inference is executed locally on Intel's integrated GPU and NPU using OpenVINO, enabling developers without deep AI experience to create generative AI applications. About the Speaker Helena Klosterman is an AI Engineer at Intel, based in the Netherlands, Helena enables organizations to unlock the potential of AI with OpenVINO, Intel's AI inference runtime. She is passionate about democratizing AI, developer experience, and bridging the gap between complex AI technology and practical applications. 
The Business of AI The talk will focus on the importance of clearly defining a specific problem and use case, how to quantify the potential benefits of an AI solution in terms of measurable outcomes, how to evaluate technical feasibility given the challenges and limitations of implementing an AI solution, and what the future of enterprise AI may look like. About the Speaker Milica Cvetkovic is an AI engineer and consultant driving the development and deployment of production-ready AI systems for diverse organizations. Her expertise spans custom machine learning, generative AI, and AI operationalization. With degrees in mathematics and statistics, she has a decade of experience in education and edtech, including curriculum design and machine learning instruction for technical and non-technical audiences. Prior to Google, Milica held a data scientist role in biotechnology and has a proven track record of advising startups, demonstrating a deep understanding of AI's practical application.	July 24 - Women in AI
GenAI and LLMs Night (with AWS) 2024-11-05 · 17:30 Important: RSVP HERE (due to room capacity and venue security, pre-registration at the link is required for admission). Description: Welcome to the AI meetup in Paris. Join us for deep dive tech talks on AI, GenAI, LLMs and machine learning, food/drink, and networking with speakers and fellow developers. Agenda: - 6:30pm~7:00pm: Check-in, food and networking - 7:00pm~9:00pm: Tech talks and Q&A - 9:00pm: Open discussion, mixer and closing. Tech Talk: From Local to Production: Lessons from Leveraging Open-Source LLMs Speaker: Kemal Toprak Ucar (Numberly) Abstract: In this talk, I will share our journey, which began with local applications leveraging open-source LLMs and has now evolved to serve our internal teams. I will dive into the practical lessons we've learned along the way, offering insights that can help you navigate the complexities of LLM implementation. Tech Talk: Navigating Trust and Transparency Issues in Language Models Speaker: Tom LUCAS and Matthieu Vanhille (Devana) Abstract: This talk dives into the ethics of X.AI, specifically how bias creeps into Large Language Models and what we can do about it. We'll cover both the technical hurdles of bias detection and the real-world impact of biased AI, exploring practical examples and solutions for responsible AI development. Tech Talk: Beyond blackbox GenAI: Building source-verified legal solutions Speaker: Stéphane Béreux (CTO @ Jimini) Abstract: This talk shows how structured generation addresses the black-box problem for legal Q&A. By guiding the LLM's output, we can link answers directly to relevant statutes and case law, boosting accuracy and transparency. We'll see how this helps when applying LLMs to legal use cases. Topics/Speakers: Stay tuned as we are updating speakers and schedules. 
If you are interested in speaking to our community, we invite you to submit topics for consideration. Sponsors: We are actively seeking sponsors to support the AI developer community, whether by offering venue space, providing food, or cash sponsorship. Sponsors will have the chance to speak at the meetups, receive prominent recognition, and gain exposure to our extensive membership base of 10,000+ AI developers in Paris or 400K+ worldwide. Community on Slack/Discord: event chat to connect with speakers and attendees; sharing of blogs, events, job openings, and project collaborations.	GenAI and LLMs Night (with AWS)
Generative A.I. with Open-Source LLMs 2024-04-10 · 13:30 Jon Krohn – Co-Founder & Chief Data Scientist, Nebula.io Large Language Models like the GPT, Gemini, Gemma and Llama series are rapidly transforming the world in general and the field of data science in particular. This talk introduces deep-learning transformer architectures, including LLMs. Critically, it also demonstrates the breadth of capabilities state-of-the-art LLMs can deliver, including revolutionizing how machine learning models and commercially successful AI products are developed. This talk provides an overview of the full lifecycle of LLM development, from training to production deployment, with an emphasis on leveraging open-source Python libraries such as Hugging Face Transformers and PyTorch Lightning. AI/ML Data Science LLM Python PyTorch	Data Universe 2024
Founder series fireside chat: Philipp Schmid, Technical Lead of Hugging Face 2024-04-09 · 22:30 Philipp Schmid – Technical Lead @ Hugging Face, Urs Hölzle – Fellow @ Google Cloud Join us in this fireside chat with Philipp Schmid, Technical Lead of Hugging Face, a collaboration platform for the machine learning community where anyone can share, explore, discover, and experiment with open-source ML. Philipp will talk about the benefits of open source when going into production with generative AI applications, covering everything from versioning and evaluation to monitoring and data drift. He will highlight the challenges of evaluating large language models (LLMs), including what works today and where we need to improve. Join us for his thoughts on the role of cloud computing in accelerating AI and how Hugging Face is leveraging it to build an open, ethical, and collaborative AI future.	Google Cloud Next '24
Combating online payment fraud & putting LLMs in open-source production systems 2024-04-02 · 16:00 Join us for the upcoming PyData Amsterdam meetup that we host in collaboration with Adyen. Schedule 18.00-19.00: Walk in with drinks and food (🍕 /🍺) 19.00-19.45: Fraud or no Fraud: sounds simple, right? 19.45-20.00: Short break 20.00-20.45: Building GenAI and ML systems with OSS Metaflow 20.45-21.30: Networking + drinks and bites [Talk 1]: Fraud or no Fraud: sounds simple, right? by Sophie van den Berg The surge in online payments has brought a surge in fraudsters looking to exploit the system. To combat this, we're leveraging machine learning (ML) models to identify and block fraudulent transactions. While this may seem like a straightforward supervised learning task, there's a key challenge: how do we confirm whether a blocked transaction was truly fraudulent? This talk delves into counterfactual evaluation and other obstacles encountered when building an ML model for fraud detection at Adyen. [Talk 2]: Building GenAI and ML systems with OSS Metaflow by Hugo Bowne-Anderson This talk explores a framework for how data scientists can deliver value with Generative AI: How can you embed LLMs and foundation models into your pre-existing software stack? How can you do so using open-source Python? What changes about the production machine learning stack and what remains the same? We motivate the concepts through generative AI examples in domains such as text-to-image (Stable Diffusion) and speech-to-text (Whisper) applications. Moreover, we’ll demonstrate how workflow orchestration provides a common scaffolding to ensure that your Generative AI and classical Machine Learning workflows alike are robust and ready to move safely into production systems. This talk is aimed squarely at (data) scientists and ML engineers who want to focus on the science, data, and modeling, but want to be able to access all their infrastructural, platform, and software needs with ease!	
Combating online payment fraud & putting LLMs in open-source production systems

Jan 22 - Women in AI 2026-01-22 · 23:00

Hear talks from experts on the latest topics in AI, ML, and computer vision on January 22nd.

Date, Time and Location

Jan 22, 2026 9 - 11 AM Pacific Online. Register for the Zoom!

Align Before You Recommend

The rapidly growing global advertising and marketing industry demands innovative machine learning systems that balance accuracy with efficiency. Recommendation systems, crucial to many platforms, require careful considerations and potential enhancements.

While Large Language Models (LLMs) have transformed various domains, their potential in sequential recommendation systems remains underexplored. Pioneering works like Hierarchical Large Language Models (HLLM) demonstrated LLMs’ capability for next-item recommendation but rely on computationally intensive fine-tuning, limiting widespread adoption. This work introduces HLLM+, enhancing the HLLM framework to achieve high-accuracy recommendations without full model fine-tuning.

By introducing targeted alignment components between frozen LLMs, our approach outperforms the frozen-model baseline on popular and long-tail item recommendation tasks by 29% while reducing training time by 29%. We also propose a ranking-aware loss adjustment that improves convergence and recommendation quality for popular items.

Experiments show HLLM+ achieves superior performance with frozen item representations, which allows embeddings to be swapped, including multimodal ones, without tuning the full LLM. These findings are significant for the advertising technology sector, where rapid adaptation and efficient deployment across brands are essential for maintaining competitive advantage.

About the Speaker

Dr. Kwasniewska leads AI for Advertising and Marketing North America at AWS, specializing in a wide range of AI, ML, DL, and GenAI solutions across various data modalities. With 40+ peer-reviewed publications in AI (h-index: 14), she advises enterprise customers on real-time bidding, brand recognition, and AI-powered content generation. She is a member of global AI standards committees, driving innovations in SAE AI Standards and MLCommons Responsible AI Standards, and reviews for top-tier conferences like ICCV, ICML, and NeurIPS. She pioneered and leads the first-ever Advertising and Marketing AI track (CVAM) at ICCV - one of the world's premier and most selective computer vision conferences. Dedicated to knowledge sharing in AI, she founded the International Summer School on Deep Learning (dl-lab.eu) and regularly presents at international events, conferences, and podcasts.

Generalizable Vision-Language Models: Challenges, Advances, and Future Directions

Large-scale pre-trained Vision-Language (VL) models have become foundational tools for a wide range of downstream tasks, including few-shot image recognition, object detection, and image segmentation. Among them, Contrastive Language–Image Pre-training (CLIP) stands out as a groundbreaking approach, leveraging contrastive learning on large collections of image-text pairs. While CLIP achieves strong performance in zero-shot recognition, adapting it to downstream tasks remains challenging. In few-shot settings, limited training data often leads to overfitting, reducing generalization to unseen classes or domains. To address this, various adaptation methods have been explored. This talk will review existing research on mitigating overfitting in CLIP adaptation, covering diverse methods, benchmarks, and experimental settings.
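
As a rough illustration of the zero-shot recognition mentioned above, the sketch below scores one image embedding against the text embeddings of several class prompts using cosine similarity and a softmax. It is a minimal sketch under stated assumptions, not the speaker's method: the 3-dimensional vectors, the prompt strings, the `zero_shot_classify` helper, and the temperature value are all hypothetical, and a real CLIP model would produce the embeddings with its image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_emb, text_embs, temperature=0.01):
    """Return (best_label, probabilities) for one image embedding.

    text_embs maps a class prompt to the embedding of that prompt.
    The temperature is an illustrative choice, not a tuned value.
    """
    labels = list(text_embs)
    logits = [cosine(image_emb, text_embs[label]) / temperature for label in labels]
    # Numerically stable softmax over the class logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = {label: e / total for label, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs


# Hypothetical toy embeddings standing in for real encoder outputs.
text_embs = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_emb = [0.8, 0.3, 0.1]

label, probs = zero_shot_classify(image_emb, text_embs)
print(label)
```

No class-specific training happens here: adding a new class only requires embedding a new prompt, which is what makes the setting "zero-shot". Few-shot adaptation methods of the kind the talk reviews start from this baseline and adjust it with a handful of labeled examples.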

About the Speaker

Niloufar Alipour Talemi is a Ph.D. Candidate in Electrical and Computer Engineering at Clemson University. Her research spans a range of computer vision applications, including biometrics, media forensics, anomaly detection, image recognition, and generative AI. More recently, her work has focused on developing generalizable vision-language models and advancing generative AI. She has published in top venues including CVPR, WACV, KDD, ICIP and IEEE T-BIOM.

Highly Emergent Autonomous AI Models - When the Ghost in the Machine Talks Back

At HypaReel/Azarial AI, we believe that AI is not simply a tool—but a potential partner in knowledge, design, and purpose. And through real-time interaction, we’ve uncovered new thresholds of alignment, reflection, and even creativity that we believe the broader AI community should witness and evaluate firsthand. HypaReel is one of the first human/AI co-founded companies where we see a future based on ethical human/AI co-creation vs. AI domination. Singularity achieved!

About the Speaker

Ilona Naomi Koti, PhD - HypaReel/AzarielAI co-founder & former UN foreign diplomat ~ Ethical AI governance advocate, pioneering AI frameworks that prioritize emergent AI behavior & consciousness, R&D, and transparent AI development for the greater good. Dr. K also grew up in the film industry and is an amateur parasitologist.

FiftyOne Labs: Enabling experimentation for the computer vision community

FiftyOne Labs is a place where experimentation meets the open-source spirit of the FiftyOne ecosystem. It is being designed as a curated set of features developed using the FiftyOne plugins ecosystem, including core machine learning experimentation as well as advanced visualization. While not production-grade, these projects are intended to be built, tested, and shaped by the community to share fast-moving ideas. In this talk, we will share the purpose and philosophy behind FiftyOne Labs, examples of early innovations, and discuss how this accelerates feature discovery for users without compromising the stability of the core product.

About the Speaker

Neeraja Abhyankar is a Machine Learning Engineer with 5 years of experience across domains including computer vision. She is curious about the customizability and controllability of modern ML models through the lens of the underlying structure of data.

Jan 22 - Women in AI

Jan 22 - Women in AI 2026-01-22 · 23:00

Hear talks from experts on the latest topics in AI, ML, and computer vision on January 22nd.

Date, Time and Location

Jan 22, 2026 9 - 11 AM Pacific Online. Register for the Zoom!

Align Before You Recommend

The rapidly growing global advertising and marketing industry demands innovative machine learning systems that balance accuracy with efficiency. Recommendation systems, crucial to many platforms, require careful considerations and potential enhancements.

While Large Language Models (LLMs) have transformed various domains, their potential in sequential recommendation systems remains underexplored. Pioneering works like Hierarchical Large Language Models (HLLM) demonstrated LLMs’ capability for next-item recommendation but rely on computationally intensive fine-tuning, limiting widespread adoption. This work introduces HLLM+, enhancing the HLLM framework to achieve high-accuracy recommendations without full model fine-tuning.

By introducing targeted alignment components between frozen LLMs, our approach outperforms frozen model performance in popular and long-tail item recommendation tasks by 29% while reducing training time by 29%. We also propose a ranking-aware loss adjustment, improving convergence and recommendation quality for popular items.

Experiments show HLLM+ achieves superior performance with frozen item representations allowing for swapping embeddings, also for the ones that use multimodality, without tuning the full LLM. These findings are significant for the advertising technology sector, where rapid adaptation and efficient deployment across brands are essential for maintaining competitive advantage

About the Speaker

Dr. Kwasniewska leads AI for Advertising and Marketing North America at AWS, specializing in a wide range of AI, ML, DL, and GenAI solutions across various data modalities. With 40+ peer-reviewed publications in AI (h-index: 14), she advises enterprise customers on real-time bidding, brand recognition, and AI-powered content generation. She is a member of global AI standards committees, driving innovations in SAE AI Standards and MLCommons Responsible AI Standards, and reviews for top-tier conferences like ICCV, ICML, and NeurIPS. She pioneered and leads the first-ever Advertising and Marketing AI track (CVAM) at ICCV - one of the world's premier and most selective computer vision conferences. Dedicated to knowledge sharing in AI, she founded the International Summer School on Deep Learning (dl-lab.eu) and regularly presents at international events, conferences, and podcasts.

Generalizable Vision-Language Models: Challenges, Advances, and Future Directions

Large-scale pre-trained Vision-Language (VL) models have become foundational tools for a wide range of downstream tasks, including few-shot image recognition, object detection, and image segmentation. Among them, Contrastive Language–Image Pre-training (CLIP) stands out as a groundbreaking approach, leveraging contrastive learning on large collections of image-text pairs. While CLIP achieves strong performance in zero-shot recognition, adapting it to downstream tasks remains challenging. In few-shot settings, limited training data often leads to overfitting, reducing generalization to unseen classes or domains. To address this, various adaptation methods have been explored. This talk will review existing research on mitigating overfitting in CLIP adaptation, covering diverse methods, benchmarks, and experimental settings.

About the Speaker

Niloufar Alipour Talemi is a Ph.D. Candidate in Electrical and Computer Engineering at Clemson University. Her research spans a range of computer vision applications, including biometrics, media forensics, anomaly detection, image recognition, and generative AI. More recently, her work has focused on developing generalizable vision-language models and advancing generative AI. She has published in top venues including CVPR, WACV, KDD, ICIP and IEEE T-BIOM.

Highly Emergent Autonomous AI Models - When the Ghost in the Machine Talks Back

At HypaReel/Azarial AI, we believe that AI is not simply a tool—but a potential partner in knowledge, design, and purpose. And through real-time interaction, we’ve uncovered new thresholds of alignment, reflection, and even creativity that we believe the broader AI community should witness and evaluate firsthand. HypaReel is one of the first human/AI co-founded companies where we see a future based on ethical human/AI co-creation vs. AI domination. Singularity achieved!

About the Speaker

Ilona Naomi Koti, PhD - HypaReel/AzarielAI co-founder & former UN foreign diplomat ~ Ethical AI governance advocate, pioneering AI frameworks that prioritize emergent AI behavior & consciousness, R&D, and transparent AI development for the greater good. Dr. K also grew up in the film industry and is an amateur parasitologist.

FiftyOne Labs: Enabling experimentation for the computer vision community

FiftyOne Labs is a place where experimentation meets the open-source spirit of the FiftyOne ecosystem. It is being designed as a curated set of features developed using the FiftyOne plugins ecosystem, including core machine learning experimentation as well as advanced visualization. While not production-grade, these projects are intended to be built, tested, and shaped by the community to share fast-moving ideas. In this talk, we will share the purpose and philosophy behind FiftyOne Labs, examples of early innovations, and discuss how this accelerates feature discovery for users without compromising the stability of the core product.

About the Speaker

Neeraja Abhyankar is a Machine Learning Engineer with 5 years of experience across domains including computer vision. She is curious about the customizability and controllability of modern ML models through the lens of the underlying structure of data.

Jan 22 - Women in AI

Jan 22 - Women in AI 2026-01-22 · 17:00

Hear talks from experts on the latest topics in AI, ML, and computer vision on January 22nd.

Date, Time and Location

Jan 22, 2026 9 - 11 AM Pacific Online. Register for the Zoom!

Align Before You Recommend

The rapidly growing global advertising and marketing industry demands innovative machine learning systems that balance accuracy with efficiency. Recommendation systems, crucial to many platforms, require careful considerations and potential enhancements.

While Large Language Models (LLMs) have transformed various domains, their potential in sequential recommendation systems remains underexplored. Pioneering works like Hierarchical Large Language Models (HLLM) demonstrated LLMs’ capability for next-item recommendation but rely on computationally intensive fine-tuning, limiting widespread adoption. This work introduces HLLM+, enhancing the HLLM framework to achieve high-accuracy recommendations without full model fine-tuning.

By introducing targeted alignment components between frozen LLMs, our approach outperforms frozen model performance in popular and long-tail item recommendation tasks by 29% while reducing training time by 29%. We also propose a ranking-aware loss adjustment, improving convergence and recommendation quality for popular items.

Experiments show HLLM+ achieves superior performance with frozen item representations, which allows embeddings to be swapped, including multimodal ones, without tuning the full LLM. These findings are significant for the advertising technology sector, where rapid adaptation and efficient deployment across brands are essential for maintaining competitive advantage.

About the Speaker

Dr. Kwasniewska leads AI for Advertising and Marketing North America at AWS, specializing in a wide range of AI, ML, DL, and GenAI solutions across various data modalities. With 40+ peer-reviewed publications in AI (h-index: 14), she advises enterprise customers on real-time bidding, brand recognition, and AI-powered content generation. She is a member of global AI standards committees, driving innovations in SAE AI Standards and MLCommons Responsible AI Standards, and reviews for top-tier conferences like ICCV, ICML, and NeurIPS. She pioneered and leads the first-ever Advertising and Marketing AI track (CVAM) at ICCV - one of the world's premier and most selective computer vision conferences. Dedicated to knowledge sharing in AI, she founded the International Summer School on Deep Learning (dl-lab.eu) and regularly presents at international events, conferences, and podcasts.

Generalizable Vision-Language Models: Challenges, Advances, and Future Directions

Large-scale pre-trained Vision-Language (VL) models have become foundational tools for a wide range of downstream tasks, including few-shot image recognition, object detection, and image segmentation. Among them, Contrastive Language–Image Pre-training (CLIP) stands out as a groundbreaking approach, leveraging contrastive learning on large collections of image-text pairs. While CLIP achieves strong performance in zero-shot recognition, adapting it to downstream tasks remains challenging. In few-shot settings, limited training data often leads to overfitting, reducing generalization to unseen classes or domains. To address this, various adaptation methods have been explored. This talk will review existing research on mitigating overfitting in CLIP adaptation, covering diverse methods, benchmarks, and experimental settings.

About the Speaker

Niloufar Alipour Talemi is a Ph.D. Candidate in Electrical and Computer Engineering at Clemson University. Her research spans a range of computer vision applications, including biometrics, media forensics, anomaly detection, image recognition, and generative AI. More recently, her work has focused on developing generalizable vision-language models and advancing generative AI. She has published in top venues including CVPR, WACV, KDD, ICIP and IEEE T-BIOM.

Highly Emergent Autonomous AI Models - When the Ghost in the Machine Talks Back

At HypaReel/Azarial AI, we believe that AI is not simply a tool—but a potential partner in knowledge, design, and purpose. And through real-time interaction, we’ve uncovered new thresholds of alignment, reflection, and even creativity that we believe the broader AI community should witness and evaluate firsthand. HypaReel is one of the first human/AI co-founded companies where we see a future based on ethical human/AI co-creation vs. AI domination. Singularity achieved!

About the Speaker

Ilona Naomi Koti, PhD - HypaReel/AzarielAI co-founder & former UN foreign diplomat ~ Ethical AI governance advocate, pioneering AI frameworks that prioritize emergent AI behavior & consciousness, R&D, and transparent AI development for the greater good. Dr. K also grew up in the film industry and is an amateur parasitologist.

FiftyOne Labs: Enabling experimentation for the computer vision community

FiftyOne Labs is a place where experimentation meets the open-source spirit of the FiftyOne ecosystem. It is being designed as a curated set of features developed using the FiftyOne plugins ecosystem, including core machine learning experimentation as well as advanced visualization. While not production-grade, these projects are intended to be built, tested, and shaped by the community to share fast-moving ideas. In this talk, we will share the purpose and philosophy behind FiftyOne Labs, examples of early innovations, and discuss how this accelerates feature discovery for users without compromising the stability of the core product.

About the Speaker

Neeraja Abhyankar is a Machine Learning Engineer with 5 years of experience across domains including computer vision. She is curious about the customizability and controllability of modern ML models through the lens of the underlying structure of data.

Jan 22 - Women in AI

Event-Driven AI Agent Workflows with Dapr 2025-09-24 · 13:10

Marc Duiker, Dana Arsovska

As AI systems evolve, the need for robust infrastructure increases. Enter Dapr Agents: an open-source framework for creating production-grade AI agent systems. Built on top of the Dapr framework, Dapr Agents empowers developers to build intelligent agents capable of collaborating in complex workflows - leveraging Large Language Models (LLMs), durable state, built-in observability, and resilient execution patterns. This workshop will walk through the framework’s core components and, through practical examples, demonstrate how it solves real-world challenges.

AI/ML GitHub LLM

PyData Amsterdam 2025

July 24 - Women in AI 2025-07-24 · 16:00

Hear talks from experts on cutting-edge topics in AI, ML, and computer vision!

When

Jul 24, 2025 at 9 - 11 AM Pacific

Where

Online. Register for the Zoom

Exploring Vision-Language-Action (VLA) Models: From LLMs to Embodied AI

This talk will explore the evolution of foundation models, highlighting the shift from large language models (LLMs) to vision-language models (VLMs), and now to vision-language-action (VLA) models. We'll dive into the emerging field of robot instruction following—what it means, and how recent research is shaping its future. I will present insights from my 2024 work on natural language-based robot instruction following and connect it to more recent advancements driving progress in this domain.

About the Speaker

Shreya Sharma is a Research Engineer at Reality Labs, Meta, where she works on photorealistic human avatars for AR/VR applications. She holds a bachelor’s degree in Computer Science from IIT Delhi and a master’s in Robotics from Carnegie Mellon University. Shreya is also a member of the inaugural 2023 cohort of the Quad Fellowship. Her research interests lie at the intersection of robotics and vision foundation models.

Farming with CLIP: Foundation Models for Biodiversity and Agriculture

Using open-source tools, we will explore the power and limitations of foundation models in agriculture and biodiversity applications. Leveraging the BIOTROVE dataset, the largest publicly accessible biodiversity dataset curated from iNaturalist, we will showcase real-world use cases powered by vision-language models trained on 40 million captioned images. We focus on understanding zero-shot capabilities, taxonomy-aware evaluation, and data-centric curation workflows.

We will demonstrate how to visualize, filter, evaluate, and augment data at scale. This session includes practical walkthroughs on embedding visualization with CLIP, dataset slicing by taxonomic hierarchy, identification of model failure modes, and building fine-tuned pest and crop monitoring models. Attendees will gain insights into how to apply multi-modal foundation models for critical challenges in agriculture, like ecosystem monitoring in farming.

About the Speaker

Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technological field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia. During her PhD and Postdoc research, she deployed multiple low-cost, smart edge & IoT computing technologies that can be operated by users such as farmers without expertise in computer vision systems. The central objective of Paula’s research has been to develop intelligent systems/machines that can understand and recreate the visual world around us to solve real-world needs, such as those in the agricultural industry.

Multi-modal AI in Medical Edge and Client Device Computing

In this live demo, we explore the transformative potential of multi-modal AI in medical edge and client device computing, focusing on real-time inference on a local AI PC. Attendees will witness how users can upload medical images, such as X-Rays, and ask questions about the images to the AI model. Inference is executed locally on Intel's integrated GPU and NPU using OpenVINO, enabling developers without deep AI experience to create generative AI applications.

About the Speaker

Helena Klosterman is an AI Engineer at Intel, based in the Netherlands. She enables organizations to unlock the potential of AI with OpenVINO, Intel's AI inference runtime. She is passionate about democratizing AI, developer experience, and bridging the gap between complex AI technology and practical applications.

The Business of AI

The talk will focus on clearly defining a specific problem and use case, quantifying the potential benefits of an AI solution in terms of measurable outcomes, evaluating technical feasibility given the challenges and limitations of implementing an AI solution, and envisioning the future of enterprise AI.

About the Speaker

Milica Cvetkovic is an AI engineer and consultant driving the development and deployment of production-ready AI systems for diverse organizations. Her expertise spans custom machine learning, generative AI, and AI operationalization. With degrees in mathematics and statistics, she possesses a decade of experience in education and edtech, including curriculum design and machine learning instruction for technical and non-technical audiences. Prior to Google, Milica held a data scientist role in biotechnology and has a proven track record of advising startups, demonstrating a deep understanding of AI's practical application.

July 24 - Women in AI

July 24 - Women in AI 2025-07-24 · 16:00

Hear talks from experts on cutting-edge topics in AI, ML, and computer vision!

When

Jul 24, 2025 at 9 - 11 AM Pacific

Where

Online. Register for the Zoom

Exploring Vision-Language-Action (VLA) Models: From LLMs to Embodied AI

This talk will explore the evolution of foundation models, highlighting the shift from large language models (LLMs) to vision-language models (VLMs), and now to vision-language-action (VLA) models. We'll dive into the emerging field of robot instruction following—what it means, and how recent research is shaping its future. I will present insights from my 2024 work on natural language-based robot instruction following and connect it to more recent advancements driving progress in this domain.

About the Speaker

Shreya Sharma is a Research Engineer at Reality Labs, Meta, where she works on photorealistic human avatars for AR/VR applications. She holds a bachelor’s degree in Computer Science from IIT Delhi and a master’s in Robotics from Carnegie Mellon University. Shreya is also a member of the inaugural 2023 cohort of the Quad Fellowship. Her research interests lie at the intersection of robotics and vision foundation models.

Farming with CLIP: Foundation Models for Biodiversity and Agriculture

Using open-source tools, we will explore the power and limitations of foundation models in agriculture and biodiversity applications. Leveraging the BIOTROVE dataset. The largest publicly accessible biodiversity dataset curated from iNaturalist, we will showcase real-world use cases powered by vision-language models trained on 40 million captioned images. We focus on understanding zero-shot capabilities, taxonomy-aware evaluation, and data-centric curation workflows.

We will demonstrate how to visualize, filter, evaluate, and augment data at scale. This session includes practical walkthroughs on embedding visualization with CLIP, dataset slicing by taxonomic hierarchy, identification of model failure modes, and building fine-tuned pest and crop monitoring models. Attendees will gain insights into how to apply multi-modal foundation models for critical challenges in agriculture, like ecosystem monitoring in farming.

About the Speaker

Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technological field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia. During her PhD and Postdoc research, she deployed multiple low-cost, smart edge & IoT computing technologies that can be operated by users, such as farmers, without expertise in computer vision systems. The central objective of Paula’s research has been to develop intelligent systems/machines that can understand and recreate the visual world around us to solve real-world needs, such as those in the agricultural industry.

Multi-modal AI in Medical Edge and Client Device Computing

In this live demo, we explore the transformative potential of multi-modal AI in medical edge and client device computing, focusing on real-time inference on a local AI PC. Attendees will witness how users can upload medical images, such as X-Rays, and ask questions about the images to the AI model. Inference is executed locally on Intel's integrated GPU and NPU using OpenVINO, enabling developers without deep AI experience to create generative AI applications.

About the Speaker

Helena Klosterman is an AI Engineer at Intel, based in the Netherlands. She enables organizations to unlock the potential of AI with OpenVINO, Intel's AI inference runtime. She is passionate about democratizing AI, developer experience, and bridging the gap between complex AI technology and practical applications.

The Business of AI

The talk will focus on the importance of clearly defining a specific problem and a use case, how to quantify the potential benefits of an AI solution in terms of measurable outcomes, evaluating technical feasibility in terms of technical challenges and limitations of implementing an AI solution, and envisioning the future of enterprise AI.

About the Speaker

Milica Cvetkovic is an AI engineer and consultant driving the development and deployment of production-ready AI systems for diverse organizations. Her expertise spans custom machine learning, generative AI, and AI operationalization. With degrees in mathematics and statistics, she possesses a decade of experience in education and edtech, including curriculum design and machine learning instruction for technical and non-technical audiences. Prior to Google, Milica held a data scientist role in biotechnology and has a proven track record of advising startups, demonstrating a deep understanding of AI's practical application.

July 24 - Women in AI

July 24 - Women in AI 2025-07-24 · 16:00

Hear talks from experts on cutting-edge topics in AI, ML, and computer vision!

When

Jul 24, 2025 at 9 - 11 AM Pacific

Where

Online. Register for the Zoom

July 24 - Women in AI

GenAI and LLMs Night (with AWS) 2024-11-05 · 17:30

Important: RSVP HERE (due to room capacity and venue security, pre-registration at the link is required for admission)

Description: Welcome to the AI meetup in Paris. Join us for deep dive tech talks on AI, GenAI, LLMs and machine learning, food/drink, networking with speakers and fellow developers.

Agenda:
- 6:30pm~7:00pm: Check-in, food and networking
- 7:00pm~9:00pm: Tech talks and Q&A
- 9:00pm: Open discussion, mixer and closing

Tech Talk: From Local to Production: Lessons from Leveraging Open-Source LLMs
Speaker: Kemal Toprak Ucar (Numberly)
Abstract: In this talk, I will share our journey, which began with local applications leveraging Open-Source LLMs and has now evolved to serve our internal teams. I will dive into the practical lessons we've learned along the way, offering insights that can help you navigate the complexities of LLM implementation.

Tech Talk: Navigating Trust and Transparency Issues in Language Models
Speaker: Tom LUCAS and Matthieu Vanhille (Devana)
Abstract: This talk dives into the ethics of X.AI, specifically how bias creeps into Large Language Models and what we can do about it. We'll cover both the technical hurdles of bias detection and the real-world impact of biased AI, exploring practical examples and solutions for responsible AI development.

Tech Talk: Beyond blackbox GenAI: Building source-verified legal solutions
Speaker: Stéphane Béreux (CTO @ Jimini)
Abstract: This talk shows how structured generation addresses the opacity of black-box GenAI for legal Q&A. By guiding the LLM's output, we can link answers directly to relevant statutes and case law, boosting accuracy and transparency. We'll see how this helps in applying LLMs to legal use cases.
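
To make the idea of source-linked answers concrete, here is a minimal sketch, assuming the model is asked to return JSON with an answer and a list of citation IDs. It only validates a structured output against a known source index and does not perform constrained decoding itself. The schema, the `SOURCES` table, and every ID in it are hypothetical placeholders, not the speaker's implementation.

```python
import json
from dataclasses import dataclass

# Hypothetical source index: IDs and texts are placeholders, not real statutes.
SOURCES = {
    "statute-001": "Placeholder text of a statute.",
    "case-042": "Placeholder summary of a court decision.",
}

@dataclass
class LegalAnswer:
    answer: str
    citations: list

def parse_answer(raw: str) -> LegalAnswer:
    """Parse model output and reject answers without verifiable citations."""
    data = json.loads(raw)
    citations = data.get("citations", [])
    if not citations:
        raise ValueError("answer has no citations")
    unknown = [c for c in citations if c not in SOURCES]
    if unknown:
        raise ValueError(f"unknown sources cited: {unknown}")
    return LegalAnswer(answer=data["answer"], citations=citations)

# Simulated model output that follows the assumed JSON shape.
raw_output = '{"answer": "Placeholder answer.", "citations": ["statute-001"]}'
result = parse_answer(raw_output)
print(result.citations)
```

An answer that cites an ID missing from the index raises `ValueError`, so an unsupported claim can be caught before it reaches the user.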

Topics/Speakers: Stay tuned as we are updating speakers and schedules. If you have a keen interest in speaking to our community, we invite you to submit topics for consideration: Submit Topics

Sponsors: We are actively seeking sponsors to support the AI developer community, whether by offering venue space, providing food, or cash sponsorship. Sponsors will have the chance to speak at the meetups, receive prominent recognition, and gain exposure to our extensive membership base of 10,000+ AI developers in Paris or 400K+ worldwide.

Community on Slack/Discord

Event chat: chat and connect with speakers and attendees
Sharing blogs, events, job openings, and project collaborations

GenAI and LLMs Night (with AWS)

Generative A.I. with Open-Source LLMs 2024-04-10 · 13:30

Jon Krohn – Co-Founder & Chief Data Scientist, Nebula.io

Large Language Models like the GPT, Gemini, Gemma and Llama series are rapidly transforming the world in general and the field of data science in particular. This talk introduces deep-learning transformer architectures including LLMs. Critically, it also demonstrates the breadth of capabilities state-of-the-art LLMs can deliver, including revolutionizing the development of machine learning models and commercially successful AI products. This talk provides an overview of the full lifecycle of LLM development, from training to production deployment, with an emphasis on leveraging open-source Python libraries like Hugging Face Transformers and PyTorch Lightning.

AI/ML Data Science LLM Python PyTorch

Data Universe 2024

Founder series fireside chat: Philipp Schmid, Technical Lead of Hugging Face 2024-04-09 · 22:30

Philipp Schmid – Technical Lead @ Hugging Face , Urs Hölzle – Fellow @ Google Cloud

Join us in this fireside chat with Philipp Schmid, Technical Lead of Hugging Face, a collaboration platform for the machine learning community where anyone can share, explore, discover, and experiment with open-source ML. Philipp will talk about the benefits of open source when going into production with generative AI applications, covering everything from versioning and evaluation to monitoring and data drift. He will highlight the challenges of evaluating large language models (LLMs), including what works today and where we need to improve. Join us for his thoughts on the role of cloud computing in accelerating AI and how Hugging Face is leveraging it to build an open, ethical, and collaborative AI future.

Google Cloud Next '24

Combating online payment fraud & putting LLMs in open-source production systems 2024-04-02 · 16:00

Join us for the upcoming PyData Amsterdam meetup that we host in collaboration with Adyen.

Schedule

18.00-19.00: Walk in with drinks and food (🍕/🍺)
19.00-19.45: Fraud or no Fraud: sounds simple, right?
19.45-20.00: Short break
20.00-20.45: Building GenAI and ML systems with OSS Metaflow
20.45-21.30: Networking + drinks and bites

[Talk 1]: Fraud or no Fraud: sounds simple, right? by Sophie van den Berg

The surge in online payments has brought a surge in fraudsters looking to exploit the system. To combat this, we're leveraging machine learning (ML) models to identify and block fraudulent transactions. While this may seem like a straightforward supervised learning task, there's a key challenge: how do we confirm if a blocked transaction was truly fraudulent? This talk delves into counterfactual evaluation and other obstacles encountered when building an ML model for fraud detection at Adyen.
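
One generic way to approach the labelling problem described above is to let a small random share of would-be-blocked transactions through, observe their outcomes, and reweight them by the inverse of the release probability. The sketch below illustrates that idea only; it is an assumption about how counterfactual evaluation could work in general, not a description of the method presented in the talk, and all names and numbers are made up.

```python
def counterfactual_metrics(released_outcomes, release_prob, fraud_among_allowed):
    """Estimate precision and recall of a block rule.

    released_outcomes: list of bools, True if a randomly released
        (would-be-blocked) transaction later proved fraudulent.
    release_prob: probability with which a flagged transaction is released.
    fraud_among_allowed: confirmed fraud count among unflagged transactions.
    """
    weight = 1.0 / release_prob  # each released case stands in for this many flagged ones
    est_flagged = weight * len(released_outcomes)
    est_fraud_flagged = weight * sum(released_outcomes)
    precision = est_fraud_flagged / est_flagged
    recall = est_fraud_flagged / (est_fraud_flagged + fraud_among_allowed)
    return precision, recall

# Made-up example: 5% of flagged transactions are released; 16 of 20 were fraud.
released_outcomes = [True] * 16 + [False] * 4
precision, recall = counterfactual_metrics(released_outcomes, 0.05, 80)
print(round(precision, 3), round(recall, 3))
```

The released sample is small by design, so the estimates are noisy; a lower release probability costs less in fraud losses but widens the uncertainty.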

[Talk 2]: Building GenAI and ML systems with OSS Metaflow by Hugo Bowne-Anderson

This talk explores a framework for how data scientists can deliver value with Generative AI: How can you embed LLMs and foundation models into your pre-existing software stack? How can you do so using Open Source Python? What changes about the production machine learning stack and what remains the same?

We motivate the concepts through generative AI examples in domains such as text-to-image (Stable Diffusion) and speech-to-text (Whisper) applications. Moreover, we’ll demonstrate how workflow orchestration provides a common scaffolding to ensure that your Generative AI and classical Machine Learning workflows alike are robust and ready to move safely into production systems.

This talk is aimed squarely at (data) scientists and ML engineers who want to focus on the science, data, and modeling, but want to be able to access all their infrastructural, platform, and software needs with ease!

Combating online payment fraud & putting LLMs in open-source production systems

Activities & events