talk-data.com

Activities & events

Session #17: Google AI Seminar (Virtual)

Important: Register on the event website to receive the joining link. (RSVPing on Meetup will NOT get you the joining link.)

Description: Welcome to the weekly AI virtual seminars, in collaboration with Google. Join us for deep-dive tech talks on AI/ML/Data, hands-on code labs, workshops, and networking with speakers and fellow developers from all over the world.

May 8: AI Seminar (Virtual S17): Google Gemini and Vertex AI

More upcoming sessions:

Local and Global AI Community on Discord — join us on Discord for the local and global AI tech community:

  • Events chat: chat and connect with speakers and with global and local attendees
  • Learning AI: events, learning materials, study groups
  • Startups: innovation, project collaborations, founders/co-founders
  • Jobs and Careers: job openings, resume posting, hiring managers


This is a virtual event.

Register for the Zoom

Towards a Multimodal AI Agent that Can See, Talk and Act

The development of multimodal AI agents marks a pivotal step toward creating systems capable of understanding, reasoning, and interacting with the world in human-like ways. Building such agents requires models that not only comprehend multi-sensory observations but also act adaptively to achieve goals within their environments. In this talk, I will present my research journey toward this grand goal across three key dimensions.

First, I will explore how to bridge the gap between core vision understanding and multimodal learning through unified frameworks at various granularities. Next, I will discuss connecting vision-language models with large language models (LLMs) to create intelligent conversational systems. Finally, I will delve into recent advancements that extend multimodal LLMs into vision-language-action models, forming the foundation for general-purpose robotics policies. To conclude, I will highlight ongoing efforts to develop agentic systems that integrate perception with action, enabling them to not only understand observations but also take meaningful actions in a single system.

Together, these lead to an aspiration of building the next generation of multimodal AI agents capable of seeing, talking, and acting across diverse scenarios in both digital and physical worlds.

About the Speaker

Jianwei Yang is a Principal Researcher at Microsoft Research (MSR), Redmond. His research focuses on the intersection of vision and multimodal learning, with an emphasis on bridging core vision tasks with language, building general-purpose and promptable multimodal models, and enabling these models to take meaningful actions in both virtual and physical environments.

ConceptAttention: Interpreting the Representations of Diffusion Transformers

Recently, diffusion transformers have taken over as the state-of-the-art model class for both image and video generation. However, similar to many existing deep learning architectures, their high-dimensional hidden representations are difficult to understand and interpret. This lack of interpretability is a barrier to their controllability and safe deployment.

We introduce ConceptAttention, an approach to interpreting the representations of diffusion transformers. Our method allows users to create rich saliency maps depicting the location and intensity of textual concepts. Our approach exposes how a diffusion model “sees” a generated image and notably requires no additional training. ConceptAttention improves upon widely used approaches like cross-attention maps for isolating the location of visual concepts, and it even generalizes to real-world (not just generated) images and to video generation models!

Our work serves to improve the community’s understanding of how diffusion models represent data and has numerous potential applications, like image editing.
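To give a flavor of the idea described above — a saliency map relating a textual concept to image patches — here is a toy NumPy sketch. The shapes, names, and scoring rule are illustrative assumptions, not the ConceptAttention implementation:

```python
import numpy as np

def concept_saliency(patch_embs, concept_emb):
    """Toy concept saliency: scaled dot-product scores between one
    concept embedding and a grid of image-patch embeddings,
    softmax-normalized over patches. Hypothetical shapes only."""
    scores = patch_embs @ concept_emb              # (num_patches,)
    scores = scores / np.sqrt(patch_embs.shape[1]) # scale by sqrt(dim)
    weights = np.exp(scores - scores.max())        # stable softmax
    weights /= weights.sum()
    side = int(np.sqrt(patch_embs.shape[0]))       # assume square grid
    return weights.reshape(side, side)             # spatial saliency map

rng = np.random.default_rng(0)
patches = rng.normal(size=(64, 16))  # 8x8 grid of 16-dim patch embeddings
concept = rng.normal(size=16)        # embedding of one textual concept
sal = concept_saliency(patches, concept)
```

The resulting `sal` array is a non-negative 8x8 grid that sums to 1, i.e., a distribution over image locations for the concept.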

About the Speaker

Alec Helbling is a PhD student at Georgia Tech. His research focuses on improving the interpretability and controllability of generative models, particularly for image generation. His research is application-focused, and he has interned at a variety of industrial research labs, including Adobe Firefly, IBM Research, and the NASA Jet Propulsion Laboratory. He also has a passion for creating explanatory videos about interesting machine learning and mathematical concepts.

RelationField: Relate Anything in Radiance Fields

Neural radiance fields recently emerged as a 3D scene representation extended by distilling open-vocabulary features from vision-language models. Current methods focus on object-centric tasks, leaving semantic relationships largely unexplored. We propose RelationField, the first method extracting inter-object relationships directly from neural radiance fields using pairs of rays for implicit relationship queries. RelationField distills relationship knowledge from multi-modal LLMs. Evaluated on open-vocabulary 3D scene graph generation and relationship-guided instance segmentation, RelationField achieves state-of-the-art performance.

About the Speaker

Sebastian Koch is a PhD student at Ulm University and the Bosch Center for Artificial Intelligence, supervised by Timo Ropinski of Ulm University. His main research interest lies at the intersection of computer vision and robotics. The goal of his PhD is to develop 3D scene representations of the real world that are valuable for robots navigating and solving tasks within their environment.

RGB-X Model Development: Exploring Four Channel ML Workflows

Machine learning is rapidly becoming multimodal. With many computer vision models expanding into areas like video and 3D, one area that has also been advancing quietly but rapidly is RGB-X data, such as infrared, depth, or normals. In this talk, we will cover some of the leading models in this exploding field of Visual AI and share best practices for working with these complex data formats!

About the Speaker

Daniel Gural is a seasoned Machine Learning Evangelist with a strong passion for empowering Data Scientists and ML Engineers to unlock the full potential of their data. Currently serving as a valuable member of Voxel51, he takes a leading role in efforts to bridge the gap between practitioners and the necessary tools, enabling them to achieve exceptional outcomes. Daniel’s extensive experience in teaching and developing within the ML field has fueled his commitment to democratizing high-quality AI workflows for a wider audience.

April 24, 2025 - AI, Machine Learning and Computer Vision Meetup

This is a virtual event.

Register for the Zoom

Towards a Multimodal AI Agent that Can See, Talk and Act

The development of multimodal AI agents marks a pivotal step toward creating systems capable of understanding, reasoning, and interacting with the world in human-like ways. Building such agents requires models that not only comprehend multi-sensory observations but also act adaptively to achieve goals within their environments. In this talk, I will present my research journey toward this grand goal across three key dimensions.

First, I will explore how to bridge the gap between core vision understanding and multimodal learning through unified frameworks at various granularities. Next, I will discuss connecting vision-language models with large language models (LLMs) to create intelligent conversational systems. Finally, I will delve into recent advancements that extend multimodal LLMs into vision-language-action models, forming the foundation for general-purpose robotics policies. To conclude, I will highlight ongoing efforts to develop agentic systems that integrate perception with action, enabling them to not only understand observations but also take meaningful actions in a single system.

Together, these lead to an aspiration of building the next generation of multimodal AI agents capable of seeing, talking, and acting across diverse scenarios in both digital and physical worlds.

About the Speaker

Jianwei Yang is a Principal Researcher at Microsoft Research (MSR), Redmond. His research focuses on the intersection of vision and multimodal learning, with an emphasis on bridging core vision tasks with language, building general-purpose and promptable multimodal models, and enabling these models to take meaningful actions in both virtual and physical environments.

ConceptAttention: Interpreting the Representations of Diffusion Transformers

Recently, diffusion transformers have taken over as the state-of-the-art model class for both image and video generation. However, similar to many existing deep learning architectures, their high-dimensional hidden representations are difficult to understand and interpret. This lack of interpretability is a barrier to their controllability and safe deployment.

We introduce ConceptAttention, an approach to interpreting the representations of diffusion transformers. Our method allows users to create rich saliency maps depicting the location and intensity of textual concepts. Our approach exposes how a diffusion model “sees” a generated image and notably requires no additional training. ConceptAttention improves upon widely used approaches like cross attention maps for isolating the location of visual concepts and even generalizes to real world (not just generated) images and video generation models!

Our work serves to improve the community’s understanding of how diffusion models represent data and has numerous potential applications, like image editing.

About the Speaker

Alec Helbling is a PhD student at Georgia Tech. His research focuses on improving the interpretability and controllability of generative models, particularly for image generation. His research is more application focused, and he has have interned at a variety of industrial research labs like Adobe Firefly, IBM Research, and NASA Jet Propulsion Lab. He also has a passion for creating explanatory videos of interesting machine learning and mathematical concepts.

RelationField: Relate Anything in Radiance Fields

Neural radiance fields recently emerged as a 3D scene representation extended by distilling open-vocabulary features from vision-language models. Current methods focus on object-centric tasks, leaving semantic relationships largely unexplored. We propose RelationField, the first method extracting inter-object relationships directly from neural radiance fields using pairs of rays for implicit relationship queries. RelationField distills relationship knowledge from multi-modal LLMs. Evaluated on open-vocabulary 3D scene graph generation and relationship-guided instance segmentation, RelationField achieves state-of-the-art performance.

About the Speaker

Sebastian Koch is a PhD student at Ulm University and Bosch Center for Artificial Intelligence. He is supervised by Timo Ropinski from Ulm University. His main research interest lies at the intersection of computer vision and robotics. The goal of his PhD is to develop 3D scene representations of the real world that are valuable for robots to navigate and solve tasks within their environment

RGB-X Model Development: Exploring Four Channel ML Workflows

Machine Learning is rapidly becoming multimodal. With many models in Computer Vision expanding to areas like vision and 3D, one area that has also quietly been advancing rapidly is RGB-X data, such as infrared, depth, or normals. In this talk we will cover some of the leading models in this exploding field of Visual AI and show some best practices on how to work with these complex data formats!

About the Speaker

Daniel Gural is a seasoned Machine Learning Evangelist with a strong passion for empowering Data Scientists and ML Engineers to unlock the full potential of their data. Currently serving as a valuable member of Voxel51, he takes a leading role in efforts to bridge the gap between practitioners and the necessary tools, enabling them to achieve exceptional outcomes. Daniel’s extensive experience in teaching and developing within the ML field has fueled his commitment to democratizing high-quality AI workflows for a wider audience.

April 24, 2025 - AI, Machine Learning and Computer Vision Meetup

This is a virtual event.

Register for the Zoom

Towards a Multimodal AI Agent that Can See, Talk and Act

The development of multimodal AI agents marks a pivotal step toward creating systems capable of understanding, reasoning, and interacting with the world in human-like ways. Building such agents requires models that not only comprehend multi-sensory observations but also act adaptively to achieve goals within their environments. In this talk, I will present my research journey toward this grand goal across three key dimensions.

First, I will explore how to bridge the gap between core vision understanding and multimodal learning through unified frameworks at various granularities. Next, I will discuss connecting vision-language models with large language models (LLMs) to create intelligent conversational systems. Finally, I will delve into recent advancements that extend multimodal LLMs into vision-language-action models, forming the foundation for general-purpose robotics policies. To conclude, I will highlight ongoing efforts to develop agentic systems that integrate perception with action, enabling them to not only understand observations but also take meaningful actions in a single system.

Together, these lead to an aspiration of building the next generation of multimodal AI agents capable of seeing, talking, and acting across diverse scenarios in both digital and physical worlds.

About the Speaker

Jianwei Yang is a Principal Researcher at Microsoft Research (MSR), Redmond. His research focuses on the intersection of vision and multimodal learning, with an emphasis on bridging core vision tasks with language, building general-purpose and promptable multimodal models, and enabling these models to take meaningful actions in both virtual and physical environments.

ConceptAttention: Interpreting the Representations of Diffusion Transformers

Recently, diffusion transformers have taken over as the state-of-the-art model class for both image and video generation. However, similar to many existing deep learning architectures, their high-dimensional hidden representations are difficult to understand and interpret. This lack of interpretability is a barrier to their controllability and safe deployment.

We introduce ConceptAttention, an approach to interpreting the representations of diffusion transformers. Our method allows users to create rich saliency maps depicting the location and intensity of textual concepts. Our approach exposes how a diffusion model “sees” a generated image and notably requires no additional training. ConceptAttention improves upon widely used approaches like cross attention maps for isolating the location of visual concepts and even generalizes to real world (not just generated) images and video generation models!

Our work serves to improve the community’s understanding of how diffusion models represent data and has numerous potential applications, like image editing.

About the Speaker

Alec Helbling is a PhD student at Georgia Tech. His research focuses on improving the interpretability and controllability of generative models, particularly for image generation. His research is more application focused, and he has have interned at a variety of industrial research labs like Adobe Firefly, IBM Research, and NASA Jet Propulsion Lab. He also has a passion for creating explanatory videos of interesting machine learning and mathematical concepts.

RelationField: Relate Anything in Radiance Fields

Neural radiance fields recently emerged as a 3D scene representation extended by distilling open-vocabulary features from vision-language models. Current methods focus on object-centric tasks, leaving semantic relationships largely unexplored. We propose RelationField, the first method extracting inter-object relationships directly from neural radiance fields using pairs of rays for implicit relationship queries. RelationField distills relationship knowledge from multi-modal LLMs. Evaluated on open-vocabulary 3D scene graph generation and relationship-guided instance segmentation, RelationField achieves state-of-the-art performance.

About the Speaker

Sebastian Koch is a PhD student at Ulm University and Bosch Center for Artificial Intelligence. He is supervised by Timo Ropinski from Ulm University. His main research interest lies at the intersection of computer vision and robotics. The goal of his PhD is to develop 3D scene representations of the real world that are valuable for robots to navigate and solve tasks within their environment.

RGB-X Model Development: Exploring Four Channel ML Workflows

Machine Learning is rapidly becoming multimodal. With many models in Computer Vision expanding to areas like vision and 3D, one area that has also quietly been advancing rapidly is RGB-X data, such as infrared, depth, or normals. In this talk we will cover some of the leading models in this exploding field of Visual AI and show some best practices on how to work with these complex data formats!

About the Speaker

Daniel Gural is a seasoned Machine Learning Evangelist with a strong passion for empowering Data Scientists and ML Engineers to unlock the full potential of their data. Currently serving as a valuable member of Voxel51, he takes a leading role in efforts to bridge the gap between practitioners and the necessary tools, enabling them to achieve exceptional outcomes. Daniel’s extensive experience in teaching and developing within the ML field has fueled his commitment to democratizing high-quality AI workflows for a wider audience.

April 24, 2025 - AI, Machine Learning and Computer Vision Meetup

This is a virtual event.

Register for the Zoom

Towards a Multimodal AI Agent that Can See, Talk and Act

The development of multimodal AI agents marks a pivotal step toward creating systems capable of understanding, reasoning, and interacting with the world in human-like ways. Building such agents requires models that not only comprehend multi-sensory observations but also act adaptively to achieve goals within their environments. In this talk, I will present my research journey toward this grand goal across three key dimensions.

First, I will explore how to bridge the gap between core vision understanding and multimodal learning through unified frameworks at various granularities. Next, I will discuss connecting vision-language models with large language models (LLMs) to create intelligent conversational systems. Finally, I will delve into recent advancements that extend multimodal LLMs into vision-language-action models, forming the foundation for general-purpose robotics policies. To conclude, I will highlight ongoing efforts to develop agentic systems that integrate perception with action, enabling them to not only understand observations but also take meaningful actions in a single system.

Together, these lead to an aspiration of building the next generation of multimodal AI agents capable of seeing, talking, and acting across diverse scenarios in both digital and physical worlds.

About the Speaker

Jianwei Yang is a Principal Researcher at Microsoft Research (MSR), Redmond. His research focuses on the intersection of vision and multimodal learning, with an emphasis on bridging core vision tasks with language, building general-purpose and promptable multimodal models, and enabling these models to take meaningful actions in both virtual and physical environments.

ConceptAttention: Interpreting the Representations of Diffusion Transformers

Recently, diffusion transformers have taken over as the state-of-the-art model class for both image and video generation. However, similar to many existing deep learning architectures, their high-dimensional hidden representations are difficult to understand and interpret. This lack of interpretability is a barrier to their controllability and safe deployment.

We introduce ConceptAttention, an approach to interpreting the representations of diffusion transformers. Our method allows users to create rich saliency maps depicting the location and intensity of textual concepts. Our approach exposes how a diffusion model “sees” a generated image and notably requires no additional training. ConceptAttention improves upon widely used approaches like cross attention maps for isolating the location of visual concepts and even generalizes to real world (not just generated) images and video generation models!

Our work serves to improve the community’s understanding of how diffusion models represent data and has numerous potential applications, like image editing.

About the Speaker

Alec Helbling is a PhD student at Georgia Tech. His research focuses on improving the interpretability and controllability of generative models, particularly for image generation. His research is more application focused, and he has have interned at a variety of industrial research labs like Adobe Firefly, IBM Research, and NASA Jet Propulsion Lab. He also has a passion for creating explanatory videos of interesting machine learning and mathematical concepts.

RelationField: Relate Anything in Radiance Fields

Neural radiance fields recently emerged as a 3D scene representation extended by distilling open-vocabulary features from vision-language models. Current methods focus on object-centric tasks, leaving semantic relationships largely unexplored. We propose RelationField, the first method extracting inter-object relationships directly from neural radiance fields using pairs of rays for implicit relationship queries. RelationField distills relationship knowledge from multi-modal LLMs. Evaluated on open-vocabulary 3D scene graph generation and relationship-guided instance segmentation, RelationField achieves state-of-the-art performance.

About the Speaker

Sebastian Koch is a PhD student at Ulm University and Bosch Center for Artificial Intelligence. He is supervised by Timo Ropinski from Ulm University. His main research interest lies at the intersection of computer vision and robotics. The goal of his PhD is to develop 3D scene representations of the real world that are valuable for robots to navigate and solve tasks within their environment

RGB-X Model Development: Exploring Four Channel ML Workflows

Machine Learning is rapidly becoming multimodal. With many models in Computer Vision expanding to areas like vision and 3D, one area that has also quietly been advancing rapidly is RGB-X data, such as infrared, depth, or normals. In this talk we will cover some of the leading models in this exploding field of Visual AI and show some best practices on how to work with these complex data formats!

About the Speaker

Daniel Gural is a seasoned Machine Learning Evangelist with a strong passion for empowering Data Scientists and ML Engineers to unlock the full potential of their data. Currently serving as a valuable member of Voxel51, he takes a leading role in efforts to bridge the gap between practitioners and the necessary tools, enabling them to achieve exceptional outcomes. Daniel’s extensive experience in teaching and developing within the ML field has fueled his commitment to democratizing high-quality AI workflows for a wider audience.

April 24, 2025 - AI, Machine Learning and Computer Vision Meetup

This is a virtual event.

Register for the Zoom

Towards a Multimodal AI Agent that Can See, Talk and Act

The development of multimodal AI agents marks a pivotal step toward creating systems capable of understanding, reasoning, and interacting with the world in human-like ways. Building such agents requires models that not only comprehend multi-sensory observations but also act adaptively to achieve goals within their environments. In this talk, I will present my research journey toward this grand goal across three key dimensions.

First, I will explore how to bridge the gap between core vision understanding and multimodal learning through unified frameworks at various granularities. Next, I will discuss connecting vision-language models with large language models (LLMs) to create intelligent conversational systems. Finally, I will delve into recent advancements that extend multimodal LLMs into vision-language-action models, forming the foundation for general-purpose robotics policies. To conclude, I will highlight ongoing efforts to develop agentic systems that integrate perception with action, enabling them to not only understand observations but also take meaningful actions in a single system.

Together, these lead to an aspiration of building the next generation of multimodal AI agents capable of seeing, talking, and acting across diverse scenarios in both digital and physical worlds.

About the Speaker

Jianwei Yang is a Principal Researcher at Microsoft Research (MSR), Redmond. His research focuses on the intersection of vision and multimodal learning, with an emphasis on bridging core vision tasks with language, building general-purpose and promptable multimodal models, and enabling these models to take meaningful actions in both virtual and physical environments.

ConceptAttention: Interpreting the Representations of Diffusion Transformers

Recently, diffusion transformers have taken over as the state-of-the-art model class for both image and video generation. However, similar to many existing deep learning architectures, their high-dimensional hidden representations are difficult to understand and interpret. This lack of interpretability is a barrier to their controllability and safe deployment.

We introduce ConceptAttention, an approach to interpreting the representations of diffusion transformers. Our method allows users to create rich saliency maps depicting the location and intensity of textual concepts. Our approach exposes how a diffusion model “sees” a generated image and notably requires no additional training. ConceptAttention improves upon widely used approaches like cross attention maps for isolating the location of visual concepts and even generalizes to real world (not just generated) images and video generation models!

Our work serves to improve the community’s understanding of how diffusion models represent data and has numerous potential applications, like image editing.

About the Speaker

Alec Helbling is a PhD student at Georgia Tech. His research focuses on improving the interpretability and controllability of generative models, particularly for image generation. His research is more application focused, and he has have interned at a variety of industrial research labs like Adobe Firefly, IBM Research, and NASA Jet Propulsion Lab. He also has a passion for creating explanatory videos of interesting machine learning and mathematical concepts.

RelationField: Relate Anything in Radiance Fields

Neural radiance fields recently emerged as a 3D scene representation extended by distilling open-vocabulary features from vision-language models. Current methods focus on object-centric tasks, leaving semantic relationships largely unexplored. We propose RelationField, the first method extracting inter-object relationships directly from neural radiance fields using pairs of rays for implicit relationship queries. RelationField distills relationship knowledge from multi-modal LLMs. Evaluated on open-vocabulary 3D scene graph generation and relationship-guided instance segmentation, RelationField achieves state-of-the-art performance.

About the Speaker

Sebastian Koch is a PhD student at Ulm University and Bosch Center for Artificial Intelligence. He is supervised by Timo Ropinski from Ulm University. His main research interest lies at the intersection of computer vision and robotics. The goal of his PhD is to develop 3D scene representations of the real world that are valuable for robots to navigate and solve tasks within their environment

RGB-X Model Development: Exploring Four Channel ML Workflows

Machine Learning is rapidly becoming multimodal. With many models in Computer Vision expanding to areas like vision and 3D, one area that has also quietly been advancing rapidly is RGB-X data, such as infrared, depth, or normals. In this talk we will cover some of the leading models in this exploding field of Visual AI and show some best practices on how to work with these complex data formats!

About the Speaker

Daniel Gural is a seasoned Machine Learning Evangelist with a strong passion for empowering Data Scientists and ML Engineers to unlock the full potential of their data. Currently serving as a valuable member of Voxel51, he takes a leading role in efforts to bridge the gap between practitioners and the necessary tools, enabling them to achieve exceptional outcomes. Daniel’s extensive experience in teaching and developing within the ML field has fueled his commitment to democratizing high-quality AI workflows for a wider audience.

April 24, 2025 - AI, Machine Learning and Computer Vision Meetup

This is a virtual event.

Register for the Zoom

Towards a Multimodal AI Agent that Can See, Talk and Act

The development of multimodal AI agents marks a pivotal step toward creating systems capable of understanding, reasoning, and interacting with the world in human-like ways. Building such agents requires models that not only comprehend multi-sensory observations but also act adaptively to achieve goals within their environments. In this talk, I will present my research journey toward this grand goal across three key dimensions.

First, I will explore how to bridge the gap between core vision understanding and multimodal learning through unified frameworks at various granularities. Next, I will discuss connecting vision-language models with large language models (LLMs) to create intelligent conversational systems. Finally, I will delve into recent advancements that extend multimodal LLMs into vision-language-action models, forming the foundation for general-purpose robotics policies. To conclude, I will highlight ongoing efforts to develop agentic systems that integrate perception with action, enabling them to not only understand observations but also take meaningful actions in a single system.

Together, these lead to an aspiration of building the next generation of multimodal AI agents capable of seeing, talking, and acting across diverse scenarios in both digital and physical worlds.

About the Speaker

Jianwei Yang is a Principal Researcher at Microsoft Research (MSR), Redmond. His research focuses on the intersection of vision and multimodal learning, with an emphasis on bridging core vision tasks with language, building general-purpose and promptable multimodal models, and enabling these models to take meaningful actions in both virtual and physical environments.

ConceptAttention: Interpreting the Representations of Diffusion Transformers

Recently, diffusion transformers have taken over as the state-of-the-art model class for both image and video generation. However, similar to many existing deep learning architectures, their high-dimensional hidden representations are difficult to understand and interpret. This lack of interpretability is a barrier to their controllability and safe deployment.

We introduce ConceptAttention, an approach to interpreting the representations of diffusion transformers. Our method lets users create rich saliency maps depicting the location and intensity of textual concepts. The approach exposes how a diffusion model “sees” a generated image and, notably, requires no additional training. ConceptAttention improves upon widely used approaches such as cross-attention maps for isolating the location of visual concepts, and it even generalizes to real-world (not just generated) images and to video generation models!

Our work serves to improve the community’s understanding of how diffusion models represent data and has numerous potential applications, like image editing.
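The abstract does not spell out ConceptAttention's actual mechanism, but the general idea of concept saliency maps can be illustrated with a toy sketch: project image-patch features onto embeddings of textual concepts and normalize across concepts, so each spatial location distributes its "attention" over the concept vocabulary. All shapes and names below are hypothetical, not the paper's method.

```python
import numpy as np

def concept_saliency(patch_feats, concept_embeds, h, w):
    """Toy concept-saliency sketch (not the paper's algorithm).

    patch_feats:    (num_patches, d) hidden features for image patches
    concept_embeds: (num_concepts, d) embeddings of textual concepts
    Returns: (num_concepts, h, w) saliency maps, one per concept.
    """
    # Similarity between every patch and every concept.
    sims = patch_feats @ concept_embeds.T              # (P, C)
    # Softmax across concepts: each patch distributes mass over concepts.
    e = np.exp(sims - sims.max(axis=1, keepdims=True))
    attn = e / e.sum(axis=1, keepdims=True)            # (P, C)
    # Reshape each concept's column back onto the spatial grid.
    return attn.T.reshape(-1, h, w)                    # (C, h, w)

rng = np.random.default_rng(0)
maps = concept_saliency(rng.normal(size=(64, 16)),
                        rng.normal(size=(3, 16)), 8, 8)
print(maps.shape)  # (3, 8, 8)
```

Each of the three maps can then be overlaid on the image to show where that concept is most strongly expressed.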

About the Speaker

Alec Helbling is a PhD student at Georgia Tech. His research focuses on improving the interpretability and controllability of generative models, particularly for image generation. His work is application-focused, and he has interned at several industrial research labs, including Adobe Firefly, IBM Research, and the NASA Jet Propulsion Laboratory. He also has a passion for creating explanatory videos about machine learning and mathematical concepts.

RelationField: Relate Anything in Radiance Fields

Neural radiance fields recently emerged as a 3D scene representation extended by distilling open-vocabulary features from vision-language models. Current methods focus on object-centric tasks, leaving semantic relationships largely unexplored. We propose RelationField, the first method extracting inter-object relationships directly from neural radiance fields using pairs of rays for implicit relationship queries. RelationField distills relationship knowledge from multi-modal LLMs. Evaluated on open-vocabulary 3D scene graph generation and relationship-guided instance segmentation, RelationField achieves state-of-the-art performance.

About the Speaker

Sebastian Koch is a PhD student at Ulm University and the Bosch Center for Artificial Intelligence, supervised by Timo Ropinski at Ulm University. His main research interest lies at the intersection of computer vision and robotics. The goal of his PhD is to develop 3D scene representations of the real world that help robots navigate and solve tasks within their environment.

RGB-X Model Development: Exploring Four Channel ML Workflows

Machine learning is rapidly becoming multimodal. While many computer vision models are expanding into areas like video and 3D, one modality that has been quietly advancing just as fast is RGB-X data, where the extra channel carries information such as infrared, depth, or surface normals. In this talk we will cover some of the leading models in this fast-growing corner of visual AI and share best practices for working with these four-channel data formats!
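The basic mechanics of a four-channel workflow can be sketched with NumPy: align an extra modality (here a hypothetical depth map) with the RGB frame, normalize both to a common scale, and stack them into a single RGB-X tensor. The shapes and normalization below are illustrative assumptions, not a prescription from the talk.

```python
import numpy as np

# Hypothetical RGB frame and an aligned depth map of the same spatial size.
rgb = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
depth = np.random.rand(480, 640).astype(np.float32)  # e.g. metres

# Normalize both to [0, 1] floats before stacking, so the fourth
# channel lives on the same scale as the colour channels.
rgb_f = rgb.astype(np.float32) / 255.0
depth_f = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)

# Channels-last RGB-D tensor, ready for a 4-input-channel model.
rgbd = np.concatenate([rgb_f, depth_f[..., None]], axis=-1)
print(rgbd.shape)  # (480, 640, 4)
```

The same pattern applies to infrared or normals; the main practical pitfalls are spatial misalignment between modalities and mismatched value ranges.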

About the Speaker

Daniel Gural is a seasoned Machine Learning Evangelist with a strong passion for empowering data scientists and ML engineers to unlock the full potential of their data. At Voxel51, he leads efforts to bridge the gap between practitioners and the tools they need to achieve exceptional outcomes. Daniel’s extensive experience teaching and developing in the ML field has fueled his commitment to democratizing high-quality AI workflows for a wider audience.

April 24, 2025 - AI, Machine Learning and Computer Vision Meetup

Boris Toledano – COO and Co-founder @ Linkup

In this session, we'll discuss the next-generation search infrastructure that gives AI agents seamless access to web information and hard-to-find intelligence. Traditional methods can't handle these new workflows, and legacy search engines, designed for human attention, aren't built for these emerging AI use cases. We will address: a) the power of web search for LLM-based applications; b) the need to avoid scraping legacy search engines; c) how we're building a new category of "searcher" models; and d) what you can power with a web retrieval engine, including demos.

genai llms web retrieval ai agents search infrastructure

Agents are powerful—but without feedback, they're flying blind. In this talk, we’ll walk through how to build self-improving agents by closing the loop with evaluation, experimentation, tracing, and prompt optimization. You’ll learn how to capture the right telemetry, run meaningful tests, and apply insights in a way that actually improves performance over time. Whether you’re building copilots, chatbots, or autonomous workflows, this session will give you the practical tools and architecture patterns you need to make your agents smarter—automatically.

agents evaluation telemetry prompt optimization autonomous workflows
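The feedback loop the abstract describes (run, evaluate, trace, optimize) can be sketched as a toy closed loop. Everything here is a hypothetical stand-in: `agent` mimics an LLM call, `evaluate` is a trivial scorer, and the "optimization" simply keeps the best-scoring prompt variant while logging traces.

```python
def agent(prompt, task):
    # Stand-in for an LLM call: just concatenates prompt and task.
    return f"{prompt}:{task}"

def evaluate(output, expected):
    # Stand-in evaluator: 1.0 if the expected answer appears in the output.
    return 1.0 if expected in output else 0.0

def optimize_prompt(candidates, tasks):
    """Score each candidate prompt over a task set, log traces,
    and return the best-scoring (prompt, score) pair."""
    traces, best = [], None
    for prompt in candidates:
        scores = [evaluate(agent(prompt, t), exp) for t, exp in tasks]
        avg = sum(scores) / len(scores)
        traces.append({"prompt": prompt, "score": avg})  # telemetry
        if best is None or avg > best[1]:
            best = (prompt, avg)
    return best, traces

tasks = [("2+2", "4"), ("capital of France", "Paris")]
(best_prompt, score), traces = optimize_prompt(
    ["answer: 4 Paris", "answer briefly"], tasks)
print(best_prompt, score)  # answer: 4 Paris 1.0
```

In a real system the evaluator would be an LLM judge or task-specific metric, the traces would go to an observability backend, and the prompt updates would come from an optimizer rather than exhaustive search; the loop structure is the point.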

**Important**: Due to room capacity and building security, you must register on the event website for admission.

Description: Welcome to the GenAI meetup in New York City. Join us for deep dive tech talks on AI, GenAI, LLMs and machine learning, food/drink, networking with speakers and fellow developers.

Agenda:
  • 5:30pm–6:00pm: Check-in, food, and networking
  • 6:00pm–6:10pm: Welcome/community update
  • 6:10pm–8:00pm: Tech talks
  • 8:00pm: Q&A, open discussion

Tech Talk: Building a state-of-the-art AI web researcher. Speaker: Boris Toledano (COO and Co-founder of Linkup).

Tech Talk: Building a Self-Improving Agent. Speaker: John Gilhuly (Arize AI).

Speakers and Topics: Stay tuned as we are updating speakers and schedules. If you have a keen interest in speaking to our community, we invite you to submit topics for consideration: Submit Topics

Sponsors: We are actively seeking sponsors to support our community, whether by offering venue space, providing food and drinks, or contributing cash sponsorship. Sponsors not only speak at the meetups and receive prominent recognition, but also gain exposure to our extensive membership base of 20,000+ AI developers in New York and 500K+ worldwide.

Local and Global AI Community on Discord Join us on discord for local and global AI tech community:

  • Events chat: chat and connect with speakers and global and local attendees;
  • Learning AI: events, learning materials, study groups;
  • Startups: innovation, projects collaborations, founders/co-founders;
  • Jobs and Careers: job openings, post resumes, hiring managers.
AI Meetup (April): Agentic AI

Global AI Bootcamp Berlin 2025

For Community, By the Community

Join us for Global AI Bootcamp Berlin 2025, a dedicated space for Tech and AI enthusiasts, developers, and professionals to learn, share, and collaborate. This event is part of a global initiative bringing together AI experts and learners to explore the latest innovations, best practices, and real-world applications of Artificial Intelligence on Microsoft Azure.


Event Details

  • Date: April 11, 2025
  • Time: 13:00 – 20:30 CET
  • Location: CODE University of Applied Sciences
  • Topics: Azure AI, Copilots, AI Upskilling, Microsoft 365, AI Ethics

Pre-recorded Keynote Session

Hear from Scott Hanselman, Guido van Rossum, Jennifer Marsman, and Sarah Bird as they discuss AI’s impact on development, Python’s role, and the importance of ethical AI.

In-person Closing Keynote

Hear from Christian Heilmann, VP of DevRel at WeAreDevelopers, as he explores “Vibe coding, creativity, craft and professionalism – are we making ourselves redundant?” – a thought-provoking session on the evolving role of developers in the age of AI.


Speakers & Sessions at Global AI Bootcamp Berlin 2025

  • Welcome & Announcements – Presented by ignore gravity, CODE University, and the Global AI Community. Reimar Müller-Thum – CEO at CODE University \| Peter Ruppel – President at CODE University \| Susanne Scheerer – Partner at ignore gravity \| Zaid Zaim – XR/AI at ignore gravity \| Microsoft MVP \| Global AI Berlin Chapter Lead
  • Tanja Wiehoff – Build Your Own Intranet Agents in Minutes \| M365 Consultant at Communardo Software \| Microsoft MVP
  • René Schulte – The Next Frontier with Multimodal, Agentic & Embodied Intelligence \| Head of 3D and AI Practices at Reply \| Microsoft MVP / RD
  • Matthias Buchhorn-Roth & Alexander Preis – Custom AI Agents, No-Code Workflow, Conversational Interfaces \| Founders at PX4.ai
  • Marc Plogas – Semantic Kernel Multi-Agent Scenarios
  • Christian Weyer – Semantic AI: Language & Embedding Models Hand-in-Hand \| CTO & Co-founder at Thinktecture \| Microsoft MVP / RD
  • Marius Marx – Cyber Security in Enterprise & Critical Infrastructure \| Co-founder & CPO at Alaris Security
  • Daniela Burgos – From Rock Bottom to Startup Founder: My AI-Powered Comeback \| Nerea Co-founder and Grace Accelerator participant
  • Nicole Enders – HR Companion: Build a Bot with Azure AI Foundry & Microsoft Teams \| Managing Consultant at CONET Solutions \| Microsoft MVP
  • Ragnar Heil – 5 Different Shades of Copilot Agents: From SharePoint to Azure AI Search \| Business Development at HanseVision \| Microsoft MVP
  • Torben Köhler & Jona Schwarz – Metaphrase: Context-Aware i18n Automation for Modern Dev Pipelines \| Founders at Metaphrase
  • Sebastian Rosengrün & Dr. David Rump – Compliance Meets Innovation: Navigating the AI Act \| Founders at AI Impact Lab
  • Michael Greth – MikeOnAI: Live on Site with MVP Michael Greth

What to Expect?

  • Tech Talks & Workshops – Learn from AI experts about the latest trends, challenges, and advancements in AI, including Copilots, Azure AI, and AI-powered applications.
  • Hands-On Demos – Experience real-world AI use cases, explore AI-driven tools, and experiment with Azure AI services.
  • Networking & Collaboration – Connect with fellow AI enthusiasts, developers, and industry professionals to exchange ideas, share insights, and collaborate on AI projects.
  • Speaker Sessions & Panels – Gain insights from experienced AI practitioners sharing their knowledge, experiences, and best practices in AI-driven innovation.

Agenda \| Event Schedule

  • 13:00 – 13:30 \| Check-In & Networking
  • 13:30 – 13:45 \| Welcome & Opening
  • 14:00 – 15:30 \| Keynote & AI Tech Talks
  • 15:30 – 19:00 \| Hands-On Sessions & Workshops
  • 19:00 – 20:30 \| Closing Keynote & Networking

(Agenda subject to updates.)


Global AI Bootcamp

The Global AI Community connects over 60,000 AI enthusiasts worldwide, fostering learning, collaboration, and skill development. The Global AI Bootcamp is a free, community-driven event dedicated to exploring AI's transformative potential, with a focus on Azure AI and Copilots.


Partners, Friends and Communities

  • ignore gravity – Specializes in leadership development and innovation strategies for startups, corporations, and non-profits. Focuses on AI, Mixed Reality, and emerging technologies to create impact-driven programs.
  • CODE University – A private, state-recognized university in Berlin, backed by leading startup entrepreneurs. Provides a hands-on, project-based learning environment for the next generation of tech innovators.
  • Alaris Security – Focused on cybersecurity in enterprise and critical infrastructure, Alaris Security brings high-impact solutions and security expertise.
  • Cyber Curriculum – Equips the next generation of cybersecurity professionals through interactive, real-world training platforms.
  • WeAreDevelopers – The world’s leading community for developers, creating conferences and resources to elevate tech careers and innovation.
  • Grace - Accelerate Female Entrepreneurship – A Berlin-based accelerator that empowers women-led startups in the tech and innovation space.
  • Malt – Europe’s largest freelancer marketplace, connecting top tech and creative professionals with forward-thinking companies.
  • Microsoft Reactor – A hub for developers to learn, connect, and build with Microsoft technologies through workshops, meetups, and training sessions.
  • Tech: Europe – A platform showcasing European innovation, connecting the startup ecosystem through content, events, and advocacy.
  • GITEX Europe – One of the world’s largest tech and startup exhibitions, coming to Berlin in May 2025 to connect global innovators.

Crew and Team

  • Zaid Zaim – XR/AI at ignore gravity \| Microsoft MVP
  • Janek Fellien – Business Software Artist \| Microsoft MVP
  • Adonis Almagro Ona – Product Owner @Carbyte
  • Aleksey Rusenov – Founder EU Gardens
  • Sohalia Mathur – Strategy \| Innovation & Sustainability \| Digital Transformation \| Consultant \| Trainer \| Coach
  • Ambesh Singh – Dynamics 365 CE Architect \| Azure \| Power Platform \| Power Pages & Microsoft Copilot Studio MVP (MCP \| MCTS \| MCSA \| MCSE)
  • Rokas Buciūnas – Event and Office Manager
  • Simon Gneuß – Software Engineer at Trail
  • Ian Baumeister – Full Stack Web Developer with Design and Startup Development Backgrounds
  • Joost Windmöller – Frontend Developer for OceanEcoWatch

**Important Notice**

Photos and videos will be taken during the event for community highlights and social media. If you prefer not to be photographed, please inform the organizers upon arrival.


Register & Learn More

📌 Claim your Digital Event Badge – Code: MRLFHW
📌 Register for the event
📌 Meetup Event Page
#GlobalAIBootcamp \| globalai.community

Global AI Bootcamp {Berlin} | In-person

Global AI Bootcamp Berlin 2025

For Community, By the Community

Join us for Global AI Bootcamp Berlin 2025, a dedicated space for Tech and AI enthusiasts, developers, and professionals to learn, share, and collaborate. This event is part of a global initiative bringing together AI experts and learners to explore the latest innovations, best practices, and real-world applications of Artificial Intelligence on Microsoft Azure.


Event Details

  • Date: April 11, 2025
  • Time: 13:00 – 20:30 CET
  • Location: CODE University of Applied Sciences
  • Topics: Azure AI, Copilots, AI Upskilling, Microsoft 365, AI Ethics

Pre-recorded Keynote Session

Hear from Scott Hanselman, Guido van Rossum, Jennifer Marsman, and Sarah Bird as they discuss AI’s impact on development, Python’s role, and the importance of ethical AI.

In-person Closing Keynote

Hear from Christian Heilmann, VP of DevRel at WeAreDevelopers, as he explores “Vibe coding, creativity, craft and professionalism – are we making ourselves redundant?” – a thought-provoking session on the evolving role of developers in the age of AI.


Speakers & Sessions at Global AI Bootcamp Berlin 2025

  • Welcome & AnnouncementsPresented by ignore gravity, CODE University, and the Global AI Community Reimar Müller-Thum – CEO at CODE University \| Peter Ruppel – President at CODE University Susanne Scheerer – Partner at ignore gravity Zaid Zaim – XR / AI at ignore gravity \| Micorosoft MVP \| Global AI Berlin Chapter Lead
  • Tanja WiehoffBuild Your Own Intranet Agents in Minutes M365 Consultant at Communardo Software \| Microsoft MVP
  • René SchulteThe Next Frontier with Multimodal, Agentic & Embodied Intelligence Head of 3D and AI Practices at Reply \| Microsoft MVP / RD
  • Matthias Buchhorn-Roth & Alexander PreisCustom AI Agents, No-Code Workflow, Conversational Interfaces Founders at PX4.ai
  • Marc PlogasSemantic Kernel Multi-Agent Scenarios
  • Christian WeyerSemantic AI: Language & Embedding Models Hand-in-Hand CTO & Co-founder at Thinktecture \| Microsoft MVP / RD
  • Marius MarxCyber Security in Enterprise & Critical Infrastructure Co-founder & CPO at Alaris Security
  • Daniela Burgos - From Rock Bottom to Startup Founder: My AI-Powered Comeback Nerea Co-founder and Grace Accelerator participant
  • Nicole EndersHR Companion - Build a Bot with Azure AI Foundry & Microsoft Teams Managing Consultant at CONET Solutions \| Microsoft MVP
  • Ragnar Heil5 Different Shades of Copilot Agents: From SharePoint to Azure AI Search Business Development at HanseVision \| Microsoft MVP
  • Torben Köhler & Jona SchwarzMetaphrase: Context-Aware i18n Automation for Modern Dev Pipelines Founders at Metaphrase
  • Sebastian Rosengrün & Dr. David RumpCompliance Meets Innovation: Navigating the AI Act Founders at AI Impact Lab
  • Michael GrethMikeOnAI – Live on Site with MVP Michael Greth

What to Expect?

  • Tech Talks & Workshops – Learn from AI experts about the latest trends, challenges, and advancements in AI, including Copilots, Azure AI, and AI-powered applications.
  • Hands-On Demos – Experience real-world AI use cases, explore AI-driven tools, and experiment with Azure AI services.
  • Networking & Collaboration – Connect with fellow AI enthusiasts, developers, and industry professionals to exchange ideas, share insights, and collaborate on AI projects.
  • Speaker Sessions & Panels – Gain insights from experienced AI practitioners sharing their knowledge, experiences, and best practices in AI-driven innovation.

Agenda \| Event Schedule

  • 13:00 – 13:30 \| Check-In & Networking
  • 13:30 – 13:45 \| Welcome & Opening
  • 14:00 – 15:30 \| Keynote & AI Tech Talks
  • 15:30 – 19:00 \| Hands-On Sessions & Workshops
  • 19:00 – 20:30 \| Closing Keynote & Networking

(Agenda subject to updates.)


Global AI Bootcamp

The Global AI Community connects over 60,000 AI enthusiasts worldwide, fostering learning, collaboration, and skill development. The Global AI Bootcamp is a free, community-driven event dedicated to exploring AI's transformative potential, with a focus on Azure AI and Copilots.


Partners, Friends and Communities

  • ignore gravity – Specializes in leadership development and innovation strategies for startups, corporations, and non-profits. Focuses on AI, Mixed Reality, and emerging technologies to create impact-driven programs.
  • CODE University – A private, state-recognized university in Berlin, backed by leading startup entrepreneurs. Provides a hands-on, project-based learning environment for the next generation of tech innovators.
  • Alaris Security – Focused on cybersecurity in enterprise and critical infrastructure, Alaris Security brings high-impact solutions and security expertise.
  • Cyber Curriculum – Equips the next generation of cybersecurity professionals through interactive, real-world training platforms.
  • WeAreDevelopers – The world’s leading community for developers, creating conferences and resources to elevate tech careers and innovation.
  • Grace - Accelerate Female Entrepreneurship – A Berlin-based accelerator that empowers women-led startups in the tech and innovation space.
  • Malt – Europe’s largest freelancer marketplace, connecting top tech and creative professionals with forward-thinking companies.
  • Microsoft Reactor – A hub for developers to learn, connect, and build with Microsoft technologies through workshops, meetups, and training sessions.
  • Tech: Europe – A platform showcasing European innovation, connecting the startup ecosystem through content, events, and advocacy.
  • GITEX Europe – One of the world’s largest tech and startup exhibitions, coming to Berlin in May 2025 to connect global innovators.

Crew and Team

  • Zaid Zaim – XR/AI at ignore gravity \| Microsoft MVP
  • Janek Fellien – Business Software Artist \| Micorosoft MVP
  • Adonis Almagro Ona – Product Owner @Carbyte
  • Aleksey Rusenov – Founder EU Gardens
  • Sohalia Mathur – Strategy \| Innovation & Sustainability \| Digital Transformation \| Consultant \| Trainer \| Coach
  • Ambesh Singh – Dynamics 365 CE Architect \| Azure \| Power Platform \| Power Pages & Microsoft Copilot Studio MVP (MCP \| MCTS \| MCSA \| MCSE)
  • Rokas Buciūnas – Event and Office Manager
  • Simon Gneuß – Software Engineer at Trail
  • Ian Baumeister – Full Stack Web Developer with Design and Startup Development Backgrounds
  • Joost Windmöller – Frontend Developer for OceanEcoWatch

** Important Notice**

Photos and videos will be taken during the event for community highlights and social media. If you prefer not to be photographed, please inform the organizers upon arrival.


Register & Learn More

📌 Claim your Digital Event Badge Code: MRLFHW 📌 Register for the event 📌 Meetup Event Page #GlobalAIBootcamp \| globalai.community

Global AI Bootcamp {Berlin} | In-person

Global AI Bootcamp Berlin 2025

For Community, By the Community

Join us for Global AI Bootcamp Berlin 2025, a dedicated space for Tech and AI enthusiasts, developers, and professionals to learn, share, and collaborate. This event is part of a global initiative bringing together AI experts and learners to explore the latest innovations, best practices, and real-world applications of Artificial Intelligence on Microsoft Azure.


Event Details

  • Date: April 11, 2025
  • Time: 13:00 – 20:30 CET
  • Location: CODE University of Applied Sciences
  • Topics: Azure AI, Copilots, AI Upskilling, Microsoft 365, AI Ethics

Pre-recorded Keynote Session

Hear from Scott Hanselman, Guido van Rossum, Jennifer Marsman, and Sarah Bird as they discuss AI’s impact on development, Python’s role, and the importance of ethical AI.

In-person Closing Keynote

Hear from Christian Heilmann, VP of DevRel at WeAreDevelopers, as he explores “Vibe coding, creativity, craft and professionalism – are we making ourselves redundant?” – a thought-provoking session on the evolving role of developers in the age of AI.


Speakers & Sessions at Global AI Bootcamp Berlin 2025

  • Welcome & AnnouncementsPresented by ignore gravity, CODE University, and the Global AI Community Reimar Müller-Thum – CEO at CODE University \| Peter Ruppel – President at CODE University Susanne Scheerer – Partner at ignore gravity Zaid Zaim – XR / AI at ignore gravity \| Micorosoft MVP \| Global AI Berlin Chapter Lead
  • Tanja WiehoffBuild Your Own Intranet Agents in Minutes M365 Consultant at Communardo Software \| Microsoft MVP
  • René SchulteThe Next Frontier with Multimodal, Agentic & Embodied Intelligence Head of 3D and AI Practices at Reply \| Microsoft MVP / RD
  • Matthias Buchhorn-Roth & Alexander PreisCustom AI Agents, No-Code Workflow, Conversational Interfaces Founders at PX4.ai
  • Marc PlogasSemantic Kernel Multi-Agent Scenarios
  • Christian WeyerSemantic AI: Language & Embedding Models Hand-in-Hand CTO & Co-founder at Thinktecture \| Microsoft MVP / RD
  • Marius MarxCyber Security in Enterprise & Critical Infrastructure Co-founder & CPO at Alaris Security
  • Daniela Burgos - From Rock Bottom to Startup Founder: My AI-Powered Comeback Nerea Co-founder and Grace Accelerator participant
  • Nicole EndersHR Companion - Build a Bot with Azure AI Foundry & Microsoft Teams Managing Consultant at CONET Solutions \| Microsoft MVP
  • Ragnar Heil5 Different Shades of Copilot Agents: From SharePoint to Azure AI Search Business Development at HanseVision \| Microsoft MVP
  • Torben Köhler & Jona SchwarzMetaphrase: Context-Aware i18n Automation for Modern Dev Pipelines Founders at Metaphrase
  • Sebastian Rosengrün & Dr. David RumpCompliance Meets Innovation: Navigating the AI Act Founders at AI Impact Lab
  • Michael GrethMikeOnAI – Live on Site with MVP Michael Greth

What to Expect?

  • Tech Talks & Workshops – Learn from AI experts about the latest trends, challenges, and advancements in AI, including Copilots, Azure AI, and AI-powered applications.
  • Hands-On Demos – Experience real-world AI use cases, explore AI-driven tools, and experiment with Azure AI services.
  • Networking & Collaboration – Connect with fellow AI enthusiasts, developers, and industry professionals to exchange ideas, share insights, and collaborate on AI projects.
  • Speaker Sessions & Panels – Gain insights from experienced AI practitioners sharing their knowledge, experiences, and best practices in AI-driven innovation.

Agenda \| Event Schedule

  • 13:00 – 13:30 \| Check-In & Networking
  • 13:30 – 13:45 \| Welcome & Opening
  • 14:00 – 15:30 \| Keynote & AI Tech Talks
  • 15:30 – 19:00 \| Hands-On Sessions & Workshops
  • 19:00 – 20:30 \| Closing Keynote & Networking

(Agenda subject to updates.)


Global AI Bootcamp

The Global AI Community connects over 60,000 AI enthusiasts worldwide, fostering learning, collaboration, and skill development. The Global AI Bootcamp is a free, community-driven event dedicated to exploring AI's transformative potential, with a focus on Azure AI and Copilots.


Partners, Friends and Communities

  • ignore gravity – Specializes in leadership development and innovation strategies for startups, corporations, and non-profits. Focuses on AI, Mixed Reality, and emerging technologies to create impact-driven programs.
  • CODE University – A private, state-recognized university in Berlin, backed by leading startup entrepreneurs. Provides a hands-on, project-based learning environment for the next generation of tech innovators.
  • Alaris Security – Focused on cybersecurity in enterprise and critical infrastructure, Alaris Security brings high-impact solutions and security expertise.
  • Cyber Curriculum – Equips the next generation of cybersecurity professionals through interactive, real-world training platforms.
  • WeAreDevelopers – The world’s leading community for developers, creating conferences and resources to elevate tech careers and innovation.
  • Grace - Accelerate Female Entrepreneurship – A Berlin-based accelerator that empowers women-led startups in the tech and innovation space.
  • Malt – Europe’s largest freelancer marketplace, connecting top tech and creative professionals with forward-thinking companies.
  • Microsoft Reactor – A hub for developers to learn, connect, and build with Microsoft technologies through workshops, meetups, and training sessions.
  • Tech: Europe – A platform showcasing European innovation, connecting the startup ecosystem through content, events, and advocacy.
  • GITEX Europe – One of the world’s largest tech and startup exhibitions, coming to Berlin in May 2025 to connect global innovators.

Crew and Team

  • Zaid Zaim – XR/AI at ignore gravity | Microsoft MVP
  • Janek Fellien – Business Software Artist | Microsoft MVP
  • Adonis Almagro Ona – Product Owner @Carbyte
  • Aleksey Rusenov – Founder, EU Gardens
  • Sohalia Mathur – Strategy | Innovation & Sustainability | Digital Transformation | Consultant | Trainer | Coach
  • Ambesh Singh – Dynamics 365 CE Architect | Azure | Power Platform | Power Pages & Microsoft Copilot Studio MVP (MCP | MCTS | MCSA | MCSE)
  • Rokas Buciūnas – Event and Office Manager
  • Simon Gneuß – Software Engineer at Trail
  • Ian Baumeister – Full-Stack Web Developer with Design and Startup Development Backgrounds
  • Joost Windmöller – Frontend Developer for OceanEcoWatch

**Important Notice**

Photos and videos will be taken during the event for community highlights and social media. If you prefer not to be photographed, please inform the organizers upon arrival.


Register & Learn More

📌 Claim your Digital Event Badge Code: MRLFHW 📌 Register for the event 📌 Meetup Event Page #GlobalAIBootcamp | globalai.community

Global AI Bootcamp {Berlin} | In-person

Important: RSVP on the event website to receive the joining link. (An RSVP on Meetup will NOT receive the joining link.)

Join us for a series of AI-focused webinars designed to enhance your development skills, accelerate productivity, and explore the latest AI innovations. Whether you're building local LLMs, optimizing AI workflows, or deploying intelligent AI agents, these sessions—led by industry experts—will provide invaluable insights and hands-on experience.

Each session is tailored to different skill levels, from novice to advanced developers, offering deep technical insights and real-world applications. Register for each of the sessions:

  • Mar 12th (Intel): Enhance GenAI Productivity and scaling, RSVP->
  • Mar 19th (Intel): Building and Deploying AI Agents, RSVP->
  • Mar 20th (Google): Deep dive into Gemini, RSVP->
  • Mar 26th (Intel): Building AI assistants and Agents in the enterprise, RSVP->
  • April 2nd (Google): Deep dive into Gemini
  • If you can't make it to the live session, register anyway to receive the recordings.

Session #2: Building and Deploying AI Agents with OPEA
Speakers: Alex Sin (Intel), Louie Tsai (Intel)
Abstract: AI agents add new capabilities for responding intelligently to queries, data collection, and decision-making, assisted by additional functionality from the Open Platform for Enterprise AI (OPEA). Retrieval-Augmented Generation (RAG) bolstered by OPEA grants another level to agent design, strengthened by Intel® Gaudi® AI accelerators and Intel® Xeon® processors. This session provides guidance on designing, building, customizing, and deploying AI agents for hierarchical, multi-agent systems with greater success than non-agentic GenAI solutions.

Local and Global AI Community on Discord Join us on discord for local and global AI tech community:

  • Events chat: chat and connect with speakers and global and local attendees;
  • Learning AI: events, learning materials, study groups;
  • Startups: innovation, projects collaborations, founders/co-founders;
  • Jobs and Careers: job openings, post resumes, hiring managers
AI Seminar #2: GenAI and AI Agent with Google and Intel
