2025-04-10 at 23:30
Audio and visual interactions with Gemini 2.0 and Multimodal Live API
Event:
Google Cloud Next '25
Description
Experience a new way to interact with LLM-powered agents! With Gemini 2.0 and the Multimodal Live API, users can give spoken instructions and show visual content from a camera or screen while receiving spoken responses from the model. This enables more natural, timely communication and unlocks multimodal agent workflows. This session shows how existing agent experiences can be adapted for voice and visual cues, and explores new possibilities with this technology.