talk-data.com

Company: Google DeepMind
Speakers: 21
Activities: 34

Speakers from Google DeepMind

Talks & appearances

34 activities from Google DeepMind speakers

Training models on large-scale data has given us powerful generative capabilities for text, images, and video. However, this success has not yet extended to training generalist embodied agents. This talk tackles this gap by focusing on a potential solution to this problem: scalable world models. We'll trace the idea of planning in predictive models, from its origins to modern efforts on building world models directly from pixels. I'll discuss the primary challenge of scaling these models and present our work, Genie, which enables us to learn world models without explicit action labels at scale, demonstrating a new path forward for training the generalist agents of the future.
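As a toy illustration of the core idea above, planning inside a predictive model, the sketch below searches over action sequences in a hand-written one-dimensional "world model". This is not Genie or any DeepMind system; the dynamics function and planner are hypothetical stand-ins for a learned model.

```python
# Toy sketch of planning in a predictive model: roll out candidate
# action sequences inside the model and act on the best one.
import itertools

def world_model(state, action):
    # Hypothetical learned dynamics, here just 1-D integration.
    return state + action

def plan(state, goal, horizon=3, actions=(-1, 0, 1)):
    # Exhaustively simulate every action sequence in the model and
    # return the first action of the sequence ending closest to goal.
    best_seq, best_dist = None, float("inf")
    for seq in itertools.product(actions, repeat=horizon):
        s = state
        for a in seq:
            s = world_model(s, a)
        d = abs(goal - s)
        if d < best_dist:
            best_dist, best_seq = d, seq
    return best_seq[0]

first_action = plan(state=0, goal=-3)  # model says: move toward the goal
```

Real world models replace the hand-written dynamics with a learned video or latent-state predictor and the exhaustive search with sampling or gradient-based planners, but the planning loop has the same shape.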

In the last couple of years we've seen the rapid evolution of frontier-scale, massive models. Yet at the same time, small models have been going through an evolution of their own, using technologies developed for those frontier-scale models. In this talk we'll show how tensor frameworks and autograd made their way into Bayesian models, how massive-model development is yielding smaller models, and how both are useful for small-data and small-model developers, and the organizations they support.
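To make the Bayesian connection concrete, here is a minimal sketch of gradient-based MAP estimation for a Gaussian model with a Gaussian prior. The gradient is written out by hand below; the point of tensor frameworks with autograd (e.g. JAX or PyTorch) is that this derivative would be computed automatically from the log-posterior alone. The numbers and priors are illustrative assumptions, not from the talk.

```python
# Gradient ascent on a Gaussian log posterior (hand-derived gradient;
# an autograd framework would derive this from the log density itself).

def grad_log_posterior(mu, data, sigma2=1.0, tau2=4.0):
    # d/dmu [ -sum((x - mu)^2)/(2*sigma2) - mu^2/(2*tau2) ]
    return sum(x - mu for x in data) / sigma2 - mu / tau2

def map_estimate(data, lr=0.1, steps=200):
    mu = 0.0
    for _ in range(steps):
        mu += lr * grad_log_posterior(mu, data)
    return mu

data = [1.8, 2.2, 2.0]
mu_map = map_estimate(data)
# Matches the closed-form MAP: (sum(x)/sigma2) / (n/sigma2 + 1/tau2)
```

For conjugate Gaussian models the answer is available in closed form, which makes this a useful sanity check; autograd earns its keep on the non-conjugate models where no such formula exists.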

Abstract: Our innate ability to reconstruct the 3D world around us from our eyes alone is a fundamental part of human perception. For computers, however, this task remained a significant challenge — until the advent of Neural Radiance Fields (NeRFs). Upon their introduction, NeRFs marked a paradigm shift in the field of novel view synthesis, demonstrating huge improvements in visual realism and geometric accuracy over prior works. The subsequent proliferation of NeRF variants has only expanded their capabilities, unlocking larger scenes, achieving even higher visual fidelity, and accelerating both training and inference. Nevertheless, NeRF is no longer the tool of choice for 3D reconstruction. Why? Join a researcher from the front lines as we explore NeRF’s foundations, dissect its strengths and weaknesses, see how the field has evolved, and explore the future of novel view synthesis.
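The numerical heart of NeRF's novel view synthesis is volume rendering along camera rays. The sketch below implements the standard quadrature for a single ray with scalar (grayscale) colors and made-up sample values; a real NeRF would query a neural network for each density and color.

```python
# Minimal NeRF-style volume rendering along one ray:
#   C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i
# where T_i is the transmittance accumulated over earlier samples.
import math

def render_ray(sigmas, colors, deltas):
    color, transmittance = 0.0, 1.0
    for sigma, c, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        color += transmittance * alpha * c
        transmittance *= 1.0 - alpha            # light surviving past it
    return color

# A single near-opaque sample returns (almost) its own color:
c = render_ray(sigmas=[1000.0], colors=[0.7], deltas=[1.0])
```

Because this quadrature is differentiable in the densities and colors, the network behind them can be trained end-to-end from photographs, which is the property later NeRF variants build on.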

Gemini 2.0 was built for the agentic era – from native tool use to function calling to robust support for multimodal understanding, the new frontier of applications is agentic. Join this session to explore the frontier of agents: where the best opportunities are for developers to build, which research areas remain open on the path to scaling to billions of agents, and how to best leverage Gemini.

Language models have already evolved to do much more than language tasks, principally in the domains of image and audio, and soon video. Join Mostafa Dehghani to explore the emergent frontier of multimodal generation, what Gemini's world knowledge unlocks that domain-specific models cannot create, and how developers should be thinking about AI as a next-generation creative partner.

World models represent a paradigm shift in artificial intelligence, moving beyond passive data consumption to active, predictive understanding of environments. These models enable AI agents to simulate potential futures, plan strategically, and learn more efficiently in complex, dynamic scenarios. In this session, Tim Brooks, Research Scientist at Google DeepMind, will explore the current state of world model research and illuminate the exciting frontiers that lie ahead.


Gemini 2.0, the latest foundation model released by Google DeepMind, offers improved performance, support for real-time interactions, text-to-image and text-to-audio generation, grounding with Google Search, and reasoning – all under a unified SDK that lets you move seamlessly from the Gemini API to Vertex AI. In this talk, you'll learn about the newest Gemini 2.0 capabilities, how to accelerate your prototyping, and guidelines for deploying your solutions, from a single API call to more complex pipelines.

session
Logan Kilpatrick (Senior Product Manager), Paige Bailey (AI Developer Experience Engineer)

Software engineering has become increasingly complex, with an ever-expanding set of patterns, frameworks, and runtimes. But help is here: AI is revolutionizing the developer workflow, and Google Cloud is reimagining the journey from idea to production. This keynote features demos that showcase how AI can streamline software engineering, empowering you to build apps, services, and agents faster than ever.