talk-data.com

Speaker

Brandon Royal

3 talks

Product Manager, AI on Google Kubernetes Engine, Google Cloud
Filtering by: Google Cloud Next '25

Talks & appearances

Showing 3 of 5 activities

Grappling with scaling your AI and machine learning (ML) platforms to meet demand and ensuring rapid recovery from failures? This session dives into strategies for optimizing end-to-end startup latency for AI and ML workloads on Google Kubernetes Engine (GKE). We’ll explore how image and pod preloading techniques can significantly reduce startup times, enabling faster scaling and improved reliability. Real-world examples will show how this has led to dramatic improvements in application performance, including a 95% reduction in pod startup time and a 1.2x–2x speedup.

Deploy and scale containerized AI models with NVIDIA NIMs on Google Kubernetes Engine (GKE). In this interactive session, you’ll gain hands-on experience deploying pre-built NIMs, managing deployments with kubectl, and autoscaling inference workloads. Ideal for startup developers, technical founders, and tech leads.

**Please bring your laptop to get the most out of this hands-on session**

Is your platform ready for the scale of rapidly evolving models and agents? In this session, we’ll explore strategies for scaling your cloud native AI platform, empowering teams to leverage an increasing variety of AI models and agent frameworks. We’ll dive into tools and practices for maintaining control and cost efficiency while enabling AI engineering teams to quickly iterate on Google Kubernetes Engine (GKE). We’ll also explore how NVIDIA NIM microservices deliver optimized inference with minimal tuning.

This session is hosted by a Google Cloud Next sponsor.
Visit your registration profile at g.co/cloudnext to opt out of sharing your contact information with the sponsor hosting this session.