Grappling with scaling your AI and machine learning (ML) platforms to meet demand and ensuring rapid recovery from failures? This session dives into strategies for optimizing end-to-end startup latency for AI and ML workloads on Google Kubernetes Engine (GKE). We’ll explore how image and pod preloading techniques can significantly reduce startup times, enabling faster scaling and improved reliability. Real-world examples will show how these techniques have led to dramatic improvements in application performance, including a 95% reduction in pod startup time and a 1.2x–2x speedup.
Speaker: Gari Singh, Outbound Product Manager, Google Cloud (2 talks)
Talks & appearances (Google Cloud Next '25)
with Gari Singh (Google Cloud), Drew Bradstock (Google Cloud), and Basil Shikin (AppLovin Corporation)
Ten years ago, Google Kubernetes Engine (GKE) was born! Since then, it has become the industry-leading managed Kubernetes platform, powering mission-critical workloads across all industries. But the innovations have just begun. Join this session to learn about the latest GKE features and upcoming innovations – such as next-generation autoscaling, lightning-fast node startup, and multi-cluster fleet management – that make GKE the best Kubernetes platform for the next generation of AI and modern workloads.