Maximize your Gen AI inference performance on GKE. This session dives into the latest Kubernetes and GKE advancements, revealing how to achieve significant cost savings, reduced latency, and increased throughput. Discover new inference features on GKE for optimizing load balancing, scaling, accelerator selection, and overall usability. Plus, hear directly from Snap Inc. about their journey re-architecting their inference platform for the demands of Gen AI.
talk-data.com
Speaker
Akshay Ram
2 talks
Group Product Manager, GKE
Google
Talks & appearances
The number of clusters running data apps on Google Kubernetes Engine has grown exponentially, doubling every year since 2019. With the rise of AI/ML along with accelerated compute, data architectures are gaining importance. Join this session to learn about Kubernetes data architectures for AI/ML, storage best practices, data availability and customer use cases. This session is meant to educate you about retooling your skill set for the new paradigm of data on Kubernetes.