talk-data.com

Description

Managing massive deployments of accelerators for AI and high performance computing (HPC) workloads can be complex. This talk dives into running AI-optimized Google Kubernetes Engine (GKE) clusters that streamline infrastructure provisioning, workload orchestration, and ongoing operations for tens of thousands of accelerators. Learn how topology-aware scheduling, maintenance controls, and advanced networking capabilities enable ultralow latency and maximum performance by default for demanding workloads like AI pretraining, fine-tuning, inference, and HPC.