2025-04-09 at 21:45
PyTorch on Google Cloud: From experimentation to production
Event: Google Cloud Next '25
Description
In this session, learn about performance optimizations for PyTorch on Google Cloud accelerators using OpenXLA. PyTorch models are powerful, but their workloads can be disrupted by resource failures. The talk also explores strategies for greater resiliency when running PyTorch on GPUs, focusing on fault tolerance, checkpointing, and distributed training. Learn how to use open source tools to minimize downtime and keep your deep learning workloads running smoothly.