talk-data.com

Speaker

Deepak Patil

2 talks

Group Product Manager, Machine Learning Infrastructure, Google Cloud

Talks & appearances

2 activities · Newest first

In this session, learn about performance optimizations for PyTorch on Google Cloud accelerators using OpenXLA. PyTorch models are powerful, but they can be disrupted by resource failures. This talk also explores strategies for achieving greater resiliency when running PyTorch on GPUs, focusing on fault tolerance, checkpointing, and distributed training. Learn how to leverage open-source tools to minimize downtime and keep your deep learning workloads running smoothly.
