talk-data.com

Company

magic.dev

Speakers: 1

Activities: 1

Speakers from magic.dev

Talks & appearances

1 activity from magic.dev speakers

If left unmanaged, failures and infrastructure inefficiencies can account for as much as 45% of your compute resources and precious engineering time (according to a Stanford University study). In this session, we discuss how to measure and maximize machine learning (ML) productivity for large-scale training jobs, spanning tens of thousands of accelerators. We’ll demonstrate a canonical view of large-scale training infrastructure and patterns our customers are applying that are available to you today.

Session from Google Cloud Next 25.