talk-data.com

YouTube 2025-12-03 at 07:06

AWS re:Invent 2025 - Train high-performing AI models at scale on AWS (AIM365)

Description

Training large AI models requires significant compute resources and can be time and cost intensive. In this session, learn to optimize and accelerate your model training workloads using AWS's purpose-built infrastructure and tools. We'll dive deep into leveraging services like Amazon SageMaker HyperPod for distributed training at scale and SageMaker fully managed training jobs for cost-effective ML acceleration. You'll learn to scale training across clusters using techniques like data and model parallelism, automated model tuning, and efficient checkpoint management. Through real customer examples, see how to reduce training time by up to 40%, optimize costs, and build high-performance training pipelines.

Learn more:
More AWS events: https://go.aws/3kss9CP

Subscribe:
More AWS videos: http://bit.ly/2O3zS75
More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2025 #AWS