Trying to eke out every ounce of performance from your LLM? Trying to speed up inference or understand what is going on inside a language model? In this session, you will learn how to profile a model on AWS purpose-built accelerators and build a custom kernel to achieve better performance.

Learn more:
More AWS events: https://go.aws/3kss9CP

Subscribe:
More AWS videos: http://bit.ly/2O3zS75
More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2025 #AWS


AWS re:Invent 2025 - Performance engineering on Neuron: How to optimize your LLM with NKI (AIM414)
