talk-data.com

Hugo Affaticati

Speaker

2 talks

Senior Cloud Infrastructure Engineer, Microsoft

Hugo Affaticati is a data-driven engineer specializing in enterprise AI training and inference at scale. He is a Senior Cloud Infrastructure Engineer at Microsoft Azure and Technical Lead for the Azure AI Benchmarking Team. His work includes delivering high-throughput, real-world inference, including 865,000 tokens/sec on a rack of ND GB200 v6-series virtual machines accelerated by NVIDIA GB200 NVL72 and the Azure software stack.

Bio from: Microsoft Ignite 2025

Talks & appearances

2 activities · Newest first

Pushing limits of supercomputing innovation on Azure AI Infra

Training efficiency starts with precision. This session explores Azure supercomputing validation, from GPU kernels to Llama pretraining and large-scale model training. The validation process detects bottlenecks early, reduces cost, and boosts performance. Customers gain predictable throughput, faster training, and confidence in Azure’s readiness for multi-billion-parameter models. Attendees will gain practical insights and engage directly with the engineers driving these innovations.

Inference at record speed with Azure ND Virtual Machines

Azure sets new inference records of 865K and 1.1M tokens/sec on ND GB200 v6 and ND GB300 v6 VMs, respectively. These results stem from deep stack optimization, from GPU kernels such as GEMM and attention to multi-node scaling. Using Llama benchmarks, we’ll show how model architecture and hardware co-design drive throughput and efficiency. Customers benefit from faster time-to-value, lower cost per token, and production-ready infrastructure. Attendees can connect with Azure engineers to discuss best practices.