talk-data.com

Google Cloud Next session 2025-04-11 at 00:15

vLLM on Google Cloud: Fast and easy-to-use LLM inference serving on TPUs and GPUs

Description

Join Woosuk Kwon, Founder of vLLM; Robert Shaw, Director of Engineering at Red Hat; and Brittany Rockwell, Product Manager for vLLM on TPU, to learn how vLLM helps Google Cloud customers serve state-of-the-art models with high performance and ease of use across TPUs and GPUs.