talk-data.com

Google Cloud Next session 2025-04-10 at 23:00

Serve open models on TPUs and GKE with superior portability and price-performance

Event: Google Cloud Next '25

Speakers

Kavitha Gowda

Senior Product Manager · Google Cloud

Jon Li

Software Engineer · Google Cloud

Mustafa Ozuysal

Senior ML Researcher · HUBX

Topics

AI/ML

Kubernetes

Description

Facing challenges with the cost and performance of your AI inference workloads? This talk presents TPUs and Google Kubernetes Engine (GKE) as a way to achieve both high throughput and low latency while keeping costs down, using open-source models and libraries. Learn how to use TPUs to scale massive inference workloads efficiently.