Facing challenges with the cost and performance of your AI inference workloads? This talk presents TPUs and Google Kubernetes Engine (GKE) as a way to achieve high throughput and low latency while keeping costs down, using open-source models and libraries. Learn how to leverage TPUs to scale massive inference workloads efficiently.
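As a rough illustration of the kind of workload the talk covers (not material from the talk itself), the sketch below uses JAX, an open-source library with first-class TPU support, to jit-compile a toy forward pass that would run on the TPU devices attached to a GKE node. The model, parameter shapes, and batch here are hypothetical stand-ins.

```python
# Illustrative only: a minimal JAX sketch of jit-compiled batched inference.
# On a TPU-backed GKE node, jax.devices() reports TPU cores and the compiled
# function executes on them; on CPU/GPU hosts it falls back transparently.
import jax
import jax.numpy as jnp

def init_params(key, d_in=512, d_out=512):
    # Hypothetical parameters standing in for a real open-source model.
    w_key, _ = jax.random.split(key)
    return {
        "w": jax.random.normal(w_key, (d_in, d_out)) * 0.02,
        "b": jnp.zeros((d_out,)),
    }

@jax.jit  # XLA-compiles the forward pass for the available backend
def forward(params, x):
    return jax.nn.relu(x @ params["w"] + params["b"])

if __name__ == "__main__":
    print("Backend devices:", jax.devices())
    params = init_params(jax.random.PRNGKey(0))
    batch = jnp.ones((32, 512))        # a batch of dummy requests
    out = forward(params, batch)       # first call compiles, later calls reuse it
    print("Output shape:", out.shape)
```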
Speaker
Jon Li
Software Engineer
Google Cloud