Text-to-image generative AI models such as the Stable Diffusion family are rapidly growing in popularity. In this session, we explain how to optimize every layer of your serving architecture (TPU accelerators, orchestration, model server, and ML framework) to gain significant improvements in performance and cost effectiveness. We introduce new innovations in Google Kubernetes Engine that improve the cost effectiveness of AI inference, and we provide a deep dive into MaxDiffusion, a brand-new library for deploying scalable Stable Diffusion workloads on TPUs.
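The GKE orchestration layer mentioned above typically boils down to a Deployment that schedules model-server Pods onto a TPU node pool and requests TPU chips as a Kubernetes resource. The manifest below is a minimal, hypothetical sketch; the name, image, replica count, accelerator type, and topology values are illustrative assumptions, not details from the session:

```yaml
# Hypothetical GKE Deployment sketch: serve a Stable Diffusion model on TPUs.
# All names and values are illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sd-inference              # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sd-inference
  template:
    metadata:
      labels:
        app: sd-inference
    spec:
      nodeSelector:
        # Schedule onto a TPU slice node pool (values depend on your cluster).
        cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
        cloud.google.com/gke-tpu-topology: 2x4
      containers:
      - name: server
        image: us-docker.pkg.dev/PROJECT/REPO/sd-server:latest  # placeholder image
        ports:
        - containerPort: 8000
        resources:
          limits:
            google.com/tpu: 8     # TPU chips per Pod; must match the slice topology
```

In this kind of setup, the `google.com/tpu` resource limit and the TPU node selectors are how GKE binds the Pod to an appropriately sized TPU slice; autoscaling and model-server choices layer on top of this.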
Special offers are available to help you implement what you are learning at Google Cloud Next 25.
Speaker: Juan Acevedo, Software Engineer, Google Cloud
Filtering by: Google Cloud Next '24
Talks & appearances