This session explores the evolution of data management on Kubernetes for AI and machine learning (ML) workloads and modern databases, including Google’s leadership in this space. We’ll discuss key challenges and solutions: persistent storage with checkpointing and Cloud Storage FUSE, and accelerated data access through caching. Customers Qdrant and Codeway will share how they’ve successfully leveraged these technologies to improve their AI, ML, and database performance on Google Kubernetes Engine (GKE).
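As context for the Cloud Storage FUSE approach mentioned above, a minimal sketch of mounting a Cloud Storage bucket into a GKE Pod via the Cloud Storage FUSE CSI driver might look like the following. The Pod name, image, service account, and bucket name are hypothetical placeholders; the annotation and CSI driver name follow GKE's documented convention, but verify details against current GKE documentation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: training-pod                     # hypothetical name
  annotations:
    gke-gcsfuse/volumes: "true"          # enables the GKE-managed gcsfuse sidecar
spec:
  serviceAccountName: gcs-access-sa      # hypothetical; needs a Workload Identity binding
  containers:
  - name: trainer
    image: us-docker.pkg.dev/my-project/ml/trainer:latest   # hypothetical image
    volumeMounts:
    - name: gcs-data
      mountPath: /data                   # bucket objects appear as files here
  volumes:
  - name: gcs-data
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: my-training-bucket   # hypothetical bucket
```

With a mount like this, training code can read datasets and write checkpoints to `/data` using ordinary file I/O, which is the pattern the session's checkpointing discussion builds on.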
Company: Codeway
Speakers: 3
Activities: 2 (talks & appearances from Codeway speakers)
Deploying AI models at scale demands high-performance inference capabilities. Google Cloud offers a range of cloud tensor processing units (TPUs) and NVIDIA-powered graphics processing unit (GPU) VMs. This session will guide you through the key considerations for choosing TPUs and GPUs for your inference needs. Explore the strengths of each accelerator for various workloads like large language models and generative AI models. Discover how to deploy and optimize your inference pipeline on Google Cloud using TPUs or GPUs. Understand the cost implications and explore cost-optimization strategies.
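On GKE, choosing between TPUs and GPUs for inference ultimately surfaces in how a Pod requests accelerators. The sketch below shows the GPU case with a hypothetical Pod name and image; the node-selector label and `nvidia.com/gpu` resource name are GKE/Kubernetes conventions, while the TPU variant (noted in comments) uses different labels and the `google.com/tpu` resource. Treat this as an illustrative assumption, not a complete deployment.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: llm-inference                    # hypothetical name
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-l4   # GPU node pool selector;
                                                  # TPU pools use labels such as
                                                  # cloud.google.com/gke-tpu-accelerator
  containers:
  - name: server
    image: us-docker.pkg.dev/my-project/ml/server:latest   # hypothetical image
    resources:
      limits:
        nvidia.com/gpu: "1"              # one GPU; TPU workloads request google.com/tpu
```

Because the accelerator choice is isolated to the node selector and resource limits, cost-optimization experiments (for example, comparing an L4 GPU pool against a TPU pool for the same model) can reuse the rest of the manifest unchanged.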