talk-data.com

Topic: Kubernetes

Tags: container_orchestration, devops, microservices

41 tagged activities

Activity Trend: 40 peak/qtr (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Google Cloud Next '25

Debug Google Kubernetes Engine (GKE) apps like a pro! This hands-on lab covers using Cloud Logging & Monitoring to detect, diagnose, and resolve issues in a microservices application deployed on GKE. Learn practical troubleshooting workflows.
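The lab's troubleshooting workflow builds on Cloud Logging's query language. As a rough sketch (the namespace and container names are hypothetical, but the `k8s_container` resource type and its labels follow standard Cloud Logging conventions for GKE), a filter for errors from one microservice might be assembled like this:

```python
# Build a Cloud Logging filter string for errors from a single GKE container.
# The namespace/container values passed in below are made-up examples.
def gke_error_filter(namespace: str, container: str, min_severity: str = "ERROR") -> str:
    clauses = [
        'resource.type="k8s_container"',
        f'resource.labels.namespace_name="{namespace}"',
        f'resource.labels.container_name="{container}"',
        f"severity>={min_severity}",
    ]
    return " AND ".join(clauses)

print(gke_error_filter("demo", "checkout"))
```

The resulting string can be pasted into the Logs Explorer query box to narrow results to one workload's errors.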

If you register for a Learning Center lab, please sign up for a Google Cloud Skills Boost account with both your work-domain and personal email addresses, and authenticate the account (be sure to check your spam folder). This will ensure you can arrive and access your labs quickly onsite.

Unlock the power of generative AI with retrieval augmented generation (RAG) on Google Cloud. In this session, we’ll navigate key architectural decisions to deploy and run RAG apps: from model and app hosting to data ingestion and vector store choice. We’ll cover reference architecture options – from an easy-to-deploy approach with Vertex AI RAG Engine, to a fully managed solution on Vertex AI, to a flexible DIY topology with Google Kubernetes Engine and open source tools – and compare trade-offs between operational simplicity and granular control.

This session explores the evolution of data management on Kubernetes for AI and machine learning (ML) workloads and modern databases, including Google’s leadership in this space. We’ll discuss key challenges and solutions, including persistent storage with solutions like checkpointing and Cloud Storage FUSE, and accelerating data access with caching. Customers Qdrant and Codeway will share how they’ve successfully leveraged these technologies to improve their AI, ML, and database performance on Google Kubernetes Engine (GKE).

Running AI workloads on Google Kubernetes Engine (GKE) presents unique challenges, especially for securing the right hardware. Whether you’re dealing with unpredictable demand and varying job durations or simply looking to control costs, this session will equip you with the knowledge and tools to make informed decisions about your GKE AI infrastructure. We’ll explore recent advancements in Dynamic Workload Scheduler, custom compute classes, and Kueue, demonstrating how these technologies can help you effectively access and manage diverse hardware resources.

Grappling with scaling your AI and machine learning (ML) platforms to meet demand and ensuring rapid recovery from failures? This session dives into strategies for optimizing end-to-end startup latency for AI and ML workloads on Google Kubernetes Engine (GKE). We’ll explore how image and pod preloading techniques can significantly reduce startup times, enabling faster scaling and improved reliability. Real-world examples will show how this has led to dramatic improvements in application performance, including a 95% reduction in pod startup time and 1.2x–2x speedup.

Time to make generative AI a reality for your application. This session is all about how to build high-performance gen AI applications fast with Cloud SQL for MySQL and PostgreSQL. Learn about Google Cloud’s innovative full-stack solutions that make gen AI app development, deployment, and operations simple and easy – even when deploying high-performance, production-grade applications. We’ll highlight best practices for getting started with Vertex AI, Cloud Run, Google Kubernetes Engine, and Cloud SQL, so that you can focus on gen AI application development from the get-go.

There are cases where you can't use Google Cloud services but still want all the benefits of AlloyDB's integration with AI, serving a local model directly to the database. In such cases, AlloyDB Omni deployed in a Kubernetes cluster can be a great solution, covering these edge cases while keeping all communication between the database and the AI model local.

Maximize your Gen AI inference performance on GKE. This session dives into the latest Kubernetes and GKE advancements, revealing how to achieve significant cost savings, reduced latency, and increased throughput. Discover new inference features on GKE for optimizing load balancing, scaling, accelerator selection, and overall usability. Plus, hear directly from Snap Inc. about their journey re-architecting their inference platform for the demands of Gen AI.

Get the most out of your Google Cloud budget. This session covers cost-optimization strategies for Compute Engine and beyond, including Cloud Run, Vertex AI, and Autopilot in Google Kubernetes Engine. Learn how to effectively manage your capacity reservations and leverage consumption models like Spot VMs, Dynamic Workload Scheduler, and committed use discounts (CUDs) to balance capacity availability and cost for your workloads.
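The trade-off between these consumption models comes down to simple arithmetic. As a back-of-the-envelope sketch (all rates and discount fractions below are invented placeholders, not Google Cloud list prices — real Spot and CUD discounts vary by machine type and region):

```python
# Compare hypothetical monthly costs for one VM under three consumption models.
# All rates and discounts are made-up placeholders for illustration only.
HOURS_PER_MONTH = 730

def monthly_cost(hourly_rate: float, discount: float = 0.0) -> float:
    """Monthly cost after a fractional discount (0.3 == 30% off)."""
    return HOURS_PER_MONTH * hourly_rate * (1 - discount)

on_demand = monthly_cost(0.10)         # baseline on-demand rate
committed = monthly_cost(0.10, 0.37)   # e.g. a 1-year commitment-style discount
spot      = monthly_cost(0.10, 0.70)   # Spot-style discount, but preemptible

print(f"on-demand ${on_demand:.2f}, committed ${committed:.2f}, spot ${spot:.2f}")
```

The catch the session addresses is that the cheapest row (Spot) comes with preemption risk, which is where Dynamic Workload Scheduler and reservations fit in.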

eBPF has revolutionized Kubernetes networking. Cilium, the leading eBPF-based container networking interface (CNI), is now emerging as a standard on major cloud providers like Google Kubernetes Engine (GKE). It provides superior scalability, security, and observability compared to traditional CNIs. eBPF also powers Hubble for network & security insights and Tetragon for runtime security enforcement. Find out how to leverage these tools to get the most out of your GKE cluster.

Deploy and scale containerized AI models with NVIDIA NIMs on Google Kubernetes Engine (GKE). In this interactive session, you’ll gain hands-on experience deploying pre-built NIMs, managing deployments with kubectl, and autoscaling inference workloads. Ideal for startup developers, technical founders, and tech leads.
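Autoscaling of inference workloads like this typically builds on Kubernetes' HorizontalPodAutoscaler, whose documented scaling rule is desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue). A minimal sketch of that rule (the 90%-utilization scenario below is an assumed example):

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Kubernetes HPA scaling rule: ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# e.g. 4 replicas averaging 90% utilization against a 60% target -> scale out to 6
print(desired_replicas(4, 0.90, 0.60))
```

When the observed metric sits at the target, the ratio is 1 and the replica count is left unchanged.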

**Please bring your laptop to get the most out of this hands-on session**

This session provides a look into how Abridge built a secure and scalable inferencing platform on Google Kubernetes Engine (GKE). We’ll demonstrate how they leverage GKE fleets, Teams, Argo CD, and multi-cluster orchestration to manage and deploy inferencing workloads that span multiple clusters. View a live demo of a complete solution featuring a custom Argo CD plugin that simplifies cluster management and streamlines deployments for platform admins and application teams.

In this session, we’ll explore Google’s latest developments in Google Kubernetes Engine (GKE) that enable unprecedented scale and performance for AI workloads. We’ll dive into how Anthropic leverages these capabilities to manage mega-scale Kubernetes clusters, orchestrate diverse workloads, and achieve breakthrough efficiency optimizations.

Is your platform ready for the scale of rapidly evolving models and agents? In this session, we’ll explore strategies for scaling your cloud native AI platform - empowering teams to leverage an increasing variety of AI models and agent frameworks. We’ll dive into tools and practices for maintaining control and cost efficiency while enabling AI engineering teams to quickly iterate on Google Kubernetes Engine (GKE). We’ll explore how NVIDIA NIM microservices deliver optimized inference with minimal tuning. 

This Session is hosted by a Google Cloud Next Sponsor.
Visit your registration profile at g.co/cloudnext to opt out of sharing your contact information with the sponsor hosting this session.

Unlock the power of your application logs with Google Cloud Logging. This lab provides hands-on experience using Cloud Logging to gain deep insights into your applications, particularly on Google Kubernetes Engine. Learn to build effective queries and proactively address potential issues.

If you register for a Learning Center lab, please sign up for a Google Cloud Skills Boost account with both your work-domain and personal email addresses, and authenticate the account (be sure to check your spam folder). This will ensure you can arrive and access your labs quickly onsite.

Struggling to monitor the performance and health of your large language model (LLM) deployments on Google Kubernetes Engine (GKE)? This session unveils how the Google Cloud Observability suite provides a comprehensive solution for monitoring leading AI model servers like Ray, NVIDIA Triton, vLLM, TGI, and others. Learn how our one-click setup automatically configures dashboards, alerts, and critical metrics – including GPU and TPU utilization, latency, throughput, and error analysis – to enable faster troubleshooting and optimized performance. Discover how to gain complete visibility into your LLM infrastructure.
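The latency metrics mentioned here are usually summarized as percentiles (p50, p95, p99) rather than averages, since tail latency dominates user experience. As a generic illustration of a nearest-rank p95 over a window of request latencies (this is plain Python, not the Cloud Observability API, and the sample values are made up):

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a list of samples (pct in (0, 100])."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [12, 15, 11, 230, 14, 13, 18, 16, 900, 17]  # hypothetical request latencies
print(percentile(latencies_ms, 95))
```

Note how a handful of slow requests pulls p95 far above the median, which is exactly what per-percentile dashboards and alerts are designed to surface.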

Facing challenges with the cost and performance of your AI inference workloads? This talk presents TPUs and Google Kubernetes Engine (GKE) as a solution for achieving both high throughput and low latency while optimizing costs with open source models and libraries. Learn how to leverage TPUs to scale massive inference workloads efficiently.

Stop struggling to unlock the transformative power of AI. This session flips the script, revealing how your existing Kubernetes expertise is your greatest advantage. We'll demonstrate how Google Kubernetes Engine (GKE) provides the foundation for building scalable, custom AI platforms - empowering you to take control of your AI strategy. Forget starting from scratch; leverage existing skills to architect and deploy AI solutions for your unique needs. Discover how industry leaders like Spotify are harnessing GKE to fuel responsible innovation, and gain the insights to transform your Kubernetes knowledge into your ultimate AI superpower.

Companies in the fiercely competitive gaming landscape face constant pressure to create engaging and ever-evolving player experiences. Generative AI can help game developers craft more dynamic, personalized gameplay while reducing time to market. Major game studios are leveraging Google Cloud’s cutting-edge AI capabilities to create immersive player experiences, personalized chatbots, dynamic character interactions, and user-generated content. We’ll show you how you can use Google Kubernetes Engine to easily integrate gen AI with game servers.

Build modern applications with the power of Oracle Database 23ai and Google Cloud's Vertex AI and Gemini foundation models. Learn key strategies to seamlessly integrate Google Cloud's native development tools and services, including Kubernetes, Cloud Run, and BigQuery, with Oracle Database 23ai and Autonomous Database in modern application architectures. Cloud architects, developers, and database administrators will gain actionable insights, best practices, and real-world examples to enhance performance and accelerate innovation with ODB@GC.

This Session is hosted by a Google Cloud Next Sponsor.
Visit your registration profile at g.co/cloudnext to opt out of sharing your contact information with the sponsor hosting this session.