AI Hypercomputer is a revolutionary system designed to make implementing AI at scale easier and more efficient. In this session, we’ll explore the key benefits of AI Hypercomputer and how it simplifies complex AI infrastructure environments. Then, learn firsthand from industry leaders Shopify, Technology Innovation Institute, Moloco, and LG AI Research how they leverage Google Cloud’s AI solutions to drive innovation and transform their businesses.
Migrating from AWS or Azure to Google Cloud runtimes can feel like navigating a maze of complex services and dependencies. In this session, we’ll explore key considerations for migrating legacy applications, emphasizing the “why not modernize?” approach with a practical guide. We’ll share real-world examples of successful transformations. And we’ll go beyond theory with a live product demo that showcases migration tools, and a code assessment demo powered by Gemini that demonstrates how you can understand and modernize legacy code.
Learn how to leverage Google Cloud’s innovations to build the most secure and compliant solutions with Confidential Computing, offloads, and Assured Workload capabilities.
Debug Google Kubernetes Engine (GKE) apps like a pro! This hands-on lab covers using Cloud Logging & Monitoring to detect, diagnose, and resolve issues in a microservices application deployed on GKE. Learn practical troubleshooting workflows.
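To give a flavor of the troubleshooting workflow, here is a minimal sketch, assuming a hypothetical project ID and cluster name, that uses the google-cloud-logging Python client to pull recent error entries from GKE containers; the queries you build in the lab may look different.

    from google.cloud import logging

    # Placeholder project ID; substitute your own.
    client = logging.Client(project="my-project")

    # Filter for ERROR-and-above entries from containers in a hypothetical GKE cluster.
    log_filter = (
        'resource.type="k8s_container" '
        'resource.labels.cluster_name="demo-cluster" '
        'severity>=ERROR'
    )

    for entry in client.list_entries(filter_=log_filter, max_results=20):
        print(entry.timestamp, entry.severity, entry.payload)

Running a filter like this from a notebook or script is a quick way to confirm what the Logs Explorer UI is showing before you start drilling into a specific microservice.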
If you register for a Learning Center lab, please ensure that you sign up for a Google Cloud Skills Boost account for both your work domain and personal email address. You will need to authenticate your account as well (be sure to check your spam folder!). This will ensure you can arrive and access your labs quickly onsite. You can follow this link to sign up!
Grappling with scaling your AI and machine learning (ML) platforms to meet demand and ensuring rapid recovery from failures? This session dives into strategies for optimizing end-to-end startup latency for AI and ML workloads on Google Kubernetes Engine (GKE). We’ll explore how image and pod preloading techniques can significantly reduce startup times, enabling faster scaling and improved reliability. Real-world examples will show how this has led to dramatic improvements in application performance, including a 95% reduction in pod startup time and a 1.2x–2x speedup.
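One widely used preloading technique, not necessarily the exact GKE feature covered in the session, is an image pre-pull DaemonSet that pulls a large serving image onto every node before workload pods need it. A minimal sketch with the official Kubernetes Python client, using a hypothetical image path:

    from kubernetes import client, config

    config.load_kube_config()  # or config.load_incluster_config() when running in-cluster

    # The init container pulls the heavy image and exits immediately;
    # the pause container just keeps the DaemonSet pods alive.
    prepull = client.V1DaemonSet(
        metadata=client.V1ObjectMeta(name="image-prepuller"),
        spec=client.V1DaemonSetSpec(
            selector=client.V1LabelSelector(match_labels={"app": "image-prepuller"}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": "image-prepuller"}),
                spec=client.V1PodSpec(
                    init_containers=[
                        client.V1Container(
                            name="prepull-model-server",
                            # Hypothetical image path; use your own serving image.
                            image="us-docker.pkg.dev/my-project/serving/model-server:latest",
                            command=["sh", "-c", "true"],
                        )
                    ],
                    containers=[
                        client.V1Container(name="pause", image="registry.k8s.io/pause:3.9")
                    ],
                ),
            ),
        ),
    )

    client.AppsV1Api().create_namespaced_daemon_set(namespace="default", body=prepull)

Once every node has the image cached locally, newly scheduled pods skip the pull step, which is where much of the startup-time saving comes from.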
This session provides an in-depth look at the Google infrastructure that powers our most demanding AI workloads. We’ll explore the journey from custom silicon, high-bandwidth networking, and storage to the software frameworks that enable efficient, large-scale training and inference with industry-leading goodput and uptime across the largest GPU and TPU clusters. Learn how Google’s unique approach to system design and deployment enables customers to effortlessly achieve Google-level performance and scale for their own applications.
Get the most out of your Google Cloud budget. This session covers cost-optimization strategies for Compute Engine and beyond, including Cloud Run, Vertex AI, and Autopilot in Google Kubernetes Engine. Learn how to effectively manage your capacity reservations and leverage consumption models like Spot VMs, Dynamic Workload Scheduler, and committed use discounts (CUDs) to achieve the optimum levels of capacity availability for your workloads while optimizing your cost.
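To make the Spot consumption model concrete, here is a minimal sketch, with placeholder project, zone, and machine type, that requests a Spot VM with the google-cloud-compute Python client; remember that Spot capacity can be preempted at any time.

    from google.cloud import compute_v1

    project, zone = "my-project", "us-central1-a"  # placeholders

    instance = compute_v1.Instance(
        name="spot-worker-1",
        machine_type=f"zones/{zone}/machineTypes/e2-standard-4",
        # The SPOT provisioning model trades preemptibility for a steep discount.
        scheduling=compute_v1.Scheduling(
            provisioning_model="SPOT",
            instance_termination_action="STOP",
        ),
        disks=[
            compute_v1.AttachedDisk(
                boot=True,
                auto_delete=True,
                initialize_params=compute_v1.AttachedDiskInitializeParams(
                    source_image="projects/debian-cloud/global/images/family/debian-12",
                ),
            )
        ],
        network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
    )

    operation = compute_v1.InstancesClient().insert(
        project=project, zone=zone, instance_resource=instance
    )
    operation.result()  # block until the insert completes

Pairing Spot VMs like this with reservations or CUDs for your steady-state baseline is the usual way to balance capacity availability against cost.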
Unlock the full potential of your mission-critical workloads on Google Cloud. Discover how our platform is purpose-built for Microsoft, Oracle, OpenShift, and more, enabling you to optimize total cost of ownership (TCO) and accelerate modernization. Learn firsthand from customers who have successfully transformed their businesses by bringing their workloads to Google Cloud.
Unlock the full potential of Compute Engine for all your applications. This session delivers actionable strategies and best practices to optimize cost, reliability, and management for cloud-first, AI, machine learning, high performance computing, enterprise, and stateful workloads. We’ll share recently released features within Compute Engine to maximize return on investment for each specific application type.
Tensor Processing Units (TPUs) are hardware accelerators designed by Google specifically for large-scale AI/ML computations. Google's new Trillium TPUs are our most performant and energy-efficient TPUs to date, and offer unprecedented levels of scalability. Ray is a unified framework for orchestrating AI/ML workloads on large compute clusters. Ray offers Python-native APIs for training, inference, tuning, reinforcement learning, and more. In this lightning talk, we will demonstrate how you can use Ray to manage workloads on TPUs with an easy-to-use API. We will cover: 1) training your models with MaxText, 2) tuning models with Hugging Face, and 3) serving models with vLLM. Attendees will gain an understanding of how to build a complete, end-to-end AI/ML infrastructure with Ray and TPUs.
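As a taste of the API, here is a minimal Ray sketch that reserves TPU chips for a task through Ray's resource mechanism; the "TPU" resource label and chip count are assumptions about how your TPU nodes advertise themselves, and the MaxText, Hugging Face, and vLLM integrations shown in the talk go well beyond this stub.

    import ray

    ray.init()  # in practice, connect to an existing Ray cluster running on TPU VMs

    @ray.remote(resources={"TPU": 4})  # assumed resource label exposed by TPU nodes
    def run_inference(prompts):
        # Here you would load a JAX or vLLM model onto the TPU slice and run it;
        # this stub just echoes the prompts.
        return [f"echo: {p}" for p in prompts]

    print(ray.get(run_inference.remote(["hello", "tpu"])))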
Build & deploy with Google Cloud Deploy! This hands-on lab equips you to create delivery pipelines, deploy container images to Artifact Registry, and promote applications across GKE environments.
In this session, we’ll explore Google’s latest developments in Google Kubernetes Engine (GKE) that enable unprecedented scale and performance for AI workloads. We’ll dive into how Anthropic leverages these capabilities to manage mega-scale Kubernetes clusters, orchestrate diverse workloads, and achieve breakthrough efficiency optimizations.
Is your platform ready for the scale of rapidly evolving models and agents? In this session, we’ll explore strategies for scaling your cloud-native AI platform, empowering teams to leverage an increasing variety of AI models and agent frameworks. We’ll dive into tools and practices for maintaining control and cost efficiency while enabling AI engineering teams to quickly iterate on Google Kubernetes Engine (GKE). We’ll explore how NVIDIA NIM microservices deliver optimized inference with minimal tuning.
This Session is hosted by a Google Cloud Next Sponsor.
Visit your registration profile at g.co/cloudnext to opt out of sharing your contact information with the sponsor hosting this session.
Join Woosuk Kwon, Founder of vLLM, Robert Shaw, Director of Engineering at Red Hat, and Brittany Rockwell, Product Manager for vLLM on TPU, to learn how vLLM is helping Google Cloud customers serve state-of-the-art models with high performance and ease of use across TPUs and GPUs.
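For a sense of what that ease of use looks like, here is a minimal vLLM sketch for offline batch generation; the model name is a placeholder, and whether it runs on TPUs or GPUs depends on the backend you install.

    from vllm import LLM, SamplingParams

    # Placeholder model; use any model vLLM supports on your accelerator.
    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
    params = SamplingParams(temperature=0.7, max_tokens=128)

    outputs = llm.generate(["Summarize what vLLM does in one sentence."], params)
    for out in outputs:
        print(out.outputs[0].text)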
Discover the power of Google Cloud instances running on the latest Intel Xeon processors. This course will introduce you to Intel’s optimization tools, designed to help you manage and optimize your infrastructure with unmatched efficiency and performance. Learn how to leverage these cutting-edge technologies to enhance your cloud computing capabilities and drive your business forward.
Learn how LG AI Research uses Google Cloud AI Hypercomputer to build their EXAONE family of LLMs and innovative agentic AI experiences based on those models. EXAONE 3.5, a class of bilingual models that can learn and understand both Korean and English, recorded world-class performance in Korean. The collaboration between LG AI Research and Google Cloud enabled LG to significantly enhance model performance, reduce inference time, and improve resource efficiency through Google Cloud's easy-to-use, scalable infrastructure.
Unlock the power of your application logs with Google Cloud Logging. This hands-on lab gives you practical experience using Cloud Logging to gain deep insights into your applications, particularly on Google Kubernetes Engine. Learn to build effective queries and proactively address potential issues.
Struggling to monitor the performance and health of your large language model (LLM) deployments on Google Kubernetes Engine (GKE)? This session unveils how the Google Cloud Observability suite provides a comprehensive solution for monitoring leading AI model servers like Ray, NVIDIA Triton, vLLM, TGI, and others. Learn how our one-click setup automatically configures dashboards, alerts, and critical metrics – including GPU and TPU utilization, latency, throughput, and error analysis – to enable faster troubleshooting and optimized performance. Discover how to gain complete visibility into your LLM infrastructure.
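To show what reading these metrics programmatically can look like, here is a minimal sketch using the Cloud Monitoring Python client; the project ID is a placeholder and the metric type is an assumption you should verify in Metrics Explorer for your model server integration.

    import time
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    project_name = "projects/my-project"  # placeholder project

    now = time.time()
    interval = monitoring_v3.TimeInterval(
        {
            "end_time": {"seconds": int(now)},
            "start_time": {"seconds": int(now - 3600)},  # last hour
        }
    )

    # Assumed metric type for accelerator utilization; confirm the exact name
    # emitted for your GPU/TPU workloads before relying on it.
    results = client.list_time_series(
        request={
            "name": project_name,
            "filter": 'metric.type="kubernetes.io/container/accelerator/duty_cycle"',
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )

    for series in results:
        print(series.resource.labels.get("pod_name"), len(series.points), "points")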
Looking to rapidly migrate your VMware estate, on-premises or elsewhere, to a cloud-integrated, Google-operated VMware platform that delivers 99.99% (four nines) cluster-level availability, flexible node configurations, and deeply integrated networking, along with high-bandwidth, low-latency access to other Google Cloud services? Want all this with minimal disruption to your existing tools or your team’s skill sets? Join this session for a primer on Google Cloud VMware Engine and its latest capabilities, learn from a leading customer about their transformation journey, and get started today.
Experience a live demo of Google Cloud’s approach to mainframe modernization using generative AI. This demo will showcase the modernization life cycle, from initial assessment to code rewrite to risk mitigation. We’ll illustrate how our agentic approach streamlines the modernization process, reducing time, budget, and resource requirements. And we’ll demonstrate how to minimize the risk of modernizing business-critical applications through testing and by enabling parallel execution of both original and modernized applications with Dual Run.