AI Hypercomputer is a revolutionary system designed to make implementing AI at scale easier and more efficient. In this session, we’ll explore the key benefits of AI Hypercomputer and how it simplifies complex AI infrastructure environments. Then, learn firsthand from industry leaders Shopify, Technology Innovation Institute, Moloco, and LG AI Research how they leverage Google Cloud’s AI solutions to drive innovation and transform their businesses.
Migrating from AWS or Azure to Google Cloud runtimes can feel like navigating a maze of complex services and dependencies. In this session, we’ll explore key considerations for migrating legacy applications, emphasizing the “why not modernize?” approach with a practical guide. We’ll share real-world examples of successful transformations. And we’ll go beyond theory with a live product demo that showcases migration tools, and a code assessment demo powered by Gemini that demonstrates how you can understand and modernize legacy code.
Learn how to leverage Google Cloud’s innovations to build the most secure and compliant solutions with Confidential Computing, offloads, and Assured Workload capabilities.
Debug Google Kubernetes Engine (GKE) apps like a pro! This hands-on lab covers using Cloud Logging & Monitoring to detect, diagnose, and resolve issues in a microservices application deployed on GKE. Learn practical troubleshooting workflows.
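To give a flavor of the troubleshooting workflow, here is a minimal sketch, assuming a hypothetical project ID and cluster name, that uses the google-cloud-logging Python client to pull recent error entries from GKE containers; the queries you build in the lab may look different.

    from google.cloud import logging

    # Placeholder project ID; substitute your own.
    client = logging.Client(project="my-project")

    # Filter for ERROR-and-above entries from containers in a hypothetical GKE cluster.
    log_filter = (
        'resource.type="k8s_container" '
        'resource.labels.cluster_name="demo-cluster" '
        'severity>=ERROR'
    )

    for entry in client.list_entries(filter_=log_filter, max_results=20):
        print(entry.timestamp, entry.severity, entry.payload)

Running a filter like this from a notebook or script is a quick way to confirm what the Logs Explorer UI is showing before you start drilling into a specific microservice.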
If you register for a Learning Center lab, please ensure that you sign up for a Google Cloud Skills Boost account for both your work domain and personal email address. You will need to authenticate your account as well (be sure to check your spam folder!). This will ensure you can arrive and access your labs quickly onsite. You can follow this link to sign up!
Grappling with scaling your AI and machine learning (ML) platforms to meet demand and ensuring rapid recovery from failures? This session dives into strategies for optimizing end-to-end startup latency for AI and ML workloads on Google Kubernetes Engine (GKE). We’ll explore how image and pod preloading techniques can significantly reduce startup times, enabling faster scaling and improved reliability. Real-world examples will show how this has led to dramatic improvements in application performance, including a 95% reduction in pod startup time and a 1.2x–2x speedup.
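One widely used preloading technique, not necessarily the exact GKE feature covered in the session, is an image pre-pull DaemonSet that pulls a large serving image onto every node before workload pods need it. A minimal sketch with the official Kubernetes Python client, using a hypothetical image path:

    from kubernetes import client, config

    config.load_kube_config()  # or config.load_incluster_config() when running in-cluster

    # The init container pulls the heavy image and exits immediately;
    # the pause container just keeps the DaemonSet pods alive.
    prepull = client.V1DaemonSet(
        metadata=client.V1ObjectMeta(name="image-prepuller"),
        spec=client.V1DaemonSetSpec(
            selector=client.V1LabelSelector(match_labels={"app": "image-prepuller"}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": "image-prepuller"}),
                spec=client.V1PodSpec(
                    init_containers=[
                        client.V1Container(
                            name="prepull-model-server",
                            # Hypothetical image path; use your own serving image.
                            image="us-docker.pkg.dev/my-project/serving/model-server:latest",
                            command=["sh", "-c", "true"],
                        )
                    ],
                    containers=[
                        client.V1Container(name="pause", image="registry.k8s.io/pause:3.9")
                    ],
                ),
            ),
        ),
    )

    client.AppsV1Api().create_namespaced_daemon_set(namespace="default", body=prepull)

Once every node has the image cached locally, newly scheduled pods skip the pull step, which is where much of the startup-time saving comes from.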
This session provides an in-depth look at the Google infrastructure that powers our most demanding AI workloads. We’ll explore the journey from custom silicon, high-bandwidth networking, and storage to the software frameworks that enable efficient, large-scale training and inference with industry-leading goodput and uptime across the largest GPU and TPU clusters. Learn how Google’s unique approach to system design and deployment enables customers to effortlessly achieve Google-level performance and scale for their own applications.
Get the most out of your Google Cloud budget. This session covers cost-optimization strategies for Compute Engine and beyond, including Cloud Run, Vertex AI, and Autopilot in Google Kubernetes Engine. Learn how to effectively manage your capacity reservations and leverage consumption models like Spot VMs, Dynamic Workload Scheduler, and committed use discounts (CUDs) to achieve the optimum levels of capacity availability for your workloads while optimizing your cost.
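To make the Spot consumption model concrete, here is a minimal sketch, with placeholder project, zone, and machine type, that requests a Spot VM with the google-cloud-compute Python client; remember that Spot capacity can be preempted at any time.

    from google.cloud import compute_v1

    project, zone = "my-project", "us-central1-a"  # placeholders

    instance = compute_v1.Instance(
        name="spot-worker-1",
        machine_type=f"zones/{zone}/machineTypes/e2-standard-4",
        # The SPOT provisioning model trades preemptibility for a steep discount.
        scheduling=compute_v1.Scheduling(
            provisioning_model="SPOT",
            instance_termination_action="STOP",
        ),
        disks=[
            compute_v1.AttachedDisk(
                boot=True,
                auto_delete=True,
                initialize_params=compute_v1.AttachedDiskInitializeParams(
                    source_image="projects/debian-cloud/global/images/family/debian-12",
                ),
            )
        ],
        network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
    )

    operation = compute_v1.InstancesClient().insert(
        project=project, zone=zone, instance_resource=instance
    )
    operation.result()  # block until the insert completes

Pairing Spot VMs like this with reservations or CUDs for your steady-state baseline is the usual way to balance capacity availability against cost.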
Unlock the full potential of your mission-critical workloads on Google Cloud. Discover how our platform is purpose-built for Microsoft, Oracle, OpenShift, and more, enabling you to optimize total cost of ownership (TCO) and accelerate modernization. Learn firsthand from customers who have successfully transformed their businesses by bringing their workloads to Google Cloud.
Unlock the full potential of Compute Engine for all your applications. This session delivers actionable strategies and best practices to optimize cost, reliability, and management for cloud-first, AI, machine learning, high performance computing, enterprise, and stateful workloads. We’ll share recently released features within Compute Engine to maximize return on investment for each specific application type.
Tensor Processing Units (TPUs) are hardware accelerators designed by Google specifically for large-scale AI/ML computations. Google's new Trillium TPUs are our most performant and energy-efficient TPUs to date, and offer unprecedented levels of scalability. Ray is a unified framework for orchestrating AI/ML workloads on large compute clusters. Ray offers Python-native APIs for training, inference, tuning, reinforcement learning, and more. In this lightning talk, we will demonstrate how you can use Ray to manage workloads on TPUs with an easy-to-use API. We will cover: 1) training your models with MaxText, 2) tuning models with Hugging Face, and 3) serving models with vLLM. Attendees will gain an understanding of how to build a complete, end-to-end AI/ML infrastructure with Ray and TPUs.
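As a taste of the API, here is a minimal Ray sketch that reserves TPU chips for a task through Ray's resource mechanism; the "TPU" resource label and chip count are assumptions about how your TPU nodes advertise themselves, and the MaxText, Hugging Face, and vLLM integrations shown in the talk go well beyond this stub.

    import ray

    ray.init()  # in practice, connect to an existing Ray cluster running on TPU VMs

    @ray.remote(resources={"TPU": 4})  # assumed resource label exposed by TPU nodes
    def run_inference(prompts):
        # Here you would load a JAX or vLLM model onto the TPU slice and run it;
        # this stub just echoes the prompts.
        return [f"echo: {p}" for p in prompts]

    print(ray.get(run_inference.remote(["hello", "tpu"])))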
Build & deploy with Google Cloud Deploy! This hands-on lab equips you to create delivery pipelines, deploy container images to Artifact Registry, and promote applications across GKE environments.
In this session, we’ll explore Google’s latest developments in Google Kubernetes Engine (GKE) that enable unprecedented scale and performance for AI workloads. We’ll dive into how Anthropic leverages these capabilities to manage mega-scale Kubernetes clusters, orchestrate diverse workloads, and achieve breakthrough efficiency optimizations.
Is your platform ready for the scale of rapidly evolving models and agents? In this session, we’ll explore strategies for scaling your cloud-native AI platform, empowering teams to leverage an increasing variety of AI models and agent frameworks. We’ll dive into tools and practices for maintaining control and cost efficiency while enabling AI engineering teams to quickly iterate on Google Kubernetes Engine (GKE). We’ll explore how NVIDIA NIM microservices deliver optimized inference with minimal tuning.
This Session is hosted by a Google Cloud Next Sponsor.
Visit your registration profile at g.co/cloudnext to opt out of sharing your contact information with the sponsor hosting this session.
Join Woosuk Kwon, Founder of vLLM, Robert Shaw, Director of Engineering at Red Hat, and Brittany Rockwell, Product Manager for vLLM on TPU, to learn how vLLM is helping Google Cloud customers serve state-of-the-art models with high performance and ease of use across TPUs and GPUs.
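For a sense of what that ease of use looks like, here is a minimal vLLM sketch for offline batch generation; the model name is a placeholder, and whether it runs on TPUs or GPUs depends on the backend you install.

    from vllm import LLM, SamplingParams

    # Placeholder model; use any model vLLM supports on your accelerator.
    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
    params = SamplingParams(temperature=0.7, max_tokens=128)

    outputs = llm.generate(["Summarize what vLLM does in one sentence."], params)
    for out in outputs:
        print(out.outputs[0].text)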
Discover the power of Google Cloud instances running on the latest Intel Xeon processors. This course will introduce you to Intel’s optimization tools, designed to help you manage and optimize your infrastructure with unmatched efficiency and performance. Learn how to leverage these cutting-edge technologies to enhance your cloud computing capabilities and drive your business forward.
Learn how LG AI Research uses Google Cloud AI Hypercomputer to build their EXAONE family of LLMs and innovative agentic AI experiences based on those models. EXAONE 3.5, a class of bilingual models that can learn and understand both Korean and English, recorded world-class performance in Korean. The collaboration between LG AI Research and Google Cloud enabled LG to significantly enhance model performance, reduce inference time, and improve resource efficiency through Google Cloud's easy-to-use, scalable infrastructure.
Unlock the power of your application logs with Google Cloud Logging. This hands-on lab gives you practical experience using Cloud Logging to gain deep insights into your applications, particularly on Google Kubernetes Engine. Learn to build effective queries and proactively address potential issues.
Struggling to monitor the performance and health of your large language model (LLM) deployments on Google Kubernetes Engine (GKE)? This session unveils how the Google Cloud Observability suite provides a comprehensive solution for monitoring leading AI model servers like Ray, NVIDIA Triton, vLLM, TGI, and others. Learn how our one-click setup automatically configures dashboards, alerts, and critical metrics – including GPU and TPU utilization, latency, throughput, and error analysis – to enable faster troubleshooting and optimized performance. Discover how to gain complete visibility into your LLM infrastructure.
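To show what reading these metrics programmatically can look like, here is a minimal sketch using the Cloud Monitoring Python client; the project ID is a placeholder and the metric type is an assumption you should verify in Metrics Explorer for your model server integration.

    import time
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    project_name = "projects/my-project"  # placeholder project

    now = time.time()
    interval = monitoring_v3.TimeInterval(
        {
            "end_time": {"seconds": int(now)},
            "start_time": {"seconds": int(now - 3600)},  # last hour
        }
    )

    # Assumed metric type for accelerator utilization; confirm the exact name
    # emitted for your GPU/TPU workloads before relying on it.
    results = client.list_time_series(
        request={
            "name": project_name,
            "filter": 'metric.type="kubernetes.io/container/accelerator/duty_cycle"',
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )

    for series in results:
        print(series.resource.labels.get("pod_name"), len(series.points), "points")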
Looking to rapidly migrate your VMware estate, on-premises or elsewhere, to a cloud-integrated, Google-operated VMware platform that delivers 99.99% (four nines) cluster-level availability, flexible node configurations, and deeply integrated networking, along with high-bandwidth, low-latency access to other Google Cloud services? Want all this with minimal disruption to your existing tools or your team’s skill sets? Join this session for a primer on Google Cloud VMware Engine and its latest capabilities, learn from a leading customer about their transformation journey, and get started today.
Experience a live demo of Google Cloud’s approach to mainframe modernization using generative AI. This demo will showcase the modernization life cycle, from initial assessment to code rewrite to risk mitigation. We’ll illustrate how our agentic approach streamlines the modernization process, reducing time, budget, and resource requirements. And we’ll demonstrate how to minimize the risk of modernizing business-critical applications through testing and by enabling parallel execution of both original and modernized applications with Dual Run.