Time to make generative AI a reality for your application. This session is all about building high-performance gen AI applications fast with Cloud SQL for MySQL and PostgreSQL. Learn about Google Cloud’s full-stack solutions that make gen AI app development, deployment, and operations simple – even when you’re shipping high-performance, production-grade applications. We’ll highlight best practices for getting started with Vertex AI, Cloud Run, Google Kubernetes Engine, and Cloud SQL, so you can focus on gen AI application development from the get-go.
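For readers who want a concrete starting point, here is a minimal sketch of the pattern this session describes: generate an embedding with the Vertex AI SDK, then run a pgvector similarity search against Cloud SQL for PostgreSQL over the Cloud SQL Python Connector. The project, instance connection name, credentials, table schema, and embedding model ID are illustrative placeholders, not anything prescribed by the session.

```python
# Sketch: Vertex AI embeddings + pgvector search on Cloud SQL for PostgreSQL.
# Instance name, credentials, table schema, and model ID below are illustrative.
import vertexai
from vertexai.language_models import TextEmbeddingModel
from google.cloud.sql.connector import Connector

vertexai.init(project="my-project", location="us-central1")
model = TextEmbeddingModel.from_pretrained("text-embedding-005")  # assumed model ID

def embed(text: str) -> list[float]:
    # One embedding per input string; .values is the raw float vector.
    return model.get_embeddings([text])[0].values

connector = Connector()
conn = connector.connect(
    "my-project:us-central1:my-instance",  # assumed instance connection name
    "pg8000",
    user="app-user",
    password="change-me",
    db="appdb",
)

query_vec = embed("How do I reset my password?")
vec_literal = "[" + ",".join(str(v) for v in query_vec) + "]"

cur = conn.cursor()
# Assumes a `docs(content text, embedding vector(768))` table created with pgvector.
cur.execute(
    "SELECT content FROM docs ORDER BY embedding <-> %s::vector LIMIT 5",
    (vec_literal,),
)
for (content,) in cur.fetchall():
    print(content)
conn.close()
connector.close()
```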
Topic: Kubernetes, 560 tagged sessions (talk-data.com)
There are cases where you can’t use Google Cloud services but still want all the benefits of AlloyDB’s AI integration, serving a local model directly to the database. In such cases, AlloyDB Omni deployed in a Kubernetes cluster can be a great solution, covering these edge cases and keeping all communication between the database and the AI model local.
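As a rough illustration of the “keep everything local” pattern, the sketch below uses the Kubernetes Python client to run a model server as a ClusterIP-only Deployment and Service in the same namespace as AlloyDB Omni, so database-to-model traffic never leaves the cluster. The namespace, image, and port are assumed placeholders; deploying AlloyDB Omni itself (typically via its operator) is out of scope here.

```python
# Sketch: run a local embedding-model server next to AlloyDB Omni so database-to-model
# traffic stays inside the cluster. Namespace, image, and port are illustrative.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster
apps, core = client.AppsV1Api(), client.CoreV1Api()

labels = {"app": "local-embedding-model"}
container = client.V1Container(
    name="model-server",
    image="us-docker.pkg.dev/my-project/models/embedding-server:latest",  # assumed image
    ports=[client.V1ContainerPort(container_port=8080)],
)
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="local-embedding-model", namespace="alloydb-omni"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="local-embedding-model", namespace="alloydb-omni"),
    spec=client.V1ServiceSpec(
        selector=labels,
        ports=[client.V1ServicePort(port=80, target_port=8080)],
        type="ClusterIP",  # no external exposure; reachable only from inside the cluster
    ),
)
apps.create_namespaced_deployment(namespace="alloydb-omni", body=deployment)
core.create_namespaced_service(namespace="alloydb-omni", body=service)
```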
Maximize your Gen AI inference performance on GKE. This session dives into the latest Kubernetes and GKE advancements, revealing how to achieve significant cost savings, reduced latency, and increased throughput. Discover new inference features on GKE for optimizing load balancing, scaling, accelerator selection, and overall usability. Plus, hear directly from Snap Inc. about their journey re-architecting their inference platform for the demands of Gen AI.
Get the most out of your Google Cloud budget. This session covers cost-optimization strategies for Compute Engine and beyond, including Cloud Run, Vertex AI, and Autopilot in Google Kubernetes Engine. Learn how to effectively manage your capacity reservations and leverage consumption models like Spot VMs, Dynamic Workload Scheduler, and committed use discounts (CUDs) to achieve the optimum levels of capacity availability for your workloads while optimizing your cost.
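To make the trade-off concrete, here is a back-of-the-envelope comparison of consumption models for a steady baseline plus a bursty peak. All prices and discount rates below are made-up placeholders for illustration only; check the Google Cloud pricing pages or calculator for real figures.

```python
# Illustrative cost comparison: a steady 10-vCPU baseline plus a 20-vCPU burst.
# Every number here is an assumed placeholder, not a real Google Cloud price.
ON_DEMAND = 0.040            # $/vCPU-hour (assumed)
CUD_3YR   = ON_DEMAND * 0.45 # committed use discount, assumed ~55% off
SPOT      = ON_DEMAND * 0.25 # Spot VMs, assumed ~75% off (preemptible capacity)

HOURS = 730                  # roughly one month
baseline_vcpu, burst_vcpu, burst_hours = 10, 20, 100

all_on_demand = ON_DEMAND * (baseline_vcpu * HOURS + burst_vcpu * burst_hours)
blended = CUD_3YR * baseline_vcpu * HOURS + SPOT * burst_vcpu * burst_hours

print(f"All on-demand:              ${all_on_demand:,.0f}/mo")
print(f"CUD baseline + Spot bursts: ${blended:,.0f}/mo")
```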
eBPF has revolutionized Kubernetes networking. Cilium, the leading eBPF-based container networking interface (CNI), is now emerging as a standard on major cloud providers like Google Kubernetes Engine (GKE). It provides superior scalability, security, and observability compared to traditional CNIs. eBPF also powers Hubble for network & security insights and Tetragon for runtime security enforcement. Find out how to leverage these tools to get the most out of your GKE cluster.
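As a small taste of what Cilium adds, the sketch below applies a CiliumNetworkPolicy with the Kubernetes Python client, restricting ingress to an api workload to traffic from frontend pods on one port. The labels, namespace, and port are illustrative, and it assumes the cluster’s dataplane exposes the Cilium CRDs (for example, GKE Dataplane V2 or a self-managed Cilium install).

```python
# Sketch: allow only pods labeled app=frontend to reach pods labeled app=api on 8080.
# Labels, namespace, and port are placeholders; assumes Cilium CRDs are available.
from kubernetes import client, config

config.load_kube_config()
policy = {
    "apiVersion": "cilium.io/v2",
    "kind": "CiliumNetworkPolicy",
    "metadata": {"name": "allow-frontend-to-api", "namespace": "default"},
    "spec": {
        "endpointSelector": {"matchLabels": {"app": "api"}},
        "ingress": [
            {
                "fromEndpoints": [{"matchLabels": {"app": "frontend"}}],
                "toPorts": [{"ports": [{"port": "8080", "protocol": "TCP"}]}],
            }
        ],
    },
}
client.CustomObjectsApi().create_namespaced_custom_object(
    group="cilium.io",
    version="v2",
    namespace="default",
    plural="ciliumnetworkpolicies",
    body=policy,
)
```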
Deploy and scale containerized AI models with NVIDIA NIMs on Google Kubernetes Engine (GKE). In this interactive session, you’ll gain hands-on experience deploying pre-built NIMs, managing deployments with kubectl, and autoscaling inference workloads. Ideal for startup developers, technical founders, and tech leads.
**Please bring your laptop to get the most out of this hands-on session**
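For a flavor of the autoscaling step, here is a sketch that attaches a HorizontalPodAutoscaler to a hypothetical NIM Deployment using the Kubernetes Python client. The Deployment name, namespace, and replica bounds are placeholders, and CPU utilization is used only to keep the example short; production inference autoscaling more commonly keys off custom metrics such as queue depth or accelerator utilization.

```python
# Sketch: autoscale a hypothetical "llm-nim" inference Deployment with an HPA.
# Names, namespace, and the CPU target are placeholders for illustration.
from kubernetes import client, config

config.load_kube_config()
hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="llm-nim", namespace="inference"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="llm-nim"
        ),
        min_replicas=1,
        max_replicas=8,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(type="Utilization", average_utilization=60),
                ),
            )
        ],
    ),
)
client.AutoscalingV2Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="inference", body=hpa
)
```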
This session provides a look into how Abridge built a secure and scalable inferencing platform on Google Kubernetes Engine (GKE). We’ll demonstrate how they leverage GKE fleets, Teams, Argo CD, and multi-cluster orchestration to manage and deploy inferencing workloads that span multiple clusters. View a live demo of a complete solution featuring a custom Argo CD plugin that simplifies cluster management and streamlines deployments for platform admins and application teams.
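One way to picture the multi-cluster piece is an Argo CD ApplicationSet with a cluster generator, which stamps out one Application per cluster registered with Argo CD. The sketch below is illustrative only: the repository URL, path, and namespaces are placeholders, and it is not Abridge’s actual configuration or custom plugin.

```python
# Sketch: fan an inference workload out to every cluster registered with Argo CD.
# Repo URL, path, and namespaces are placeholders.
from kubernetes import client, config

config.load_kube_config()
appset = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "ApplicationSet",
    "metadata": {"name": "inference-fleet", "namespace": "argocd"},
    "spec": {
        # Cluster generator: one Application per cluster known to Argo CD.
        "generators": [{"clusters": {}}],
        "template": {
            "metadata": {"name": "{{name}}-inference"},
            "spec": {
                "project": "default",
                "source": {
                    "repoURL": "https://example.com/platform/inference-manifests.git",
                    "targetRevision": "main",
                    "path": "base",
                },
                "destination": {"server": "{{server}}", "namespace": "inference"},
                "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
            },
        },
    },
}
client.CustomObjectsApi().create_namespaced_custom_object(
    group="argoproj.io",
    version="v1alpha1",
    namespace="argocd",
    plural="applicationsets",
    body=appset,
)
```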
In this session, we’ll explore Google’s latest developments in Google Kubernetes Engine (GKE) that enable unprecedented scale and performance for AI workloads. We’ll dive into how Anthropic leverages these capabilities to manage mega-scale Kubernetes clusters, orchestrate diverse workloads, and achieve breakthrough efficiency optimizations.
Is your platform ready for the scale of rapidly evolving models and agents? In this session, we’ll explore strategies for scaling your cloud native AI platform - empowering teams to leverage an increasing variety of AI models and agent frameworks. We’ll dive into tools and practices for maintaining control and cost efficiency while enabling AI engineering teams to quickly iterate on Google Kubernetes Engine (GKE). We’ll explore how NVIDIA NIM microservices deliver optimized inference with minimal tuning.
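Assuming the NIM container exposes an OpenAI-compatible chat completions endpoint, calling it from inside the cluster can look like the sketch below; the Service name, port, and model ID are placeholders.

```python
# Sketch: call a NIM microservice behind a ClusterIP Service named "llm-nim".
# Assumes an OpenAI-compatible /v1/chat/completions endpoint; names are placeholders.
import requests

resp = requests.post(
    "http://llm-nim.inference.svc.cluster.local:8000/v1/chat/completions",
    json={
        "model": "meta/llama-3.1-8b-instruct",  # assumed model ID served by the NIM
        "messages": [{"role": "user", "content": "Summarize our return policy."}],
        "max_tokens": 256,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```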
This session is hosted by a Google Cloud Next sponsor.
Unlock the power of your application logs with Google Cloud Logging. This lab provides hands-on experience using Cloud Logging to gain deep insights into your applications, particularly on Google Kubernetes Engine. Learn to build effective queries and proactively address potential issues.
If you register for a Learning Center lab, please ensure that you sign up for a Google Cloud Skills Boost account with both your work and personal email addresses. You will also need to authenticate your account (be sure to check your spam folder!). This will ensure you can access your labs quickly when you arrive onsite. You can follow this link to sign up!
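A minimal sketch of the kind of query you might build in this lab: pulling recent error-severity entries from GKE containers with the Cloud Logging client library. The project and cluster names are placeholders.

```python
# Sketch: list the latest error logs from GKE containers in one cluster.
# Project and cluster names are placeholders.
from datetime import datetime, timedelta, timezone
from google.cloud import logging

client = logging.Client(project="my-project")
cutoff = (datetime.now(timezone.utc) - timedelta(hours=1)).isoformat()
log_filter = (
    'resource.type="k8s_container" '
    'AND resource.labels.cluster_name="my-cluster" '
    "AND severity>=ERROR "
    f'AND timestamp>="{cutoff}"'
)
for entry in client.list_entries(
    filter_=log_filter, order_by=logging.DESCENDING, max_results=20
):
    print(entry.timestamp, entry.resource.labels.get("pod_name"), entry.payload)
```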
Struggling to monitor the performance and health of your large language model (LLM) deployments on Google Kubernetes Engine (GKE)? This session unveils how the Google Cloud Observability suite provides a comprehensive solution for monitoring leading AI model servers like Ray, NVIDIA Triton, vLLM, TGI, and others. Learn how our one-click setup automatically configures dashboards, alerts, and critical metrics – including GPU and TPU utilization, latency, throughput, and error analysis – to enable faster troubleshooting and optimized performance. Discover how to gain complete visibility into your LLM infrastructure.
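To give a sense of the underlying metrics, the sketch below reads recent GPU duty-cycle samples for a cluster from Cloud Monitoring. Treat the exact metric type shown as an assumption about GKE’s GPU utilization metric, and the project and cluster names as placeholders; the one-click setup described above builds dashboards over metrics like these for you.

```python
# Sketch: read per-container GPU duty-cycle samples from Cloud Monitoring.
# Metric type, project, and cluster name are assumptions/placeholders.
import time
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 3600}}
)
results = client.list_time_series(
    request={
        "name": "projects/my-project",
        "filter": (
            'metric.type="kubernetes.io/container/accelerator/duty_cycle" '
            'AND resource.labels.cluster_name="my-cluster"'
        ),
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for series in results:
    pod = series.resource.labels.get("pod_name", "?")
    # duty_cycle is assumed here to be an integer percentage (0-100).
    latest = series.points[0].value.int64_value if series.points else None
    print(f"{pod}: duty cycle {latest}%")
```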
Facing challenges with the cost and performance of your AI inference workloads? This talk presents TPUs and Google Kubernetes Engine (GKE) as a solution for achieving both high throughput and low latency while optimizing costs with open source models and libraries. Learn how to leverage TPUs to scale massive inference workloads efficiently.
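As a rough sketch of what requesting TPUs on GKE looks like, the Pod below asks for a TPU slice via node selectors and the google.com/tpu resource. The accelerator type, topology, image, and chip count are illustrative assumptions; consult the GKE TPU documentation for the values that match your node pools.

```python
# Sketch: request TPU accelerators for an inference Pod on a GKE TPU node pool.
# Accelerator type, topology, image, and chip count are assumed placeholders.
from kubernetes import client, config

config.load_kube_config()
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="tpu-inference", namespace="inference"),
    spec=client.V1PodSpec(
        node_selector={
            "cloud.google.com/gke-tpu-accelerator": "tpu-v5-lite-podslice",  # assumed type
            "cloud.google.com/gke-tpu-topology": "2x4",                      # assumed topology
        },
        containers=[
            client.V1Container(
                name="model-server",
                image="us-docker.pkg.dev/my-project/inference/vllm-tpu:latest",  # assumed image
                resources=client.V1ResourceRequirements(
                    requests={"google.com/tpu": "8"},
                    limits={"google.com/tpu": "8"},
                ),
            )
        ],
        restart_policy="Never",
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="inference", body=pod)
```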
Stop struggling to unlock the transformative power of AI. This session flips the script, revealing how your existing Kubernetes expertise is your greatest advantage. We'll demonstrate how Google Kubernetes Engine (GKE) provides the foundation for building scalable, custom AI platforms - empowering you to take control of your AI strategy. Forget starting from scratch; leverage existing skills to architect and deploy AI solutions for your unique needs. Discover how industry leaders like Spotify are harnessing GKE to fuel responsible innovation, and gain the insights to transform your Kubernetes knowledge into your ultimate AI superpower.
Companies in the fiercely competitive gaming landscape face constant pressure to create engaging and ever-evolving player experiences. Generative AI can help game developers craft more dynamic, personalized gameplay while reducing time to market. Major game studios are leveraging Google Cloud’s cutting-edge AI capabilities to create immersive player experiences, personalized chatbots, dynamic character interactions, and user-generated content. We’ll show you how you can use Google Kubernetes Engine to easily integrate gen AI with game servers.
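As a hypothetical example of wiring gen AI into gameplay, the sketch below has a game-server component call Gemini on Vertex AI to generate short NPC dialogue. The project, model ID, and prompt framing are placeholders, not anything a specific studio uses.

```python
# Sketch: generate NPC dialogue from a game server with Gemini on Vertex AI.
# Project, location, model ID, and prompt framing are illustrative placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-1.5-flash")  # assumed model ID

def npc_reply(player_line: str, npc_persona: str) -> str:
    # Keep the prompt short so latency stays compatible with in-game interaction.
    prompt = (
        f"You are {npc_persona}. Stay in character and answer in one or two sentences.\n"
        f"Player says: {player_line}"
    )
    return model.generate_content(prompt).text

print(npc_reply("Where can I find the blacksmith?", "a gruff village guard"))
```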
Build modern applications with the power of Oracle Database 23ai and Google Cloud’s Vertex AI and Gemini foundation models. Learn key strategies to integrate Google Cloud’s native development tools and services, including Kubernetes, Cloud Run, and BigQuery, seamlessly with Oracle Database 23ai and Autonomous Database in modern application architectures. Cloud architects, developers, and database administrators will gain actionable insights, best practices, and real-world examples to enhance performance and accelerate innovation with ODB@GC.
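A small, hypothetical sketch of the integration pattern: embed a query with Vertex AI, then run a vector similarity search in Oracle Database 23ai with python-oracledb. The table, connection details, and model ID are placeholders, and the TO_VECTOR/VECTOR_DISTANCE usage assumes a 23ai schema with a VECTOR column.

```python
# Sketch: Vertex AI embedding + Oracle Database 23ai vector search.
# Connection details, table/columns, and model ID are assumed placeholders.
import oracledb
import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-project", location="us-central1")
model = TextEmbeddingModel.from_pretrained("text-embedding-005")  # assumed model ID
vec = model.get_embeddings(["customer churn drivers"])[0].values
vec_literal = "[" + ",".join(str(v) for v in vec) + "]"

conn = oracledb.connect(user="app", password="change-me", dsn="db23ai_high")  # assumed DSN
cur = conn.cursor()
# Assumes a DOCS table with a VECTOR column named EMBEDDING.
cur.execute(
    """
    SELECT content
      FROM docs
     ORDER BY VECTOR_DISTANCE(embedding, TO_VECTOR(:qv), COSINE)
     FETCH FIRST 5 ROWS ONLY
    """,
    qv=vec_literal,
)
for (content,) in cur:
    print(content)
conn.close()
```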
This session is hosted by a Google Cloud Next sponsor.
Demo of Dapr AI Agents in a Kubernetes (AKS) environment.
A 20-minute demo followed by 10 minutes of questions on Dapr AI Agents in a Kubernetes (AKS) environment, focusing on the code and forgetting that the infrastructure exists.
Discover how Renault transformed automotive software development (SDV) with Google Cloud. By replacing physical prototypes with Android-based virtualization, they accelerated their SDV life cycle and moved to a cloud-first, iterative approach. Learn how they leverage Cloud Workstations, Gemini Code Assist, and a continuous integration and continuous testing (CI/CT) pipeline powered by Google Kubernetes Engine and GitLab to boost developer productivity and bring new features to market faster.
Managing massive deployments of accelerators for AI and high performance computing (HPC) workloads can be complex. This talk dives into running AI-optimized Google Kubernetes Engine (GKE) clusters that streamline infrastructure provisioning, workload orchestration, and ongoing operations for tens of thousands of accelerators. Learn how topology-aware scheduling, maintenance controls, and advanced networking capabilities enable ultralow latency and maximum performance by default for demanding workloads like AI pretraining, fine-tuning, inference, and HPC.
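Topology-aware scheduling itself is largely configured at the cluster and node-pool level, but the general idea can be approximated in a workload spec. The sketch below uses preferred pod affinity on the standard topology.kubernetes.io/zone label to pull GPU workers into the same zone; the labels, image, and GPU count are placeholders, and this is a generic approximation rather than GKE’s accelerator-specific placement machinery.

```python
# Sketch: prefer co-locating GPU workers in one zone via pod affinity.
# Labels, image, and GPU count are placeholders; this spec would be embedded in a
# Deployment or Job pod template rather than used on its own.
from kubernetes import client

labels = {"app": "trainer"}
affinity = client.V1Affinity(
    pod_affinity=client.V1PodAffinity(
        preferred_during_scheduling_ignored_during_execution=[
            client.V1WeightedPodAffinityTerm(
                weight=100,
                pod_affinity_term=client.V1PodAffinityTerm(
                    label_selector=client.V1LabelSelector(match_labels=labels),
                    topology_key="topology.kubernetes.io/zone",
                ),
            )
        ]
    )
)
pod_spec = client.V1PodSpec(
    affinity=affinity,
    containers=[
        client.V1Container(
            name="trainer",
            image="us-docker.pkg.dev/my-project/training/trainer:latest",  # assumed image
            resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "8"}),
        )
    ],
)
```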
Join this session where Shopify engineers will discuss how they leverage the latest Google Kubernetes Engine (GKE) innovations to build robust, scalable platforms that not only handle everyday traffic with ease but also gracefully absorb unpredictable spikes during peak events like Black Friday and Cyber Monday. Learn key architectural patterns, smart infrastructure choices, and proven best practices. Discover how to optimize resource utilization, control costs, and deliver cost-effective performance every time.