Inference at scale with Google Cloud’s AI Hypercomputer
Learn how to run high-throughput and low-latency inference on Google Cloud to maximize price-performance on TPUs and GPUs, leveraging JetStream and vLLM.
Google Cloud Next Conference (25)
Sessions & talks
Debugging modern applications demands a new level of observability expertise. Traditional monitoring struggles with these complex distributed environments. This session goes beyond dashboards, exploring expert observability strategies tailored for the generative AI era. We’ll cover essential instrumentation with OpenTelemetry and eBPF, system health overviews using Metrics Explorer and Trace Explorer, and in-depth data analysis with SQL queries in Observability Analytics.
Join us for the Partner Summit Keynote as we chart the course for shared success in 2025 and beyond. Discover how Google Cloud is empowering partners with new programs, resources, and initiatives designed to drive customer value and unlock exponential growth.
We live in a different era, one where developers need access to larger tools, data and code need to be more secure, and, ultimately, still more work needs to get done. Learn how to use remote development on Google Cloud effectively for all your software development needs, from Day 1 to production. You’ll see Cloud Shell in action, learn how Cloud Workstations powers your AI development needs, and explore container and developer tools that make deployment simple. Power to the devs.
Want to deploy generative AI across your organization but not sure how to keep your sensitive data secure and compliant? Join this session to hear from industry practitioners and Google experts about the best practices and lessons learned when embarking on this journey. We will demo how you can use built-in controls to identify sensitive data in your organization and restrict access to it, and we will share insights, admin control recommendations, and lived customer experiences.
Incidents occur across time and topology. Customers using multiple GCP services spend hours troubleshooting. Learn about Gemini Cloud Assist Investigations, which provides the full range of troubleshooting and support, including structured workflows, signal analysis, and the ability to recommend solutions across your applications, services, and workloads, as well as their components, such as compute, networking, storage, databases, and data processing pipelines. With Gemini Cloud Assist Investigations, you can understand your environmental data patterns better, perform root cause analysis on your applications faster, and improve efficiency with warm handoffs between Gemini Cloud Assist and Google Support.
Is siloed data hindering your operations? Learn how Nuro consolidated their transactional, relational, and vector data sets on AlloyDB for PostgreSQL, and how they’re now able to do operational analysis, real-time analytics, and business intelligence (BI) reports on the same platform. Join this panel session to discover best practices for unifying vector and relational data, and learn how Nuro is now able to satisfy their self-driving car analytics use cases in a cost-effective way.
BigQuery is unifying data management, analytics, governance, and AI. Join this session to learn about the latest innovations in BigQuery to help you get actionable insights from your multimodal data and accelerate AI innovation with a secure data foundation and new-gen AI-powered experiences. Hear how Mattel utilized BigQuery to create a no-code, shareable template for data processing, analytics, and AI modeling, leveraging their existing data and streamlining the entire workflow from ETL to AI implementation within a single platform.
Simplify database management with Database Center. This generative AI-powered solution provides a centralized view of your entire database fleet, helping you monitor availability, security, compliance, and data protection. In this session, you’ll learn how to easily detect performance issues, visualize granular metrics, and get AI-powered recommendations for optimization. You’ll also discover how customers are streamlining their database operations and boosting efficiency with Database Center.
The rise of AI agents is poised to disrupt every industry, transforming the way we work and conduct business. This session will explore the transformative power of agentic workflows and provide a glimpse into what an agentic future might entail across industries. We'll discuss the strategic implications of this emerging technology and equip business leaders with the knowledge and insights needed to navigate the agentic frontier and capitalize on the opportunities it presents.
This Session is hosted by a Google Cloud Next Sponsor.
Visit your registration profile at g.co/cloudnext to opt out of sharing your contact information with the sponsor hosting this session.
Security Command Center is the must-have risk management solution for Google Cloud. In this session, you’ll learn how to use Security Command Center to proactively manage threats, posture, data, and identities, and how to respond to security events to protect your entire cloud footprint.
Worried about learning how to access the latest AI models? Explore the expanding Model Garden ecosystem, including new first-party models, open-source contributions, and third-party offerings from partners like Anthropic, Meta, Mistral AI, and the Allen Institute. Learn how you can find and deploy the perfect models for your projects as easily as 1, 2, 3.
The convergence of high performance computing (HPC) and generative AI is revolutionizing research. By analyzing vast datasets and generating novel hypotheses, researchers can accelerate innovation and breakthroughs. This session explores the exciting possibilities of gen AI in research, from climate modeling to drug discovery and beyond.
AI is reshaping software engineering. Learn how Google’s engineering team prioritizes transformative AI investments for both immediate and long-term value.
Join us for a deep dive into the latest advancements in Gemini Code Assist. This session will unveil exciting new features designed to boost productivity, enhance software quality, and streamline modernization efforts.
AI Hypercomputer is a revolutionary system designed to make implementing AI at scale easier and more efficient. In this session, we’ll explore the key benefits of AI Hypercomputer and how it simplifies complex AI infrastructure environments. Then, learn firsthand from industry leaders Shopify, Technology Innovation Institute, Moloco, and LG AI Research about how they leverage Google Cloud’s AI solutions to drive innovation and transform their businesses.
According to IDC, 67% of AI technology spend in 2025 will come from enterprises embedding AI capabilities into core business operations. Learn how AI-enabled customer engagements, from search to customer care operations and insights, are helping organizations improve customer experience and employee productivity while lowering costs. In this session, we’ll discuss specific use cases and the business results customers have achieved.
This Session is hosted by a Google Cloud Next Sponsor.
Legacy productivity solutions are not designed to enable real-time, AI-assisted teamwork and reduce security and reliability risks. Join us to discover how a secure enterprise browser, a cloud-first OS, and cloud-first productivity apps can not only make your teams more collaborative and effective but also reduce the IT workload, software costs, and security risks. You’ll learn about Google’s approach, explore customer success stories, and discuss strategies to help your organization begin the journey.
Decode the future of startup funding. Join top VC leaders for a dynamic discussion on key trends, emerging technologies, and winning investment strategies shaping the next year. Whether you’re seeking funding or planning your next move, this is a must-attend session to gain crucial insights to navigate the evolving startup landscape.
The Gemma family is growing. Discover the latest additions to this powerful collection of open models. This session explores new Gemma models, highlighting advancements in performance and capabilities for diverse AI applications. Learn about the expanding Gemma ecosystem, including tools, community resources, and model evaluation techniques to find the perfect fit for your project.
This technical deep dive explores how small IT teams can leverage Google Kubernetes Engine (GKE) and AI Hypercomputer to build, refine, and optimize a cutting-edge, scalable, and secure container platform for AI workloads.
Ten years ago, Google Kubernetes Engine (GKE) was born! Since then, it has become the industry-leading managed Kubernetes platform, powering mission-critical workloads across all industries. But the innovations have just begun. Join this session to learn about the latest GKE features and upcoming innovations – such as next-generation autoscaling, lightning-fast node startup, and multi-cluster fleet management – that make GKE the best Kubernetes platform for the next generation of AI and modern workloads.
Google brings together the scalability, reliability, and ease of use of Firestore with MongoDB compatibility. This session will showcase Firestore with MongoDB compatibility and what it can do. In addition, Mayo Clinic will present their use of Firestore for multiple workloads, including how they use Firestore’s GenAI capabilities to deliver personalized experiences in their applications.
Automation enables engineering teams to reduce duplication of effort and build consistency around various processes, especially observability. Providing out-of-the-box solutions and using infrastructure as code are some of the ways to automate your systems so all of your teams can onboard and get all the right features. In this talk, we will discuss how Datadog and Project44 have created self-service platforms so their engineers can automatically obtain observability into their systems.
This Session is hosted by a Google Cloud Next Sponsor.
Facing the challenges of scaling multimodal search for massive datasets? This session shows how leading retailers scaled their search infrastructure using Vertex AI Vector Search. Discover how multimodal hybrid search in Vector Search 2.0 can dramatically improve your search experience.