Stop struggling to unlock the transformative power of AI. This session flips the script, revealing how your existing Kubernetes expertise is your greatest advantage. We'll demonstrate how Google Kubernetes Engine (GKE) provides the foundation for building scalable, custom AI platforms - empowering you to take control of your AI strategy. Forget starting from scratch; leverage existing skills to architect and deploy AI solutions for your unique needs. Discover how industry leaders like Spotify are harnessing GKE to fuel responsible innovation, and gain the insights to transform your Kubernetes knowledge into your ultimate AI superpower.
talk-data.com
Company: Anyscale (Speakers: 6, Activities: 11)
Talks & appearances: 11 activities from Anyscale speakers
Leverage the best of Ray and Google Kubernetes Engine (GKE) to build your next-generation machine learning (ML) platform. Google and Anyscale are making Ray and Kubernetes the distributed operating system for AI/ML. Discover how, on GKE, Ray enables you to deliver a unified platform that scales your workloads from development to large-scale production. Learn about GKE's latest advancements in fast data access, intelligent scheduling, and optimized utilization of hardware accelerators, and how Ray and Anyscale RayTurbo enhance them with best-in-class performance, efficiency, and developer productivity. Leave this session equipped to build a scalable AI/ML platform that empowers your researchers and engineers.
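The fan-out/gather pattern that Ray tasks provide can be sketched with the standard library alone. The sketch below uses `concurrent.futures` threads in place of Ray workers; it is an analogy, not Ray's API. On GKE, Ray schedules the equivalent tasks across pods and hardware accelerators, and the function and shard names here are our own illustrations.

```python
# Illustrative sketch: the fan-out/gather pattern Ray generalizes across a
# cluster, shown here with stdlib threads instead of Ray workers.
from concurrent.futures import ThreadPoolExecutor

def preprocess_shard(shard):
    # Stand-in for per-shard work (tokenization, feature extraction, ...)
    return sum(shard)

shards = [[1, 2], [3, 4], [5, 6]]
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(preprocess_shard, s) for s in shards]  # fan out
    results = [f.result() for f in futures]                       # gather

# results == [3, 7, 11]
```

With Ray, the same shape appears as a `@ray.remote` function submitted per shard and gathered with `ray.get`, except the tasks can span many machines rather than one process.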
Many organizations struggle to create a well-orchestrated AI infrastructure, using separate and disconnected platforms for data processing, model training, and inference, which slows down development and increases costs. There's a clear need for a unified system that can handle all aspects of AI development and deployment, regardless of the size of data or models. Join our breakout session to see how our comprehensive solution simplifies the development and deployment of large language models in production. Learn how to streamline your AI operations by implementing an end-to-end ML lifecycle on your custom data, including automated LLM fine-tuning, LLM evaluation, LLM serving, and LoRA deployments.
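LoRA, mentioned above, is worth a quick sketch: instead of updating a large frozen weight matrix W during fine-tuning, it trains two small low-rank matrices A and B so the effective weight is W + (alpha/r)·BA. The plain-Python toy below (our own illustration; the matrices and helpers are not any library's API) shows why the standard zero initialization of B leaves the model's behavior unchanged at the start of training.

```python
# Minimal LoRA sketch: adapt a frozen weight W with low-rank matrices A and B.
def matmul(X, Y):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(X, s):
    return [[s * a for a in row] for row in X]

d, r, alpha = 4, 2, 4  # hidden size, LoRA rank, scaling factor
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base weight
A = [[0.1] * d for _ in range(r)]  # trainable down-projection (r x d)
B = [[0.0] * r for _ in range(d)]  # trainable up-projection, zero-initialized (d x r)

# Effective weight: W + (alpha / r) * B @ A.
# With B all zeros, the adapted weight equals W exactly, so fine-tuning
# starts from the base model's behavior.
W_adapted = add(W, scale(matmul(B, A), alpha / r))
```

Only A and B (2·d·r values) are trained and shipped per deployment, which is what makes serving many LoRA adapters on one base model cheap.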
As machine learning (ML) systems continue to evolve, the ability to scale complex ML workloads becomes crucial. Scalability can be considered along two dimensions: expansive training of large language models (LLMs) and intricate distribution of reinforcement learning (RL) systems. Each has its own set of challenges, from computational demands of LLMs to complex synchronization in distributed RL.
This session explores the integration of Ray, Google Kubernetes Engine (GKE), and ML accelerators like tensor processing units (TPUs) as a powerful combination for developing advanced ML systems at scale. We discuss Ray and its scalable APIs and its mature integration with GKE and ML accelerators, and demonstrate how they have been used to train LLMs and to re-implement the powerful RL algorithm MuZero.
Large language models (LLMs) have changed the way we interact with information. A base LLM is aware only of the information it was trained on. Retrieval-augmented generation (RAG) addresses this by providing context from additional data sources. In this session, we'll build a RAG-based LLM application that incorporates external data sources to augment an OSS LLM. We'll show how to scale the workload with distributed Kubernetes compute, and showcase a chatbot agent that gives factual answers.
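The retrieval step in RAG can be shown in miniature. The toy below (our own sketch; a production system would use a learned embedding model and a vector database rather than bag-of-words counts) retrieves the closest document by cosine similarity and prepends it to the prompt sent to the LLM.

```python
# Toy RAG retrieval: bag-of-words embeddings + cosine similarity.
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: token counts instead of a learned model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "GKE schedules Ray workers onto GPU node pools",
    "Sourdough bread needs a long cold fermentation",
]
query = "how does GKE schedule Ray workers"

# Retrieve the most similar document and build the augmented prompt.
best = max(docs, key=lambda d: cosine(embed(query), embed(d)))
prompt = f"Context: {best}\n\nQuestion: {query}"
```

Because the retrieved context rides along in the prompt, the base model can answer from data it never saw during training, which is the core idea the session scales up.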
Explore agents built with frameworks like LangChain and vector databases, and learn how to build apps that can perform tasks autonomously.
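The core agent loop is small: a model picks a tool, the tool runs, and the observation feeds the next step. The toy below (our own illustration; not LangChain's API, and the keyword-based "tool selection" stands in for an LLM call) shows that loop for a single step.

```python
# Toy agent step: pick a tool for the task, run it, return the observation.
def calculator(expr):
    """Toy tool: evaluate an arithmetic expression (illustration only)."""
    return str(eval(expr, {"__builtins__": {}}))

def pick_tool(task):
    # Stand-in for the LLM's tool-selection step: route anything with
    # digits to the calculator.
    return calculator if any(c.isdigit() for c in task) else None

def run_agent(task):
    tool = pick_tool(task)
    return tool(task) if tool else "no tool available"

run_agent("2 + 3 * 4")  # -> "14"
```

Real agent frameworks wrap this loop with an LLM that chooses tools from natural-language descriptions and iterates until the task is done.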
Learn how to tune models on your own treasure trove of data.
Image generators like Midjourney and Stable Diffusion have improved dramatically in record time. Learn to leverage them to maximum effect now.
Learn about fine-tuning, prompt tuning, guardrails, and middleware to make LLMs more consistent and reliable.
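A guardrail in this sense is middleware that validates a model's output before it reaches the user. The sketch below is our own minimal illustration (the pattern, not any specific guardrail framework): block responses matching a restricted-content check and truncate overlong ones.

```python
# Minimal output-guardrail sketch: validate an LLM response before returning it.
import re

BANNED = re.compile(r"\b(password|ssn)\b", re.IGNORECASE)
MAX_LEN = 500

def guardrail(response: str) -> str:
    if BANNED.search(response):
        # A real system might retry the model or escalate instead.
        return "[blocked: response contained restricted content]"
    if len(response) > MAX_LEN:
        return response[:MAX_LEN] + "..."
    return response

guardrail("Here is the password: hunter2")   # -> "[blocked: ...]"
guardrail("The capital of France is Paris.") # passes through unchanged
```

Chaining several such checks between the model and the user is one common way to make LLM applications more consistent and reliable.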
Find out about the latest attack vectors and how LLMs and generative AI present security challenges for teams and how you can mitigate those problems.