In this talk, we delve into the complexities of building enterprise AI applications, including customization, evaluation, and inference of large language models (LLMs). We start by outlining the solution design space and presenting a comprehensive LLM evaluation methodology. We then review state-of-the-art LLM customization techniques and introduce NVIDIA NIM (NVIDIA Inference Microservices) and a suite of cloud-native NVIDIA NeMo microservices that ease LLM deployment and operation on Google Kubernetes Engine (GKE). We conclude with a live demo and practical recommendations for enterprises.
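As a concrete flavor of the deployment piece, a NIM exposes an OpenAI-compatible chat completions endpoint once it is running on a cluster such as GKE. The sketch below shows one way to build a request payload and query such an endpoint; the base URL and model name are illustrative placeholders, not values from the talk.

```python
# Hedged sketch: querying a deployed NIM via its OpenAI-compatible
# /v1/chat/completions route. The service address and model id below
# are assumptions for illustration only.
import json
import urllib.request

NIM_BASE_URL = "http://localhost:8000/v1"   # placeholder in-cluster address
MODEL = "meta/llama-3.1-8b-instruct"        # example model id

def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-compatible chat completions payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

def query_nim(prompt: str) -> str:
    """POST the payload to the NIM endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{NIM_BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, the same payload shape works with standard OpenAI client libraries pointed at the NIM base URL.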
Presented at Google Cloud Next '25.
Speaker: Nik Spirin, Director of Generative AI and LLMOps, NVIDIA