In this talk, we delve into the complexities of building enterprise AI applications, including customization, evaluation, and inference of large language models (LLMs). We start by outlining the solution design space and presenting a comprehensive LLM evaluation methodology. We then review state-of-the-art LLM customization techniques and introduce NVIDIA NIM (NVIDIA Inference Microservices) and a suite of cloud-native NVIDIA NeMo microservices that ease LLM deployment and operation on Google Kubernetes Engine (GKE). We conclude with a live demo and practical recommendations for enterprises.
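As a concrete flavor of the deployment piece, a NIM exposes an OpenAI-compatible chat completions endpoint once it is running on a cluster such as GKE. The sketch below shows one way to build a request payload and query such an endpoint; the base URL and model name are illustrative placeholders, not values from the talk.

```python
# Hedged sketch: querying a deployed NIM via its OpenAI-compatible
# /v1/chat/completions route. The service address and model id below
# are assumptions for illustration only.
import json
import urllib.request

NIM_BASE_URL = "http://localhost:8000/v1"   # placeholder in-cluster address
MODEL = "meta/llama-3.1-8b-instruct"        # example model id

def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-compatible chat completions payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

def query_nim(prompt: str) -> str:
    """POST the payload to the NIM endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{NIM_BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, the same payload shape works with standard OpenAI client libraries pointed at the NIM base URL.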
Presented at Google Cloud Next '25.
Speaker: Nik Spirin, Director of Generative AI and LLMOps, NVIDIA