talk-data.com

Company: NVIDIA
Speakers: 47
Activities: 59

Talks & appearances

59 activities from NVIDIA speakers

Implementing generative AI applications requires large amounts of computation that can seamlessly scale to train, fine-tune, and serve the models. NVIDIA and Google Cloud have partnered to offer a range of GPU options to address this challenge. Using NVIDIA GPUs with Google Kubernetes Engine removes the heavy lifting needed to set up AI deployments, automate orchestration, manage large training clusters, and serve low-latency inference. Join us to see what ElevenLabs has built using NVIDIA GPUs with GKE. Please note: seating is limited and on a first-come, first-served basis; standing areas are available.
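
By way of illustration (not taken from the session itself), here is a minimal sketch of requesting an NVIDIA GPU for a pod on GKE using the official Kubernetes Python client. The pod name, image, and namespace are hypothetical, and GKE's GPU device plugin is assumed to expose the nvidia.com/gpu resource:

```python
# Minimal sketch: schedule a GPU-backed pod on GKE with the official
# Kubernetes Python client. Pod name, image, and namespace are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # assumes kubectl already points at the GKE cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llm-inference"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="server",
                image="us-docker.pkg.dev/my-project/llm/server:latest",  # hypothetical
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # ask the scheduler for one NVIDIA GPU
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```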

Any new technology brings potential new risks, and generative AI is no exception. In this session, we will cut through the FUD around gen AI deployments, helping you understand the real risks, both existing and novel, of deploying AI systems in the cloud. We'll also provide a framework for how to support the safe use of AI in your organization, so that security, privacy, and risk concerns, real and imagined, can be addressed effectively without slowing business velocity.

Traditional infrastructure is no longer adequate for the exponentially growing demands of generative AI and LLMs. Join this session to learn how infrastructure design is meeting those demands, how organizations are adapting to capitalize on the new infrastructure landscape, and how this may evolve in the future.

NVIDIA AI on Google Cloud provides all the essential tools, frameworks, and models needed to develop and deploy custom models and generative AI applications. This Cloud Talk will share the latest NVIDIA technologies available on Google Cloud to rapidly enable enterprises to take their LLMs and generative AI applications from pilot into production. If you attend this session, your contact information may be shared with the sponsor for relevant follow-up for this event only.

In this session, learn about performance optimizations for PyTorch on Google Cloud accelerators using OpenXLA. These workloads are powerful, but long-running training jobs can be disrupted by resource failures. This talk also explores strategies for achieving greater resiliency when running PyTorch on GPUs, focusing on fault tolerance, checkpointing, and distributed training. Learn how to leverage open source tools to minimize downtime and ensure your deep learning workloads run smoothly.
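
The abstract doesn't name the session's exact open source tooling, but the checkpointing idea it mentions can be sketched in plain PyTorch: periodically persist model and optimizer state so a preempted or failed job resumes from the last checkpoint instead of restarting. The checkpoint path and save interval below are illustrative choices:

```python
# Sketch of fault-tolerant training via periodic checkpointing in PyTorch.
# CHECKPOINT path and save_every interval are illustrative.
import os
import torch

CHECKPOINT = "/mnt/ckpt/train_state.pt"

def save_checkpoint(model, optimizer, step):
    # Write atomically: save to a temp file, then rename over the old checkpoint,
    # so a crash mid-write never corrupts the last good state.
    tmp = CHECKPOINT + ".tmp"
    torch.save(
        {"model": model.state_dict(),
         "optimizer": optimizer.state_dict(),
         "step": step},
        tmp,
    )
    os.replace(tmp, CHECKPOINT)

def load_checkpoint(model, optimizer):
    # Resume from the last checkpoint if one exists; otherwise start at step 0.
    if not os.path.exists(CHECKPOINT):
        return 0
    state = torch.load(CHECKPOINT)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]

def train(model, optimizer, batches, save_every=100):
    step = load_checkpoint(model, optimizer)  # picks up where the last run died
    for batch in batches:  # a real job would also skip already-seen batches
        loss = model(batch).mean()  # stand-in for a real loss computation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        step += 1
        if step % save_every == 0:
            save_checkpoint(model, optimizer, step)
```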

By default, we think sequentially. Parallelism and asynchrony are often seen as challenging and complex: tools to be used sparingly and cautiously, and only by experts. But we must shatter these assumptions, for today we live in a parallel world. Almost every hardware platform is parallel, from the smallest embedded devices to the largest supercomputers. We must change our mindset. Anyone who writes code has to think in parallel. Parallelism must become our default. In this example-driven talk, we will journey into the world of parallelism. We'll look at four algorithms and data structures in depth, comparing and contrasting different implementation strategies and exploring how they perform both sequentially and in parallel. During this voyage, we'll uncover and discuss some foundational principles of parallelism, such as latency hiding, localizing communication, and efficiency-versus-performance tradeoffs. By the time we're done, you'll be thinking in parallel.
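
The abstract doesn't say which four algorithms the talk covers, so as an illustrative stand-in, here is the simplest sequential-versus-parallel comparison, a reduction in Python. The chunked version localizes communication (each worker reads only its own slice) and pays a coordination cost that only wins on large inputs, a small instance of the efficiency-versus-performance tradeoff the abstract mentions:

```python
# Sketch: the same reduction written sequentially and in parallel.
# Contiguous chunks localize communication; the final combine is cheap.
from concurrent.futures import ProcessPoolExecutor

def sequential_sum(data):
    total = 0
    for x in data:
        total += x
    return total

def parallel_sum(data, workers=4):
    # Split into contiguous chunks so each worker touches only its own slice.
    chunk = (len(data) + workers - 1) // workers
    parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum, parts)  # independent partial sums
    return sum(partials)                 # final combine step

if __name__ == "__main__":
    data = list(range(10_000_000))
    assert sequential_sum(data) == parallel_sum(data)
```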

Join NVIDIA, Microsoft, and Ansys (part of Synopsys) to learn how digital twins and real-time simulation are powering the future of manufacturing. Discover how Ansys is leveraging NVIDIA accelerated computing, NVIDIA AI and Omniverse™ libraries, Microsoft Azure, and OpenUSD to help manufacturers accelerate time to insight and optimize manufacturing processes and operations.

The presentation focuses on the practical aspects of transitioning to a reactive architecture. It discusses the problems of processing large volumes of requests and managing distributed services, and emphasizes moving from traditional blocking operations to non-blocking, asynchronous processing to improve scalability, fault tolerance, and performance. Key points include:

  • Introduction to Reactive Architecture: Understanding the need for a scalable and fault-tolerant architecture that efficiently handles client requests while minimizing errors.
  • Reactive Architecture Solutions: How the principles of reactive architecture, including non-blocking I/O, event-driven programming, and microservices, address these challenges, resulting in more efficient resource utilization, reduced latency, and improved overall system performance.
  • Practical Applications: Examples of reactive architecture implementations in real-world scenarios, especially in environments requiring high parallelism and low-latency processing.
  • Challenges in the Transition: Practical difficulties that arise when moving from blocking designs to reactive ones.

The goal of the presentation is to educate the audience on the benefits of moving from blocking operations to a more efficient, reactive approach, ultimately leading to more responsive and scalable systems.
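
The presentation's technology stack isn't specified here, so the sketch below renders the core blocking-to-non-blocking shift in Python's asyncio: a blocking handler serializes requests, while a non-blocking one lets the event loop interleave many in-flight calls. The downstream service and its latency are stand-ins:

```python
# Sketch of the blocking -> non-blocking shift: asyncio interleaves many
# in-flight requests on one thread instead of queuing them one by one.
# The downstream "service" and its 100 ms latency are hypothetical.
import asyncio
import time

async def call_downstream(request_id: int) -> str:
    # Stand-in for a non-blocking I/O call (database, microservice, etc.).
    await asyncio.sleep(0.1)  # latency is hidden: other handlers run meanwhile
    return f"response-{request_id}"

async def handle_requests(n: int) -> list[str]:
    # Event-driven fan-out: all n calls are in flight concurrently.
    return await asyncio.gather(*(call_downstream(i) for i in range(n)))

if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(handle_requests(100))
    # Roughly 0.1 s total, versus ~10 s for 100 sequential blocking calls.
    print(len(results), f"{time.perf_counter() - start:.2f}s")
```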