talk-data.com

Speaker

Jyotinder Singh

1 talk

Talks & appearances

Large language models are often too large to run on personal machines, requiring specialized hardware with massive memory. Quantization provides a way to shrink models, speed them up, and reduce memory usage, all while retaining most of their accuracy.

This talk introduces the fundamentals of neural network quantization and its key techniques, and demonstrates how to apply them using Keras's extensible quantization framework.
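To give a flavor of the fundamentals the talk covers, here is a minimal sketch of symmetric per-tensor int8 quantization using plain NumPy. This is an illustration of the general technique, not the Keras API itself; the function names (`quantize_int8`, `dequantize`) are hypothetical helpers for this example.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: map float weights to int8.

    The scale maps the largest-magnitude weight to 127, so every value
    fits in the signed 8-bit range [-127, 127].
    """
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original floats."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Storage drops from 32 bits to 8 bits per weight; the rounding error
# per value is at most half a quantization step (scale / 2).
```

The same idea, applied per-layer or per-channel with calibrated scales, is what production frameworks build on to keep accuracy loss small.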