talk-data.com

Topic: Keras

Tags: deep_learning, neural_networks, machine_learning

1 tagged activity

Activity Trend: peak of 2 activities per quarter (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: PyData Seattle 2025

Large language models are often too large to run on personal machines, requiring specialized hardware with massive memory. Quantization provides a way to shrink models, speed them up, and reduce memory usage, all while retaining most of their accuracy.
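
To make the idea concrete, here is a minimal sketch of affine (asymmetric) int8 quantization of a single weight tensor using NumPy. The function names and the per-tensor min/max calibration are illustrative choices for exposition, not the specific scheme covered in the talk.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map a float tensor onto the int8 range [-128, 127] with a scale and zero point."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0  # one float step per int8 step
    zero_point = np.round(-128.0 - w_min / scale)  # int8 value that represents 0.0
    q = np.clip(np.round(weights / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q: np.ndarray, scale: float, zero_point: float) -> np.ndarray:
    """Recover approximate float values from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

# Quantizing then dequantizing shows the (small) accuracy loss in exchange for 4x less memory.
weights = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_int8(weights)
print("max abs error:", np.abs(weights - dequantize_int8(q, scale, zp)).max())
```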

This talk introduces the fundamentals of neural network quantization and its key techniques, and demonstrates how to apply them using Keras's extensible quantization framework.
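
As a rough sketch of what applying quantization in Keras can look like, the snippet below quantizes a toy model in place. It assumes Keras 3's `Model.quantize` API with the `"int8"` mode; the model architecture is a made-up stand-in, not an example from the talk.

```python
import keras

# A small example model standing in for a much larger network.
model = keras.Sequential([
    keras.Input(shape=(128,)),
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dense(10),
])

# Post-training quantization: supported layers (e.g. Dense) have their
# weights converted to 8-bit integers while the model stays usable as-is.
model.quantize("int8")

model.summary()
```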