talk-data.com

Meetup talk 2025-10-23 at 18:50

Accelerating Neural Networks Through Quantization in PyTorch for Different Devices

Description

We design and apply quantization algorithms for PyTorch DNNs across modern architectures, using PyTorch's internal mechanisms to automatically balance quality and speed. We then compile the quantized checkpoints to deliver real-world speedups on different hardware.
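
For readers new to the topic, the core idea of quantization can be sketched in a few lines of plain Python. This is only an illustration of symmetric per-tensor int8 quantization, not the speakers' algorithm and not a PyTorch API; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to int8 codes (illustrative only)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input; any positive scale works in that case.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.03, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The trade-off the description refers to shows up here in miniature: storing small integer codes plus one scale is cheaper than storing floats, at the cost of a bounded rounding error per weight.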