Abstract: The talk introduces Any Compression via Iterative Pruning (ACIP), a novel approach designed to give users intuitive control over the compression-performance trade-off. ACIP uses a single gradient descent run of iterative pruning to establish a global parameter ranking, enabling immediate materialization of models of any target size. It demonstrates strong predictive performance on downstream tasks without costly fine-tuning and achieves state-of-the-art compression for open-weight LLMs, often complementing common quantization techniques.
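The idea of a global parameter ranking can be illustrated with a toy sketch. This is not ACIP's actual procedure: the importance scores below are made up, and the function name `keep_mask_for_target` is hypothetical. It only shows why, once a single ranking exists, a model of any target size can be read off by thresholding, with no further optimization.

```python
def keep_mask_for_target(scores, keep_ratio):
    """Return a keep-mask retaining the top `keep_ratio` fraction of parameters.

    `scores` is a hypothetical global importance ranking (higher = more important).
    In this sketch it is given; in practice it would come from the pruning run.
    """
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in [0, 1]")
    n_keep = round(keep_ratio * len(scores))
    # Indices ordered from most to least important.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(order[:n_keep])
    return [i in kept for i in range(len(scores))]


# Made-up scores for six parameters.
scores = [0.9, 0.1, 0.5, 0.7, 0.3, 0.2]

# The same ranking yields masks for different target sizes.
mask_half = keep_mask_for_target(scores, 0.5)
mask_third = keep_mask_for_target(scores, 1 / 3)
```

Because both masks come from one ranking, the smaller model is always a subset of the larger one, which is what makes choosing a target size after the fact cheap.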
Speaker
Dr. Martin Genzel
Senior Research Engineer
Merantix Momentum
Dr. Martin Genzel is a Senior Research Engineer at Merantix Momentum and will discuss methods for compressing foundation models.
Bio from: #16: Compressing Foundation Models as Easy as Image Compression? by M. Genzel