talk-data.com

Speaker

Michał Szołucha

Talks & appearances

Parallel PyTorch Inference with Python Free-Threading

This talk examines multi-threaded parallel inference on PyTorch models using the free-threaded (no-GIL) build of Python. Using a simple 124M-parameter GPT-2 model that we train from scratch, we explore the new territory unlocked by free-threaded Python: parallel PyTorch model inference, in which multiple threads, no longer serialized by the Python GIL, generate text from a transformer-based model at the same time.
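
The threading pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the talk's actual code: `toy_next_token` is a hypothetical, CPU-bound stand-in for a model forward pass plus greedy sampling, and `generate`, `parallel_generate`, and `gil_enabled` are names invented here. The only interpreter-specific call is `sys._is_gil_enabled()`, which exists on Python 3.13 and later; on older interpreters the sketch assumes the GIL is on.

```python
import sys
from concurrent.futures import ThreadPoolExecutor

VOCAB_SIZE = 50257  # GPT-2 vocabulary size, used only to bound the toy token ids


def gil_enabled() -> bool:
    """Report whether the GIL is active in this interpreter."""
    # sys._is_gil_enabled() is available on Python 3.13+.
    # Older interpreters always run with the GIL, so default to True.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


def toy_next_token(tokens: list[int]) -> int:
    """Hypothetical stand-in for a model forward pass (pure-Python, CPU-bound)."""
    state = 0
    for t in tokens:
        state = (state * 31 + t) % VOCAB_SIZE
    return state


def generate(prompt: list[int], max_new_tokens: int) -> list[int]:
    """Greedy generation loop; each call works on its own copy of the tokens."""
    tokens = list(prompt)  # private copy, so threads share no mutable state
    for _ in range(max_new_tokens):
        tokens.append(toy_next_token(tokens))
    return tokens


def parallel_generate(prompts, max_new_tokens=8, num_threads=4):
    """Run one generation loop per prompt on a pool of threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map preserves the order of the input prompts.
        return list(pool.map(lambda p: generate(p, max_new_tokens), prompts))


if __name__ == "__main__":
    prompts = [[1, 2, 3], [4, 5], [6]]
    print("GIL enabled:", gil_enabled())
    for tokens in parallel_generate(prompts, max_new_tokens=5, num_threads=3):
        print(tokens)
```

On a standard build, the threads in this sketch take turns holding the GIL, so the pure-Python loop gains nothing from extra threads; on a free-threaded build they can run on separate cores. With a real PyTorch model, the shared object would be the model itself (typically used read-only under inference mode), which is the scenario the talk investigates.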