talk-data.com

Gergely Daroczi

Speaker


Event: PyData Berlin 2025

Talks & appearances

Benchmarking 2000+ Cloud Servers for GBM Model Training and LLM Inference Speed

Spare Cores is a Python-based, open-source, vendor-independent ecosystem that collects, generates, and standardizes comprehensive data on cloud server pricing and performance. In our latest project, we started 2000+ server types across five cloud vendors to evaluate their suitability for serving Large Language Models ranging from 135M to 70B parameters. We tested how efficiently models can be loaded into memory or VRAM, and measured inference speed across varying token lengths for prompt processing and text generation. The published data can help you find the optimal instance type for your LLM serving needs. We will also share our experiences and challenges with the data collection, along with insights into general patterns.
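
As a rough illustration of how speed and price data of this kind could be combined to choose an instance type, the sketch below derives tokens per second and cost per million tokens from a benchmark run, then picks the cheapest server that meets a minimum speed. It does not use the Spare Cores API or its published data: the `BenchmarkRun` class, the `cheapest` helper, the server names, and every number are hypothetical placeholders chosen only to make the arithmetic visible.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class BenchmarkRun:
    """One hypothetical text-generation benchmark result (not real data)."""
    server: str            # placeholder instance name
    price_per_hour: float  # on-demand price in USD, placeholder value
    tokens: int            # tokens generated during the run
    seconds: float         # wall-clock duration of the run

    @property
    def tokens_per_second(self) -> float:
        return self.tokens / self.seconds

    @property
    def usd_per_million_tokens(self) -> float:
        # Hourly price divided by hourly token throughput, scaled to 1M tokens.
        tokens_per_hour = self.tokens_per_second * 3600
        return self.price_per_hour / tokens_per_hour * 1_000_000


def cheapest(runs: Iterable[BenchmarkRun],
             min_tokens_per_second: float = 0.0) -> Optional[BenchmarkRun]:
    """Return the lowest cost-per-token run that is fast enough, or None."""
    eligible = [r for r in runs if r.tokens_per_second >= min_tokens_per_second]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.usd_per_million_tokens)


# Made-up measurements for two imaginary servers.
runs = [
    BenchmarkRun("example-cpu-small", price_per_hour=0.10, tokens=1_000, seconds=100.0),
    BenchmarkRun("example-gpu-large", price_per_hour=2.00, tokens=10_000, seconds=20.0),
]

best = cheapest(runs, min_tokens_per_second=5.0)
if best is not None:
    print(best.server, round(best.usd_per_million_tokens, 2))
```

With these placeholder numbers the more expensive server still comes out cheaper per token because its throughput is far higher, which is why hourly price alone is a poor guide for LLM serving.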