talk-data.com talk-data.com

Gergely Daroczi

Speaker

Gergely Daroczi

2

talks

Filter by Event / Source

Talks & appearances

2 activities · Newest first

Search activities →
Resource Monitoring and Optimization with Metaflow

Metaflow is a powerful workflow management framework for data science, but optimizing its cloud resource usage still involves guesswork. We have extended Metaflow with a lightweight resource tracking tool that automatically monitors CPU, memory, GPU, and more, then recommends the most cost-effective cloud instance type for future runs. A single line of code can save you from overprovisioned costs or painful job failures!

Benchmarking 2000+ Cloud Servers for GBM Model Training and LLM Inference Speed

Spare Cores is a Python-based, open-source, and vendor-independent ecosystem collecting, generating, and standardizing comprehensive data on cloud server pricing and performance. In our latest project, we started 2000+ server types across five cloud vendors to evaluate their suitability for serving Large Language Models from 135M to 70B parameters. We tested how efficiently models can be loaded into memory of VRAM, and measured inference speed across varying token lengths for prompt processing and text generation. The published data can help you find the optimal instance type for your LLM serving needs, and we will also share our experiences and challenges with the data collection and insights into general patterns.