talk-data.com talk-data.com

J

Speaker

Jacek Golebiowski

1

talks

Filtering by: PyData Berlin 2025 ×

Filter by Event / Source

Talks & appearances

Showing 1 of 1 activities

Search activities →
Training Specialized Language Models with Less Data: An End-to-End Practical Guide

Small Language Models (SLMs) offer an efficient and cost-effective alternative to LLMs—especially when latency, privacy, inference costs or deployment constraints matter. However, training them typically requires large labeled datasets and is time-consuming, even if it isn't your first rodeo.

This talk presents an end-to-end approach for curating high-quality synthetic data using LLMs to train domain-specific SLMs. Using a real-world use case, we’ll demonstrate how to reduce manual labeling time, cut costs, and maintain performance—making SLMs viable for production applications.

Whether you are a seasoned Machine Learning Engineer or a person just getting starting with building AI features, you will come away with the inspiration to build more performant, secure and environmentally-friendly AI systems.