2025-09-02 at 11:40
Data science in containers: the good, the bad, and the ugly
Event: PyData Berlin 2025
Description
To run data science workloads (e.g. with TensorFlow or PyTorch) in containers, whether for local development or in production on Kubernetes, we need to build container images. Doing that with a Dockerfile is fairly straightforward, but is it the best method? In this talk, we'll take a well-known speech-to-text model (Whisper) and show various ways to run it in containers, comparing the outcomes in terms of image size and build time.
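As a rough illustration of the kind of comparison described above, the sketch below times a `docker build` and reads back the resulting image size. It is not taken from the talk: the function names, the injectable `run` parameter, and the tags and file names in the usage note are all hypothetical, and it assumes the `docker` CLI is installed and on the PATH.

```python
import subprocess
import time
from typing import Callable, Dict, Sequence


def run_command(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def measure_build(
    tag: str,
    dockerfile: str,
    context: str = ".",
    run: Callable[[Sequence[str]], str] = run_command,
) -> Dict[str, object]:
    """Build an image and report wall-clock build time and image size.

    `run` is injectable (a hypothetical convenience) so the function can be
    exercised without a Docker daemon.
    """
    start = time.perf_counter()
    # --no-cache forces a full rebuild so timings are comparable between runs.
    run(["docker", "build", "--no-cache", "-t", tag, "-f", dockerfile, context])
    build_seconds = time.perf_counter() - start

    # `docker image inspect` reports the image size in bytes.
    size_output = run(["docker", "image", "inspect", "--format", "{{.Size}}", tag])
    size_bytes = int(size_output.strip())

    return {
        "tag": tag,
        "build_seconds": build_seconds,
        "size_mb": size_bytes / 1e6,
    }
```

With Docker available, a call such as `measure_build("whisper:dockerfile", "Dockerfile")` (hypothetical tag and file name) would return one row of such a comparison; repeating it for each build method gives figures that can be set side by side. Image size as reported by `docker image inspect` is the uncompressed size, so it may differ from what a registry shows.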