talk-data.com

PyData talk 2025-12-09 at 15:00

The Lifecycle of a Jupyter Environment: From Exploration to Production-Grade Pipelines

Speakers

Description

Most data science projects start with a simple notebook—a spark of curiosity, some exploration, and a handful of promising results. But what happens when that experiment needs to grow up and go into production?

This talk follows the story of a single machine learning exploration that matures into a full-fledged ETL pipeline. We’ll walk through the practical steps and real-world challenges that come up when moving from a Jupyter notebook to something robust enough for daily use.

We’ll cover how to:

  • Set clear objectives and document the process from the beginning
  • Break messy notebook logic into modular, reusable components
  • Choose the right tools (Papermill, nbconvert, shell scripts) based on your workflow, not on hype
  • Track environments and dependencies to make sure your project runs tomorrow the way it did today
  • Handle data integrity, schema changes, and even evolving labels as your datasets shift over time
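To make two of these steps concrete, here is a minimal sketch, not the speakers' implementation, of what notebook logic might look like once it is pulled out into reusable functions: one checks incoming rows against an expected schema, the other records which package versions a run used. The column names in `EXPECTED_SCHEMA` and the function names `validate_rows` and `snapshot_environment` are hypothetical and chosen only for illustration. Only the Python standard library is used.

```python
import platform
from importlib import metadata

# Hypothetical schema for illustration: column name -> expected Python type.
EXPECTED_SCHEMA = {"user_id": int, "amount": float, "label": str}


def validate_rows(rows, schema=EXPECTED_SCHEMA):
    """Return a list of human-readable schema problems (empty if none)."""
    problems = []
    for i, row in enumerate(rows):
        missing = schema.keys() - row.keys()
        extra = row.keys() - schema.keys()
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
        if extra:
            problems.append(f"row {i}: unexpected columns {sorted(extra)}")
        # Type-check only the columns that are actually present.
        for col, expected in schema.items():
            if col in row and not isinstance(row[col], expected):
                actual = type(row[col]).__name__
                problems.append(
                    f"row {i}: {col} is {actual}, expected {expected.__name__}"
                )
    return problems


def snapshot_environment(package_names):
    """Record the Python version and the installed version of each package.

    Packages that are not installed are recorded as None.
    """
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {"python": platform.python_version(), "packages": versions}


if __name__ == "__main__":
    rows = [
        {"user_id": 1, "amount": 2.5, "label": "a"},
        {"user_id": "2", "amount": 3.0},  # wrong type and a missing column
    ]
    for problem in validate_rows(rows):
        print(problem)
    print(snapshot_environment(["pip"]))
```

Functions like these can be imported from a plain module and called from a notebook, a shell script, or a scheduled job. Saving the snapshot next to each run's output is one simple way to see later which versions produced a given result.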

And as a bonus: bring your results to life with interactive visualizations using tools like PyScript, Voilà, and Panel (part of the HoloViz ecosystem)