Data teams know the pain of moving from proof of concept to production. We’ve all seen brittle scripts, one-off notebooks, and manual fixes turn into hidden risks. With large language models, the same story is playing out unless we borrow the lessons of modern data engineering.
This talk introduces a declarative approach to LLM engineering using DSPy and Dagster. DSPy treats prompts, retrieval strategies, and evaluation metrics as first-class, composable building blocks. Instead of tweaking text by hand, you declare the behavior you want, and DSPy optimizes and tunes the pipeline for you. Dagster is built on a similar premise; with Dagster Components, you can build modular and declarative pipelines.
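As a rough sense of what "declaring the behavior" looks like, here is a minimal DSPy sketch; the model string, field names, and example inputs are illustrative assumptions, not the talk's actual pipeline:

```python
import dspy

# Assumed configuration: any LiteLLM-style model string and matching credentials work here.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

class AnswerWithContext(dspy.Signature):
    """Answer the question using only the supplied context."""
    context: str = dspy.InputField()
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

# Declare the behavior; DSPy owns and optimizes the underlying prompt text.
qa = dspy.ChainOfThought(AnswerWithContext)
prediction = qa(
    context="Dagster orchestrates data and ML pipelines as software-defined assets.",
    question="What does Dagster do?",
)
print(prediction.answer)
```

The signature states what goes in and what comes out; swapping the prompting strategy (say, `dspy.Predict` for `dspy.ChainOfThought`) is a one-line change rather than a prompt rewrite.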
This approach means:
- Trust & auditability: Every LLM output can be traced back through a reproducible graph.
- Safety in production: Automated evaluation loops catch drift and regressions before they reach users (see the evaluation sketch after this list).
- Scalable experimentation: The same declarative spec can power quick tests or robust, HIPAA/GxP-grade pipelines.
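To make the evaluation-loop bullet concrete, here is a hedged sketch using DSPy's `Evaluate` harness. The gold examples and the exact-match metric are placeholders, and `qa` is the module declared in the earlier sketch:

```python
from dspy import Example
from dspy.evaluate import Evaluate

# Placeholder gold set; in practice this is versioned data materialized by the orchestrator.
devset = [
    Example(
        context="Dagster orchestrates data and ML pipelines as software-defined assets.",
        question="What does Dagster do?",
        answer="It orchestrates data pipelines.",
    ).with_inputs("context", "question"),
]

def contains_answer(example, prediction, trace=None):
    # Simple pass/fail metric; swap in semantic similarity or LLM-as-judge scoring as needed.
    return example.answer.lower() in prediction.answer.lower()

evaluate = Evaluate(devset=devset, metric=contains_answer, num_threads=4, display_progress=True)
score = evaluate(qa)  # re-run on every change to catch drift and regressions
```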
By treating LLM workflows like data pipelines (declarative, observable, and orchestrated) we can avoid the prompt-spaghetti trap and build AI systems that meet the same reliability bar as the rest of the stack.
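For the orchestration side, a hedged sketch of wrapping the DSPy module in a Dagster asset is below; it uses Dagster's plain asset API rather than the Components abstraction covered in the talk, and the asset and variable names are made up:

```python
import dagster as dg

# `qa` is the DSPy module declared in the first sketch; names here are illustrative only.
DOCS = "Dagster orchestrates data and ML pipelines as software-defined assets."

@dg.asset
def llm_answers(context: dg.AssetExecutionContext) -> list[str]:
    """Materialize LLM outputs as an asset, so every answer is tracked and reproducible."""
    questions = ["What does Dagster do?"]  # in practice, an upstream asset
    answers = [qa(context=DOCS, question=q).answer for q in questions]
    context.log.info(f"Generated {len(answers)} answers")
    return answers

defs = dg.Definitions(assets=[llm_answers])
```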