Jacopo Tagliabue, CTO and founder of Bauplan, talks about building reliable data pipelines at scale using just Python and SQL, with no infrastructure management required.
Bauplan treats everything as functions, reducing pipelines, queries, and data operations to running Python functions in containers. The system features instant branching like Git but for entire datasets, allowing users to work on data changes in isolation before merging back. Bauplan achieves speed through two main principles: simplicity (everything reduces to running functions) and composability (building on existing open source components like DuckDB, Iceberg, and others). The platform can build new Docker containers in hundreds of milliseconds by caching package dependencies and using differential updates.
He also discusses the tradeoffs of composability, including the need to contribute back to open source communities and handle dependencies that may become unmaintained.
00:00 - Intro