[2025 - Day 1 - Data Science & Algos] Ciro Greco shares lessons from designing reproducible data workloads over data lakes, exploring how decoupling code, compute, and data management enables deterministic pipeline reproduction. For engineers who develop Python data pipelines or debug complex workflows, the talk shows how open-source components such as Iceberg, Arrow, and Docker can be combined into declarative, functional DAGs that execute efficiently in the cloud.
ABOUT THE SPEAKER: Ciro Greco, Founder, Bauplan
00:00 - Intro