Launching Bauplan MCP Server: the First Step towards the Agentic Lakehouse
How formal methods help ensure safe, reproducible workflows in data lakehouses
From Prompt to Pipeline: Cloud-Native Agents for Data Transformation and ETL
From notebook to prod with Bauplan and marimo
Proceedings of Workshops at the 51st International Conference on Very Large Data Bases (VLDB 2025)
Functions, Declarative Environments, Data Management And Other Things With Feathers
Bauplan is a serverless data platform that treats pipelines, models, and tables like software — versioned, testable, and ready for agents.
Announcing $7.5M seed round led by Innovation Endeavors
From legacy sprawl to lightning-fast pipelines: how Mediaset rebuilt their data stack with Bauplan and Temporal—and cut dashboard refresh times from 60 minutes to 5.
End-to-end RAG system for a conversational service agent
Build simple, robust data apps with software engineering principles.
Lessons learned crafting a Serverless Lakehouse from spare parts
Full-stack recommender system with Bauplan for data preparation and training, and MongoDB Atlas for real-time inference.
Paper presented at WoSC10 2024. In collaboration with The University of Wisconsin.
Making the experience of running data workflows in the cloud indistinguishable from doing it locally.
DAG planning using an in-memory graph database. In collaboration with Kùzu.
Paper presented at DEMAI@IEEE Big Data 2024.
A reference implementation of the Write-Audit-Publish (WAP) pattern with Bauplan and Prefect 3.0.
Find the right balance between cost control and fast startup time for your Spark clusters.
Paper presented at SIGMOD/PODS 2024. Awarded best presentation DEEM@SIGMOD.
An open-source implementation of WAP using Apache Iceberg, Lambdas, and Nessie, all running entirely in Python.
Working on production data is the only way to know whether our applications will work.
Why production cloud environments are too slow and hard to develop in.
The greatest invention since sliced Virtual Machines.
Paper presented at VLDB 2023.
An open-source implementation of a Data Lake with DuckDB and AWS Lambdas.