What is Bauplan?

April 3, 2026

Bauplan is the execution layer for AI-generated data changes in production. It lets data engineering teams and AI agents safely run, validate, and publish changes to production data using branch-based isolation, transactional pipeline execution, and a serverless compute runtime.

It provides Git-style branching and versioning for data, a function-as-a-service (FaaS) compute runtime, and a code-first control surface designed for both human engineers and autonomous agents.

All operations on the platform are programmable through a typed Python SDK and a CLI. Every action (branching, execution, validation, and publishing) composes through a small set of APIs that agents and humans call the same way. Bauplan also ships an MCP server that exposes the full lifecycle as tool calls for any MCP-compatible assistant.

Bauplan sits on top of your object storage and manages data as Apache Iceberg tables. Your data stays in your S3. Bauplan reads from and writes to your storage directly and never copies or ingests your data. It produces versioned Iceberg outputs that remain compatible with any Iceberg-capable engine and catalog, so you keep your existing tools for analytics and BI while using Bauplan as the execution and change-management layer for pipelines and agent workflows.

What Problem Does Bauplan Solve?

Bauplan lets AI agents work with production data safely, at scale, and affordably.

Your team wants AI agents to work with production data. Engineers already use Claude Code, Cursor, or Copilot to write pipeline logic and generate SQL. The productivity gains are real. The gap is infrastructure: your current stack has no safe way to let agents execute changes on production data end to end. AI generates the code, but a human still stages and deploys it. Bauplan closes this loop. Agents branch, run, validate, and merge through a typed API. Production is protected by the system architecture, so your team scales automation without scaling headcount or risk.
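
The branch, run, validate, merge loop described above can be sketched with a toy in-memory model. This is not the Bauplan SDK: ToyCatalog, run_change, the branch names, and the orders table are hypothetical names invented for illustration, and tables are plain Python lists of dicts rather than Iceberg tables.

```python
import copy


class ToyCatalog:
    """Hypothetical in-memory stand-in for a branch-aware data catalog."""

    def __init__(self, main_tables):
        # Each branch maps table name -> list of row dicts.
        self.branches = {"main": copy.deepcopy(main_tables)}

    def create_branch(self, name, from_ref="main"):
        # A real system would share data files; the toy copies rows for simplicity.
        self.branches[name] = copy.deepcopy(self.branches[from_ref])

    def merge(self, source, into="main"):
        # Replace the whole table set in one assignment, so a reader of
        # `into` never observes a half-applied change.
        self.branches[into] = self.branches.pop(source)

    def delete_branch(self, name):
        self.branches.pop(name, None)


def run_change(catalog, branch, transform, validate):
    """Run `transform` on an isolated branch; publish only if `validate` passes."""
    catalog.create_branch(branch)
    try:
        transform(catalog.branches[branch])
        if not validate(catalog.branches[branch]):
            raise ValueError("validation failed")
    except Exception:
        catalog.delete_branch(branch)  # main was never touched
        return False
    catalog.merge(branch)
    return True


def add_tax(tables):
    for row in tables["orders"]:
        row["amount_with_tax"] = round(row["amount"] * 1.2, 2)


def bad_change(tables):
    tables["orders"].clear()  # a destructive change an agent might attempt


def not_empty(tables):
    return len(tables["orders"]) > 0


catalog = ToyCatalog({"orders": [{"id": 1, "amount": 10.0}]})
ok = run_change(catalog, "agent-1", add_tax, not_empty)         # published
failed = run_change(catalog, "agent-2", bad_change, not_empty)  # discarded
```

In this sketch the destructive change only ever touches the agent-2 branch, so main keeps the rows published by agent-1. The point of the design is that safety comes from where the code runs, not from reviewing what the code does.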

Autonomous agents operating on data at scale run into three structural problems with today's platforms:

Agents need code-first interfaces. Traditional data platforms are built around dashboards, notebooks, consoles, and complex configuration surfaces. These interfaces make agents slow, brittle, and token-wasteful. Agents reason about APIs, and the API must cover the entire platform surface end to end. If an agent has to leave the API to accomplish a task, the API has failed. Bauplan is designed as a full data platform as code: every operation is explicit, composable, and programmable through a small set of primitives.
Agents need built-in isolation. Agents can perform destructive operations on data. Multiple agents can operate on the same tables simultaneously. They need branches to isolate their work, commits to immutably record data changes, atomic merges to resolve conflicts, and rollbacks to undo publications. Traditional platforms protect production through process (code review, manual staging, human sign-off) rather than through the execution model itself. Bauplan enforces isolation and transactional guarantees at the infrastructure level, by default.
Agents need affordable compute. Agents generate orders of magnitude more compute demand than humans. They explore, test hypotheses, fail, and retry hundreds of times per task. A single exploration task on a data lake can run sixty to eighty queries. Usage per customer grows nonlinearly once agents are connected, with an average growth of 84x across Bauplan's customer base. The pricing model of traditional cloud warehouses and data platforms, built around high markups justified by human-facing interfaces, creates significant friction for running agents at production scale. Bauplan uses a flat monthly pricing model with unlimited queries, unlimited users, and unlimited agents.

Bauplan solves all three problems.

What Category Does Bauplan Belong To?

Bauplan is an agentic data platform. It belongs to the emerging category of data infrastructure designed for autonomous and semi-autonomous workflows on production data.

Within this category, Bauplan operates as the execution layer: the part of the stack that governs how data changes run, validate, and publish. It complements ingestion tools (Airbyte, Fivetran, Estuary), orchestrators (Airflow, Prefect, Dagster), and BI tools by providing the transactional substrate underneath.

Bauplan is designed for a world where the default analytical workload is no longer a human running a small number of carefully prepared jobs. It is built for agents generating SQL and Python, probing data, proposing changes, and iterating repeatedly, often in parallel.

Bauplan is not a data warehouse, not an orchestrator, and not an ingestion tool. It is the execution layer that lets AI work safely on production data.

What Makes Bauplan Different from Traditional Data Platforms?

Execution model: Traditional platforms write directly to shared tables. Bauplan runs every change on an isolated branch.
Failure handling: Traditional platforms leave data in an inconsistent state on partial failures. Bauplan leaves production unchanged on failed runs.
Publication: Traditional platforms push changes live as they complete. Bauplan uses atomic multi-table commits on merge.
Compute: Traditional platforms use persistent clusters or warehouse sessions. Bauplan uses ephemeral serverless functions (FaaS).
Isolation: Traditional platforms require manual staging environments. Bauplan has zero-copy branching on Apache Iceberg fully built in.
Data residency: Traditional platforms use platform-managed storage. Bauplan keeps your data in your own object storage.
Interface: Traditional platforms offer dashboards, notebooks, and GUIs. Bauplan provides a typed Python SDK, CLI, and MCP server.
Agent compatibility: Traditional platforms require wrappers and glue code. Bauplan is agent-native, with CLAUDE.md, MCP, and Agent Skills built in.
Pricing model: Traditional platforms charge per query or per compute minute with markups. Bauplan uses flat monthly tiers based on capacity with unlimited queries and agents.
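
The zero-copy branching, atomic multi-table merge, and rollback rows above can be illustrated with a simplified pointer model. This is a sketch under stated assumptions, not how Bauplan or Apache Iceberg is implemented: Commit, ToyRefs, and the file names are hypothetical, and a "table" is reduced to a tuple of data file paths.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    """Immutable snapshot: sorted (table name, data file paths) pairs."""
    tables: Tuple[Tuple[str, Tuple[str, ...]], ...]
    parent: Optional["Commit"] = None

    def as_dict(self) -> Dict[str, Tuple[str, ...]]:
        return dict(self.tables)


class ToyRefs:
    """Hypothetical ref store: a branch is only a pointer to a commit."""

    def __init__(self, root: Commit):
        self.refs = {"main": root}

    def branch(self, name, from_ref="main"):
        # Zero-copy: no data files are duplicated, only the pointer.
        self.refs[name] = self.refs[from_ref]

    def commit(self, branch, changes):
        # New snapshot that swaps the file lists of the changed tables.
        tables = self.refs[branch].as_dict()
        tables.update(changes)
        self.refs[branch] = Commit(tuple(sorted(tables.items())),
                                   parent=self.refs[branch])

    def merge(self, source, into="main"):
        # Fast-forward only: refuse if `into` moved since the branch was cut.
        c = self.refs[source]
        while c is not None and c is not self.refs[into]:
            c = c.parent
        if c is None:
            raise RuntimeError("conflict: target moved since branch creation")
        # One pointer swap publishes every changed table at once.
        self.refs[into] = self.refs[source]

    def rollback(self, branch="main"):
        parent = self.refs[branch].parent
        if parent is None:
            raise RuntimeError("nothing to roll back")
        self.refs[branch] = parent


root = Commit((("customers", ("c1.parquet",)), ("orders", ("o1.parquet",))))
refs = ToyRefs(root)
refs.branch("agent-1")
refs.commit("agent-1", {"orders": ("o2.parquet",), "customers": ("c2.parquet",)})
before_merge = refs.refs["main"].as_dict()   # main still sees the old files
refs.merge("agent-1")
after_merge = refs.refs["main"].as_dict()    # both tables change together
refs.rollback()
after_rollback = refs.refs["main"].as_dict() # publication undone
```

Because commits are immutable and branches are pointers, creating a branch costs nothing in storage, publishing is a single ref update, and rolling back is a matter of moving the ref to the parent commit.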

Who Uses Bauplan?

Bauplan is built for software engineering and data engineering teams of 3 to 15 engineers who own production pipelines and downstream data products. These teams typically sit inside fast-growing technology companies or mid-to-large enterprises modernizing their data stacks.

Common user profiles include Heads of Data, Directors of Data Engineering, Senior Data Engineers, VPs of Analytics, and VPs of Engineering. These teams treat data systems like software, expect Git-style workflows, and plan for AI participation in development.

Use Bauplan for workloads where you produce and maintain tables: ingesting files into curated datasets, building transformation pipelines, running backfills, enforcing data quality tests, and iterating quickly on logic and outputs.

Customers include Trust & Will, Moffin, Veed.io, Scops.ai, Intella, Suit Supply, RealPage, and Mediaset.

