How Data Teams Use Bauplan and Claude to Answer Stakeholder Requests Without Losing Governance

An automated loop from executive question to reviewed pipeline, using Bauplan, Claude and Linear.
Carlos Leyson
Giacomo Piccinini
Jun 23, 2026

The Problem

A non-technical stakeholder asks a sensible question about company metrics. Someone on the data team picks it up, runs a query on the lakehouse, returns the answer, and notes down the query somewhere.

This "fire and forget" approach is detrimental to both sides. For the business, it means every question has to go through a person and wait in a queue. For the data team, it means recalling some business definition, tracking down a query snippet saved somewhere in a note, and reconstructing some previous work.

With AI agents now widespread, we can let executives ask questions directly to the AI. Right?

Why Agents Make This Worse

Using tools like Claude Code as agentic data analysts has become a common pattern whenever non-technical users want to interact directly with the data.

While this eases the pressure on the data team, it also opens the door to other problems, such as poor reproducibility, wrong logic, or missing tests.

The situation is further exacerbated in large organizations, where three failure modes typically kick in:

  • Ephemeral answers: queries live in someone's Claude Desktop chat and are throwaway by nature. If the same question comes up again, it gets answered from scratch, with no protection against interpretation drift or implementation bugs.
  • Diverging pipelines: when someone on the data team eventually builds a pipeline, there is no guarantee it matches what the agent computed. The same metric gets defined differently in different pipelines, and the lakehouse accumulates contradictory sources of truth.
  • Pricing for humans: traditional lakehouses charge per query and were designed around human usage patterns. When agents run queries freely on behalf of every employee, the volume and the bill explode.

The Risk is Drift

“Drift” is what happens when the same question gets a different answer depending on when it was asked, who asked it, or which agent handled it.

For ephemeral answers, drift is at the query level: the logic is not saved anywhere, so nothing prevents the next run from using a different interpretation.
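Query-level drift can be made visible with very little machinery. The following is a minimal sketch, not part of Bauplan: `fingerprint` and `QueryLog` are hypothetical names, and the table and column names in the usage note are invented for illustration. It hashes each query after light normalization and flags any question that has been answered by more than one distinct query.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(sql: str) -> str:
    """Hash a query after normalizing whitespace, case and a trailing semicolon."""
    normalized = re.sub(r"\s+", " ", sql).strip().rstrip(";").strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


class QueryLog:
    """Hypothetical in-memory log of which queries answered which question."""

    def __init__(self) -> None:
        self._by_question: dict[str, set[str]] = defaultdict(set)

    def record(self, question: str, sql: str) -> None:
        # Questions are matched on their lowercased text only (an assumption).
        self._by_question[question.strip().lower()].add(fingerprint(sql))

    def drifted(self) -> list[str]:
        """Return the questions answered by more than one distinct query."""
        return sorted(q for q, fps in self._by_question.items() if len(fps) > 1)
```

Recording `SUM(amount)` and `SUM(net_amount)` against the same question would put that question in `drifted()`, while a reformatted copy of the same query would not. Lowercasing also changes string literals, so a real check would likely need a proper SQL parser; the sketch only illustrates the idea.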

Drift also operates at the pipeline level, where different definitions of the same metric end up living in the same lakehouse.

These two forms of drift are not new, and they are not tied to AI. The real difference between the pre- and post-AI world is the lack of friction: with agents, anyone can run a query or write a pipeline, and the sheer volume of code outpaces any human review process.

On top of that, there is a more structural “drift” in pricing: the usage patterns that agents generate differ from human usage patterns by orders of magnitude, and a platform priced for humans will misbehave (not just in the monthly bill!) when agents are in the loop.

Overall, the common thread here is the absence of a mechanism that enforces consistency. That mechanism needs to work at two levels: on the one hand, rules that define what correct looks like; on the other, a platform with the primitives to enforce them.

Rules & Primitives

The answer to drift isn’t just a better platform, and it’s not a simple list of written rules either.

Written rules give agents the context they need about what already exists: the current model graph, naming conventions, business definitions and architecture.

Once the rules are written, a platform with the right building blocks and primitives is needed. But what are these right primitives?

  1. Data branching: In traditional software engineering, no one writes directly to production. This hard-learned lesson should equally apply to data (and agentic) workflows. With data branching, agents can be let loose in a sandboxed environment without the risk of corrupting the main branch. Until a human decides otherwise, that is.
  2. Data lineage as code lineage: Data should be fully reproducible from code. When code and data are in a 1:1 correspondence, lineage follows naturally: you know exactly which code produced which table.
  3. Testing: Data needs to be tested just like code and, critically, before it reaches production. A branching system allows you to carry out extensive tests in CI/CD pipelines and only merge when data quality is assured.
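To make the interplay of branching and testing concrete, here is a toy, in-memory sketch. It is not Bauplan's API: `ToyLakehouse`, `no_nulls` and the branch and table names in the usage note are all hypothetical, and real branches would not copy data the way `deepcopy` does. It only shows the shape of the guarantee: writes happen on a branch, and a table reaches `main` only if its checks pass.

```python
from copy import deepcopy
from typing import Callable

Table = list[dict]                 # rows as plain dictionaries
Check = Callable[[Table], bool]    # a data test returns True when it passes


def no_nulls(column: str) -> Check:
    """Build a check that passes when the table is non-empty and the column has no nulls."""
    return lambda rows: len(rows) > 0 and all(r.get(column) is not None for r in rows)


class ToyLakehouse:
    """In-memory stand-in for a lakehouse with branches (illustration only)."""

    def __init__(self) -> None:
        self.branches: dict[str, dict[str, Table]] = {"main": {}}

    def create_branch(self, name: str, from_ref: str = "main") -> None:
        # A branch starts as an isolated copy of its parent.
        self.branches[name] = deepcopy(self.branches[from_ref])

    def write(self, branch: str, table: str, rows: Table) -> None:
        if branch == "main":
            raise PermissionError("direct writes to main are not allowed")
        self.branches[branch][table] = rows

    def merge(self, branch: str, table: str, checks: list[Check]) -> bool:
        """Promote a table to main only if every check passes on the branch."""
        rows = self.branches[branch][table]
        if not all(check(rows) for check in checks):
            return False
        self.branches["main"][table] = deepcopy(rows)
        return True
```

An agent could write a `revenue_by_region` table with a null revenue to its own branch as often as it likes; `merge(..., [no_nulls("revenue")])` would keep returning `False` and `main` would stay untouched until the data is fixed. In the workflow described here, the final decision to merge still belongs to a human reviewer.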

Bauplan provides all three. What follows is what this looks like in a realistic workflow.

The Workflow: Executive Questions via Claude Desktop and the Bauplan MCP

In this workflow we recreate the scenario of an executive asking questions about the company data through a chat interface (for instance, Claude Desktop). The executive never sees the lakehouse, the pipelines, or the SQL underneath: it's up to the AI agent to fetch the right data and craft an answer with tables and charts.

Now, every question resolves to one of two paths:

  1. The answer already exists as a table. The agent queries it and returns the result.
  2. The answer does not exist as a table yet. The agent computes it on the fly via an arbitrarily complex query (joining, aggregating, transforming) and returns the result promptly: we don't want the executive waiting for a data person to review the question! The agent then files a Linear issue recording the question, the query and the answer. A second downstream agent picks it up, converts it into a full-fledged pipeline and opens up a pull request for a human to review: once validated, that result becomes a permanent table.

The second path is what keeps answers consistent across executives over time: once a reviewed pipeline answers a question, the table it produces becomes the canonical source for that question.
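The two paths can be sketched as a small routing function. This is a simplified illustration under stated assumptions: `answer_question`, `Answer` and the callables passed in are hypothetical names, and the lookup from question to canonical table is reduced to a dictionary, whereas in the real workflow the agent would make that decision from the repository documents.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Rows = list[dict]


@dataclass
class Answer:
    rows: Rows
    source: str                    # "canonical" or "ad_hoc"
    issue: Optional[dict] = None   # set only on the ad hoc path


def answer_question(
    question: str,
    canonical_tables: dict[str, str],
    run_query: Callable[[str], Rows],
    draft_query: Callable[[str], str],
    file_issue: Callable[[dict], None],
) -> Answer:
    key = question.strip().lower()
    table = canonical_tables.get(key)
    if table is not None:
        # Path 1: a reviewed table already answers this question.
        return Answer(rows=run_query(f"SELECT * FROM {table}"), source="canonical")

    # Path 2: answer immediately, then record the work for the downstream agent.
    sql = draft_query(question)
    rows = run_query(sql)
    issue = {"question": question, "query": sql, "answer": rows}
    file_issue(issue)
    return Answer(rows=rows, source="ad_hoc", issue=issue)
```

The executive gets rows back on both paths; only the second one leaves a trace behind for the data team. Keeping `run_query`, `draft_query` and `file_issue` as injected callables keeps the sketch independent of any particular lakehouse or issue tracker.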

The Setup

The setup is designed to ensure executives never deal with code while sharing a common knowledge base with developers and agents. By knowledge base we mean a collection of documents at different levels of technicality: business-oriented semantics, high-level lakehouse organization, down to the code implementing the pipelines.

The point of contact between these business and technical worlds is a GitHub repository. The executives need not know what GitHub is or how to use it: they create a project in Claude or ChatGPT and add the repository to its files.

The repository contains a mixture of technical and business knowledge. Alongside the code, the following documents provide context to the agents:

  • semantics.md: definitions of what the data means, the metrics that matter, and the shared vocabulary that must be agreed on before writing any query.
  • lakehouse.md: the medallion architecture (bronze, silver and gold layers), naming conventions, and the rules a pipeline must follow.
  • workflow.md: the end-to-end flow from question to answer, including the query-versus-build decision and how human review gates each new table.
  • answering.md: instructions for how the agent should communicate with the executive and structure its answers.
  • linear.md: instructions for how to file a Linear issue when no existing table answers the question, including what information to include and how the downstream implementation agent should interpret it.
  • CLAUDE.md: operational rules for any agent working in this repository, covering git workflow, Bauplan safety rules, uv usage, and CLI versus SDK decisions.
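Since the agents depend on these documents, it may be worth checking that none of them goes missing. The snippet below is a small sketch of such a check, assuming (this is not stated above) that all six files sit at the repository root; `missing_context_docs` is a hypothetical helper name.

```python
from pathlib import Path

# The context documents listed above; their location at the repo root is an assumption.
REQUIRED_DOCS = (
    "semantics.md",
    "lakehouse.md",
    "workflow.md",
    "answering.md",
    "linear.md",
    "CLAUDE.md",
)


def missing_context_docs(repo_root: str) -> list[str]:
    """Return the required context documents that are absent from the repository root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_DOCS if not (root / name).is_file()]
```

A check like this could run in CI and fail the build when the returned list is not empty, so that an agent never starts working from an incomplete knowledge base.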

The agent in the executive's chat is further guided by a system prompt like this:

You are a data assistant. Your job is to answer business questions about company data for a non-technical executive, in this chat.

Start with answering.md, which governs how you respond. How an answer gets produced is in workflow.md. When filing an issue, consult linear.md for instructions.

Use semantics.md to map a question to what the data means, and lakehouse.md for how tables are layered. Operational rules are in CLAUDE.md.

This setup comes with three advantages:

  1. The executive's agent is always up to date with the latest state of the code: it does not need to guess.
  2. Implementation details are available if necessary: if the question is ambiguous, the code and the rest of the documentation can help the agent ground itself in what's already in the lakehouse.
  3. Documents can be versioned, too: since everything lives in GitHub, one can version and track not just code, but also semantics.

Talking to the Lakehouse: The Bauplan MCP Server

The agent in the executive chat talks to the Bauplan lakehouse through the Bauplan MCP server. An MCP server is the simplest way to let an agent interact with Bauplan without requiring complex technical setup from the executive or their IT team.

An additional practical advantage is that Claude Desktop already includes visualization capabilities, meaning that there is no need to build separate rendering code to display an answer as a pie chart or a line chart.

From Question to Permanent Table

Here is the full loop in practice:

  1. The agent answers the executive's question using the MCP server. It then opens a Linear issue containing at minimum: the question it was asked, the query it ran, and the result it returned. These three give the downstream implementation agent full visibility into the use case, along with a way to verify that the query result matches the new materialization.
  2. Linear is synced with GitHub, so the issue is mirrored as a GitHub issue. A GitHub Action picks it up, spawns a Claude Code instance on a GitHub runner, and has it implement the pipeline. This agent works on its own code branch and its own Bauplan data branch, so the main branches of both the repository and the lakehouse remain completely unaffected.
  3. Once the implementation is complete and the result verified against the original query output, a pull request is automatically opened and assigned to the data team. They validate the implementation and merge both the code and the table into main.
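The hand-off between the two agents rests on the three fields in the issue and on the final comparison against the original output. Below is a minimal sketch of both, assuming the issue body is plain JSON and the result rows are JSON-serializable; `IssuePayload` and `results_match` are hypothetical names, and no Linear or GitHub API is called.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class IssuePayload:
    question: str   # what the executive asked
    query: str      # the SQL the chat agent ran
    result: list    # the rows it returned

    def to_body(self) -> str:
        """Serialize the payload as the text of the issue description."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @staticmethod
    def from_body(body: str) -> "IssuePayload":
        """Rebuild the payload on the implementation agent's side."""
        return IssuePayload(**json.loads(body))


def results_match(expected: list, actual: list) -> bool:
    """Compare two result sets ignoring row order and key order."""
    def canon(rows: list) -> list:
        return sorted(json.dumps(row, sort_keys=True) for row in rows)
    return canon(expected) == canon(actual)
```

The implementation agent would parse the issue with `from_body`, build the pipeline on its branch, and call `results_match` on the stored result and the rows of the new table before opening the pull request. The comparison is exact, so floating-point metrics would probably need a tolerance in practice.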

The next time an executive asks the same question (say, revenue by region), the answer comes from a tested, reviewed, versioned pipeline.
