The programmable Lakehouse. Load, transform, query, run, schedule, and replay, all from your code.

Branch

Run

Query

Merge

Bring your data and code. We do the rest.

Branch. Create sandboxed branches of your data lake to develop pipelines without disrupting your production applications.

Run. Build complex SQL and Python pipelines, without dealing with containers, compute clusters and infrastructure.

Query. Run complex queries to explore data and power your data applications with the same runtime.

Merge. Integrate all your data workflows with your orchestration and CI/CD.
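The four steps above can be pictured with a toy, in-memory model. This is only a sketch of the branch, run, query, merge idea: `ToyLake` and its methods are hypothetical names made up for illustration, not the platform's API, and a real lakehouse would version table metadata in object storage rather than copy Python dicts.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLake:
    """Hypothetical in-memory stand-in for a versioned data lake."""
    branches: dict = field(default_factory=lambda: {"main": {}})

    def branch(self, name, source="main"):
        # A branch starts as a copy of the source branch's table map.
        self.branches[name] = dict(self.branches[source])

    def run(self, branch, table, build):
        # "Run" one pipeline step: build a table from the branch's tables.
        self.branches[branch][table] = build(self.branches[branch])

    def query(self, branch, table):
        # Read a table as it exists on the given branch.
        return self.branches[branch][table]

    def merge(self, source, target="main"):
        # Publish the source branch's tables into the target branch.
        self.branches[target].update(self.branches[source])


lake = ToyLake()
lake.branches["main"]["orders"] = [{"id": 1, "amount": 40}, {"id": 2, "amount": 60}]

# Branch: develop in isolation from main.
lake.branch("dev")

# Run: add a derived table on the dev branch only.
lake.run("dev", "revenue", lambda t: [{"total": sum(r["amount"] for r in t["orders"])}])

# Query: inspect the result on dev; main is still untouched.
dev_result = lake.query("dev", "revenue")
main_has_revenue_before = "revenue" in lake.branches["main"]

# Merge: publish the new table to main.
lake.merge("dev")
main_result = lake.query("main", "revenue")
```

Until the merge, nothing written on `dev` is visible on `main`, which is the property that lets pipeline development proceed without disrupting production readers.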

Data Lake version control

Instant branching of your data lake

Enable teams to develop new pipelines and create new tables, while maintaining data integrity and system performance. Move fast, don’t break things.

Make everything reproducible

Track every change to your data and your code, and program all your workflows with a few lines of Python. Every issue can be reproduced; every incident can be rolled back.

Avoid lock-in

Keep your data in object storage and use Iceberg tables for seamless query engine and system integration. Your code is fully abstracted from infrastructure, eliminating the need for refactoring.

Serverless runtime

10x better developer experience

Deploy data pipelines in the cloud in seconds from code. No special skills required, no need to deal with containerization, compute provisioning and cluster configurations ever again. Just SQL and Python.

No environment management

Define containers and environment requirements directly in code for each workload function. Never worry about environment maintenance and backward compatibility.
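As a rough illustration of declaring environment requirements next to each function, the sketch below uses a hypothetical `workload` decorator defined in the block itself. The decorator name, its parameters, and the `REGISTRY` dict are assumptions for illustration only, not the platform's API. Here the requirements are merely recorded; a serverless runtime would presumably use such a declaration to build an isolated environment per function.

```python
# Hypothetical registry mapping function names to their declared environments.
REGISTRY = {}


def workload(python="3.11", pip=None):
    """Hypothetical decorator: record the environment a function asks for."""
    def wrap(fn):
        REGISTRY[fn.__name__] = {"python": python, "pip": dict(pip or {})}
        return fn  # the function itself is left unchanged
    return wrap


@workload(python="3.11", pip={"pandas": "2.2.0"})
def clean_orders(rows):
    # Drop rows with non-positive amounts.
    return [r for r in rows if r["amount"] > 0]


@workload(python="3.10", pip={"numpy": "1.26.4"})
def total_amount(rows):
    # Sum the amounts of the rows it receives.
    return sum(r["amount"] for r in rows)


cleaned = clean_orders([{"amount": 5}, {"amount": -1}])
total = total_amount(cleaned)
```

Because each function carries its own declaration, two steps of the same pipeline can ask for different interpreter and package versions without a shared environment to maintain.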

Interactive SQL analytics

Explore data and build real-time analytics applications. Use one compute engine for both pipelines and synchronous queries.

Join our private alpha
