Merged
28 changes: 19 additions & 9 deletions README.md
@@ -20,7 +20,6 @@ But here's what you face:

You don't need a hammer when you need a magnifying glass.

---

## What Cloudfloe Does

@@ -35,7 +34,6 @@ Cloudfloe is a lightweight, browser-based SQL interface for Apache Iceberg data

Think of it as a web-based scratchpad for your Iceberg data lake.

---

## Features

@@ -48,7 +46,6 @@ Think of it as a web-based scratchpad for your Iceberg data lake.
| **Query Stats** | Execution time, bytes scanned, rows returned |
| **Docker Ready** | One command to run locally |

---

## Quick Start

@@ -97,7 +94,6 @@ SELECT * FROM iceberg_scan('s3://your-bucket/warehouse/db/table_name') LIMIT 10;

Click **Run Query** to see your data.

---

## Query Examples

@@ -135,7 +131,6 @@ SELECT * FROM iceberg_snapshots('s3://bucket/warehouse/db/table_name');
```sql
SELECT * FROM iceberg_metadata('s3://bucket/warehouse/db/table_name');
```

---

## S3 Access Setup

@@ -173,7 +168,25 @@ aws s3 cp s3://your-bucket/warehouse/db/table_name/metadata/version-hint.text -

If these work, Cloudfloe will too.

---

## Credential Model

**User query credentials** are sent per-request in the API body, applied to a short-lived in-memory DuckDB session, and discarded when the connection closes. They are not:
- read from environment variables
- written to disk
- logged
- stored in a database

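Because `docker inspect` exposes a container's environment variables, keeping query credentials out of the environment means that inspecting the backend container will not surface the S3 keys a user is querying with.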
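
The credential lifetime described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Cloudfloe's backend code: `QueryRequest`, `QuerySession`, and `handle_query` are hypothetical names, and the standard library's in-memory `sqlite3` database stands in for DuckDB so the example runs without extra dependencies. It only shows that the keys live on the request and the session object, never in the process environment or on disk.

```python
import os
import sqlite3
from dataclasses import dataclass, field


@dataclass
class QueryRequest:
    """Hypothetical request body: the SQL plus the caller's own S3 keys."""
    sql: str
    # repr=False keeps the keys out of any log line that prints the request.
    access_key: str = field(repr=False)
    secret_key: str = field(repr=False)


class QuerySession:
    """Throwaway in-memory session; sqlite3 is a stand-in for DuckDB here."""

    def __init__(self, access_key: str, secret_key: str) -> None:
        # ":memory:" means the session never touches disk.
        self._conn = sqlite3.connect(":memory:")
        # Assumption: a real session would pass these to the engine's S3 config.
        self._credentials = (access_key, secret_key)

    def __enter__(self) -> "QuerySession":
        return self

    def __exit__(self, *exc_info) -> None:
        # Drop the keys and the connection together when the request ends.
        self._credentials = None
        self._conn.close()

    def execute(self, sql: str) -> list:
        return self._conn.execute(sql).fetchall()


def handle_query(request: QueryRequest) -> list:
    # One session per request; nothing outlives the with-block.
    with QuerySession(request.access_key, request.secret_key) as session:
        return session.execute(request.sql)


request = QueryRequest(sql="SELECT 1 + 1", access_key="demo-key", secret_key="demo-secret")
rows = handle_query(request)
leaked_to_env = "demo-secret" in os.environ.values()
```

Scoping the keys to a context manager ties their lifetime to a single request, which is the property the list above describes.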

**For self-hosted deployments**, we recommend the same discipline for any service-level credentials you add (e.g. upstream databases, metadata stores):

- **Docker Swarm / Compose**: use [Docker secrets](https://docs.docker.com/engine/swarm/secrets/) with the `*_FILE` env var convention, where the variable holds the path to a file containing the secret rather than the secret itself. Many official images (MinIO, Postgres, etc.) support it natively.
- **Kubernetes**: mount a `Secret` as a read-only volume, for example under `/run/secrets/` or a dedicated config directory, instead of injecting it as an env var.
- **AWS**: prefer IAM roles for the compute layer (EC2 instance profile, ECS task role, EKS IRSA) over baking long-lived access keys (`AKIA...` key IDs) into env vars.
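
For a service you write yourself, the `*_FILE` convention is straightforward to support. The sketch below is one possible reading of that convention, not code from Cloudfloe: `read_secret` and the `METASTORE_PASSWORD` setting are hypothetical names chosen for illustration.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def read_secret(name: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Resolve a setting using the `*_FILE` convention.

    If NAME_FILE is set, read the value from that file (for example a mounted
    Docker or Kubernetes secret); otherwise fall back to NAME itself.
    """
    file_path = env.get(f"{name}_FILE")
    if file_path:
        # Secret files often end with a trailing newline; strip it.
        return Path(file_path).read_text(encoding="utf-8").strip()
    return env.get(name)


# Usage: simulate a secret mounted as a file.
with tempfile.TemporaryDirectory() as secrets_dir:
    secret_file = Path(secrets_dir) / "metastore_password"
    secret_file.write_text("s3cr3t\n", encoding="utf-8")
    from_file = read_secret(
        "METASTORE_PASSWORD", {"METASTORE_PASSWORD_FILE": str(secret_file)}
    )

# Fallback to the plain variable, and the unset case.
from_env = read_secret("METASTORE_PASSWORD", {"METASTORE_PASSWORD": "plain"})
missing = read_secret("METASTORE_PASSWORD", {})
```

Preferring the file over the plain variable means the secret value never has to appear in the container's environment.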

The bundled `docker-compose.yml` uses plain env vars for the demo MinIO (public credentials `cloudfloe` / `cloudfloe123`). That is not the right pattern for real data; it only keeps `docker compose up` a single command. Don't copy the demo pattern for production storage.


## Limitations

@@ -192,7 +205,6 @@ If these work, Cloudfloe will too.

If your table has deletes, compact it first using Spark, Trino, or the Iceberg CLI before querying with Cloudfloe.

---

## Troubleshooting

@@ -225,7 +237,6 @@ The probe couldn't read any Iceberg metadata at the path. Most common causes, ro

Cloudfloe is read-only by design — the backend parses every query and rejects anything that isn't a single SELECT/WITH/UNION/VALUES statement. Rewrite the query as a SELECT, or use Spark / Trino / DuckDB CLI directly for write workloads.
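
To make the rule concrete, here is a simplified Python sketch of a single-statement, read-only check. It is an assumption-laden illustration rather than Cloudfloe's actual parser: `split_statements` and `is_read_only` are hypothetical helpers, and checking only the first keyword is weaker than real parsing (for example, it does not inspect what follows a `WITH` clause), so treat it as a way to see the rule, not as a security boundary.

```python
import re

# A UNION query still begins with SELECT, WITH, VALUES, or an opening parenthesis.
ALLOWED_FIRST_KEYWORDS = {"SELECT", "WITH", "VALUES"}


def split_statements(sql: str) -> list:
    """Split on semicolons that sit outside quotes and comments."""
    statements, current = [], []
    i, n = 0, len(sql)
    while i < n:
        ch = sql[i]
        if ch in ("'", '"'):
            # Copy a quoted string or identifier verbatim up to its closing quote.
            end = sql.find(ch, i + 1)
            end = n - 1 if end == -1 else end
            current.append(sql[i:end + 1])
            i = end + 1
        elif sql.startswith("--", i):
            # Line comment: skip to the end of the line.
            end = sql.find("\n", i)
            i = n if end == -1 else end
        elif sql.startswith("/*", i):
            # Block comment: replace it with a space so tokens do not merge.
            end = sql.find("*/", i + 2)
            i = n if end == -1 else end + 2
            current.append(" ")
        elif ch == ";":
            statements.append("".join(current).strip())
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    statements.append("".join(current).strip())
    return [s for s in statements if s]


def is_read_only(sql: str) -> bool:
    """Accept exactly one statement whose first keyword is on the allow-list."""
    statements = split_statements(sql)
    if len(statements) != 1:
        return False
    # Queries may be wrapped in parentheses, e.g. "(SELECT 1) UNION (SELECT 2)".
    body = statements[0].lstrip("( \t\r\n")
    match = re.match(r"[A-Za-z_]+", body)
    return bool(match) and match.group(0).upper() in ALLOWED_FIRST_KEYWORDS


accepted = is_read_only("SELECT * FROM t LIMIT 10;")
rejected_write = is_read_only("DROP TABLE t")
rejected_multi = is_read_only("SELECT 1; DROP TABLE t")
```

Rejecting multi-statement input matters as much as the keyword check, since a write could otherwise ride along after a harmless `SELECT`.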

---

## Architecture

@@ -248,7 +259,6 @@ Cloudfloe is read-only by design — the backend parses every query and rejects
```
+-----------------+
```

---

## Local Development

8 changes: 4 additions & 4 deletions docker-compose.yml
@@ -65,10 +65,10 @@ services:
     container_name: cloudfloe-backend
     ports:
       - "8000:8000"
-    environment:
-      - MINIO_ENDPOINT=minio:9000
-      - MINIO_ACCESS_KEY=cloudfloe
-      - MINIO_SECRET_KEY=cloudfloe123
+    # Query credentials arrive per-request in the API body and are only
+    # applied to a short-lived in-memory DuckDB session; they are never
+    # sourced from env vars or written to disk. See README → "Credential
+    # Model" for the self-hosting pattern.
     depends_on:
       minio:
         condition: service_healthy