
feat(script): add a bootstrap script for setting up Lakekeeper in the local development environment #4273

Open
mengw15 wants to merge 15 commits into apache:main from mengw15:Lakekeeper-bootstrap-script

@mengw15 mengw15 commented Mar 9, 2026

What changes were proposed in this PR?

Lakekeeper is an open-source, Apache-licensed Iceberg REST catalog implementation. This PR adds a bootstrap script so that developers can set up Lakekeeper conveniently.

Lakekeeper provides pre-built binaries for macOS ARM64, Linux x86_64, and Linux ARM64 (releases). Windows is not supported by Lakekeeper upstream, so this script targets macOS and Linux developers only.

Any related issues, documentation, discussions?

Issue: closes #4353.
Documentation: wiki page.

How was this PR tested?

Manually tested by launching Lakekeeper with this script and setting it as the REST catalog for Texera.

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Code

@mengw15 mengw15 changed the title feat: introduce Result Service using Lakekeeper as REST catalog for Iceberg - bootstrap script feat: introduce Result Service using Lakekeeper as REST catalog for Iceberg - bootstrap script Mar 9, 2026
@github-actions github-actions bot added the ddl-change Changes to the TexeraDB DDL label Mar 30, 2026
@mengw15 mengw15 self-assigned this Apr 6, 2026

@bobbai00 bobbai00 left a comment

Thanks for putting this together! A few suggestions:

1. Remove parse-storage-config.py

The Python script + pyhocon dependency adds significant friction (not everyone has pyhocon installed, and it's a niche package). This can be eliminated entirely because storage.conf already supports env var overrides via ${?STORAGE_*} syntax. The bootstrap script can read the same env vars with the same defaults:

LAKEKEEPER_BASE_URI="${STORAGE_ICEBERG_CATALOG_REST_URI:-http://localhost:8181/catalog}"
WAREHOUSE_NAME="${STORAGE_ICEBERG_CATALOG_REST_WAREHOUSE_NAME:-texera}"
S3_REGION="${STORAGE_ICEBERG_CATALOG_REST_REGION:-us-west-2}"
S3_BUCKET="${STORAGE_ICEBERG_CATALOG_REST_S3_BUCKET:-texera-iceberg}"
S3_ENDPOINT="${STORAGE_S3_ENDPOINT:-http://localhost:9000}"
S3_USERNAME="${STORAGE_S3_AUTH_USERNAME:-texera_minio}"
S3_PASSWORD="${STORAGE_S3_AUTH_PASSWORD:-password}"

2. Remove awscli dependency for MinIO bucket operations

Requiring awscli just to create a bucket is heavy. Please use curl with the S3 API directly (a simple PUT to http://endpoint/bucket-name).
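
A rough sketch of what the curl-based bucket step might look like. It assumes the S3 endpoint requires SigV4-signed requests, which a recent curl can produce through its `--aws-sigv4` option. The helper names `bucket_url` and `ensure_bucket` are hypothetical and not taken from the PR, and the status handling only covers the standard S3 HeadBucket and CreateBucket responses.

```shell
#!/usr/bin/env bash
# Sketch only: create the S3 bucket with curl instead of awscli.
# Same env vars and defaults as suggested in item 1.
S3_ENDPOINT="${STORAGE_S3_ENDPOINT:-http://localhost:9000}"
S3_BUCKET="${STORAGE_ICEBERG_CATALOG_REST_S3_BUCKET:-texera-iceberg}"
S3_REGION="${STORAGE_ICEBERG_CATALOG_REST_REGION:-us-west-2}"
S3_USERNAME="${STORAGE_S3_AUTH_USERNAME:-texera_minio}"
S3_PASSWORD="${STORAGE_S3_AUTH_PASSWORD:-password}"

# Hypothetical helper: join endpoint and bucket into a path-style URL,
# dropping one trailing slash from the endpoint if present.
bucket_url() {
  printf '%s/%s\n' "${1%/}" "$2"
}

# Hypothetical helper: create the bucket only if it does not exist yet.
ensure_bucket() {
  url="$(bucket_url "$S3_ENDPOINT" "$S3_BUCKET")"

  # HeadBucket: 200 means the bucket exists, 404 means it does not.
  status="$(curl --silent --output /dev/null --write-out '%{http_code}' \
    --head \
    --user "${S3_USERNAME}:${S3_PASSWORD}" \
    --aws-sigv4 "aws:amz:${S3_REGION}:s3" \
    "$url")"

  case "$status" in
    200)
      echo "Bucket '${S3_BUCKET}' already exists."
      ;;
    404)
      # CreateBucket: a signed PUT to the bucket URL.
      status="$(curl --silent --output /dev/null --write-out '%{http_code}' \
        --request PUT \
        --user "${S3_USERNAME}:${S3_PASSWORD}" \
        --aws-sigv4 "aws:amz:${S3_REGION}:s3" \
        "$url")"
      if [ "$status" = "200" ]; then
        echo "Bucket '${S3_BUCKET}' created."
      else
        echo "Failed to create bucket '${S3_BUCKET}' (HTTP ${status})." >&2
        return 1
      fi
      ;;
    *)
      echo "Unexpected response while checking bucket (HTTP ${status})." >&2
      return 1
      ;;
  esac
}
```

The script would then call `ensure_bucket` in its bucket step. An unsigned PUT would likely only work against an endpoint that allows anonymous bucket creation, so the signing flags are included here; whether the installed curl version and the target server agree on the signed headers should be verified locally.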

3. Hardcoded binary path requires editing the script

LAKEKEEPER_BINARY_PATH="" at line 37 forces users to edit the script source. Consider:

  • Accept it as a CLI argument or env var
  • Default to lakekeeper (assuming it's on $PATH)
  • Mention brew install lakekeeper in the setup instructions (works on macOS)
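
The options above could be combined roughly as follows. This is a minimal sketch, assuming the first CLI argument takes precedence over the `LAKEKEEPER_BINARY_PATH` env var, which in turn takes precedence over a `lakekeeper` found on `$PATH`. The function name `resolve_lakekeeper_binary` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: resolve the Lakekeeper binary without editing the script.
# Precedence: first argument, then $LAKEKEEPER_BINARY_PATH, then "lakekeeper" on $PATH.
resolve_lakekeeper_binary() {
  candidate="${1:-${LAKEKEEPER_BINARY_PATH:-lakekeeper}}"

  case "$candidate" in
    */*)
      # Explicit path: it must point at an executable file.
      if [ -x "$candidate" ]; then
        resolved="$candidate"
      else
        resolved=""
      fi
      ;;
    *)
      # Bare command name: look it up on $PATH.
      resolved="$(command -v "$candidate" 2>/dev/null || true)"
      ;;
  esac

  if [ -z "$resolved" ]; then
    echo "Lakekeeper binary not found: '${candidate}'." >&2
    echo "Pass a path as the first argument or set LAKEKEEPER_BINARY_PATH." >&2
    return 1
  fi

  printf '%s\n' "$resolved"
}
```

A script could then use `LAKEKEEPER_BIN="$(resolve_lakekeeper_binary "$1")" || exit 1` near the top, so a missing binary is reported before any other step runs.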

4. Cross-platform considerations

Lakekeeper provides pre-built binaries for macOS ARM, Linux x86_64, and Linux ARM. It seems Windows is not supported. Please mention this in the PR description.
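
If the script should also fail fast on unsupported platforms, a guard along these lines might work. It is a sketch under the assumption that `uname -s` and `uname -m` are enough to tell the three supported targets apart. The function name `detect_platform` and the labels it prints are hypothetical and are not Lakekeeper release asset names.

```shell
#!/usr/bin/env bash
# Sketch only: map the current OS/architecture to one of the three platforms
# listed above, and fail for anything else (for example Windows shells).
# Arguments are optional overrides, mainly useful for testing.
detect_platform() {
  os="${1:-$(uname -s)}"
  arch="${2:-$(uname -m)}"

  case "${os}/${arch}" in
    Darwin/arm64)
      echo "macos-arm64"
      ;;
    Linux/x86_64)
      echo "linux-x86_64"
      ;;
    Linux/aarch64 | Linux/arm64)
      echo "linux-arm64"
      ;;
    *)
      echo "Unsupported platform: ${os}/${arch}." >&2
      return 1
      ;;
  esac
}
```

Calling `detect_platform >/dev/null || exit 1` at the start of the script would stop early with a clear message instead of failing later when the binary cannot run.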

@mengw15 mengw15 closed this Apr 7, 2026
@mengw15 mengw15 reopened this Apr 7, 2026
@github-actions github-actions bot removed the python label Apr 8, 2026
@bobbai00 bobbai00 changed the title feat: introduce Result Service using Lakekeeper as REST catalog for Iceberg - bootstrap script feat(script): add a bootstrap script for setting up Lakekeeper in the local development environment Apr 8, 2026

@bobbai00 bobbai00 left a comment

One minor comment

echo ""

# Step 3: Check and create MinIO bucket
echo "Step 3: Checking MinIO bucket..."

MinIO => S3

Labels

build ddl-change Changes to the TexeraDB DDL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Lakekeeper bootstrap

2 participants