feat: add Tencent GooseFS Table Master namespace implementation (Python) #121

Description

@XuQianJin-Stars

Summary

Add a new Lance Namespace implementation backed by Tencent Cloud GooseFS Table Master to the Python lance-namespace-impls package.

GooseFS Table Master is a Lance-native metadata service exposed over gRPC. Unlike general-purpose catalogs (Hive, Glue, Iceberg, Polaris, Unity), it is purpose-built for Lance: every registered table is a Lance table, so the client does not need to filter on a table_type=lance marker.

The new implementation acts as a thin pass-through layer that translates each Lance Namespace request into the corresponding *PRequest defined in the GooseFS Table Master gRPC schema and forwards it through the official goosefs-metastore-client library.
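As a rough illustration of the pass-through idea, the sketch below forwards a request's identifier unchanged into a gRPC-style message. Every name in it (DescribeTableRequest, DescribeTablePRequest, FakeTableMasterClient) is a hypothetical stand-in defined locally; none of them is taken from goosefs-metastore-client or from this PR's code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DescribeTableRequest:
    """Hypothetical, simplified stand-in for a Lance Namespace request."""
    id: List[str]


@dataclass
class DescribeTablePRequest:
    """Hypothetical stand-in for a Table Master *PRequest message."""
    id: List[str]


@dataclass
class FakeTableMasterClient:
    """Records requests instead of sending gRPC calls (illustration only)."""
    sent: List[DescribeTablePRequest] = field(default_factory=list)

    def describe_table(self, request: DescribeTablePRequest) -> Dict[str, List[str]]:
        self.sent.append(request)
        return {"id": request.id}


def describe_table(
    client: FakeTableMasterClient, request: DescribeTableRequest
) -> Dict[str, List[str]]:
    # Pass-through: copy the identifier verbatim into the wire-level request.
    # No table_type filtering happens here, since every table is a Lance table.
    return client.describe_table(DescribeTablePRequest(id=list(request.id)))


client = FakeTableMasterClient()
response = describe_table(client, DescribeTableRequest(id=["sales", "orders"]))
```

The point of the design is that the client layer holds no catalog logic of its own; anything beyond field copying is left to the service.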

Motivation

  • Provide first-class Lance Namespace support for users running Lance on top of Tencent Cloud GooseFS / COS.
  • Cover the full Lance Namespace surface (basic CRUD + advanced operations such as indexing, tags, versioning, and transactions) without the overhead of mapping into a generic catalog model.
  • Match the integration pattern already established for Hive, Glue, Iceberg, Polaris, and Unity.

Scope

Core implementation

  • New GooseFSNamespace class registered as "goosefs" in LanceNamespaces
  • 2-level namespace hierarchy (database > table); identifier is forwarded verbatim to the Table Master
  • Configurable connection (uri / host / port / timeout / max_retries), authentication (authentication_enabled / username / impersonation_user), and namespace-level defaults (manifest_enabled / dir_listing_enabled)
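To make the configuration surface concrete, here is a minimal sketch of how the string properties listed above might be parsed into a typed object. The property names come from this description; everything else is an assumption made for illustration: the GooseFSConfig and parse_config names, the "host:port" form of uri, uri taking precedence over host / port, and booleans defaulting to false. The authoritative defaults and formats are whatever docs/src/goosefs.md specifies.

```python
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import urlsplit


def _as_bool(value: Optional[str]) -> bool:
    # Assumption: unset means False; "true" is matched case-insensitively.
    return value is not None and value.strip().lower() == "true"


@dataclass(frozen=True)
class GooseFSConfig:
    """Hypothetical typed view of the namespace properties."""
    host: str
    port: int
    timeout: Optional[float]
    max_retries: Optional[int]
    authentication_enabled: bool
    username: Optional[str]
    impersonation_user: Optional[str]
    manifest_enabled: bool
    dir_listing_enabled: bool


def parse_config(props: Mapping[str, str]) -> GooseFSConfig:
    uri = props.get("uri")
    if uri:
        # Assumption: uri looks like "host:port", optionally with a scheme.
        parts = urlsplit(uri if "://" in uri else f"//{uri}")
        host, port = parts.hostname, parts.port
    else:
        host = props.get("host")
        port = int(props["port"]) if "port" in props else None
    if not host or port is None:
        raise ValueError("set either 'uri' or both 'host' and 'port'")
    return GooseFSConfig(
        host=host,
        port=port,
        timeout=float(props["timeout"]) if "timeout" in props else None,
        max_retries=int(props["max_retries"]) if "max_retries" in props else None,
        authentication_enabled=_as_bool(props.get("authentication_enabled")),
        username=props.get("username"),
        impersonation_user=props.get("impersonation_user"),
        manifest_enabled=_as_bool(props.get("manifest_enabled")),
        dir_listing_enabled=_as_bool(props.get("dir_listing_enabled")),
    )


# The address and user below are placeholder example values.
example = parse_config(
    {"uri": "localhost:9200", "authentication_enabled": "true", "username": "alice"}
)
```

Keeping the properties as a flat string-to-string mapping mirrors how the other namespace implementations are configured, so a mapping like the one above would presumably be supplied when selecting the "goosefs" implementation.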

Operations covered

  • Namespace: Create / List / Describe / Drop / Exists
  • Table: Create / CreateEmpty / Declare / Register / List / Describe / Drop / Deregister / Exists / Rename
  • Data plane: Insert / MergeInsert / Delete / Update / Query / CountRows
  • Schema evolution: AddColumns / AlterColumns / DropColumns / UpdateSchemaMetadata
  • Indexing: CreateIndex / CreateScalarIndex / ListIndices / DescribeIndexStats / DropIndex
  • Tags & versioning: Create / Update / Delete / List / Get tag; Create / List / Describe version; Restore; plus Batch variants
  • Transactions & query planning: AlterTransaction / DescribeTransaction / ExplainQueryPlan / AnalyzeQueryPlan / GetTableStats

Packaging & build

  • New optional extra goosefs in python/pyproject.toml (goosefs-metastore-client==0.1.7, grpcio>=1.78, grpcio-status>=1.78)
  • New Makefile targets: lint-goosefs, install-goosefs, test-goosefs, integ-test-goosefs
  • Unit tests: python/tests/test_goosefs.py
  • Integration tests: python/tests/test_goosefs_integration.py

Documentation

  • New page docs/src/goosefs.md describing background, configuration properties, object mapping, Lance-table identification, and per-operation semantics
  • docs/src/.pages updated to include Tencent GooseFS: goosefs.md in the navigation

Out of Scope

  • A Java implementation. This PR only adds the Python binding; a Java counterpart can be added in a follow-up if there is demand.
  • Bundling the GooseFS gRPC .proto files in this repo; they continue to ship inside goosefs-metastore-client.

Verification

  • ruff format --check and ruff check pass on the new files and across the whole python/ tree
  • python -m compileall succeeds for src/ and tests/
  • Unit tests in python/tests/test_goosefs.py pass with the goosefs extra installed
  • Integration tests in python/tests/test_goosefs_integration.py pass against a live GooseFS Table Master (skipped automatically when the service is not reachable)
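One common way to get the "skipped automatically when the service is not reachable" behaviour is a short TCP probe evaluated before the test class is collected. The sketch below uses only the standard library; the environment variable names, the default address, and the class name are hypothetical, and the actual tests in this PR may detect reachability differently.

```python
import os
import socket
import unittest


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Hypothetical variable names and defaults; adjust to the real test setup.
HOST = os.environ.get("GOOSEFS_TEST_HOST", "localhost")
PORT = int(os.environ.get("GOOSEFS_TEST_PORT", "9200"))


@unittest.skipUnless(is_reachable(HOST, PORT), "GooseFS Table Master not reachable")
class GooseFSIntegrationTest(unittest.TestCase):
    def test_service_accepts_connections(self) -> None:
        self.assertTrue(is_reachable(HOST, PORT))
```

A TCP probe only shows that something is listening on the port, not that it is a healthy Table Master, so it works as a skip condition rather than a health check.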
