feat: support zonemap indexes in ALTER TABLE CREATE INDEX #466

Closed
beinan wants to merge 3 commits into lance-format:main from beinan:user/beinan/zonemap-create-index

Conversation

@beinan beinan commented Apr 21, 2026

Contributor

Summary

  • add single-column zonemap support for ALTER TABLE ... CREATE INDEX via Lance direct createIndex instead of fragment training
  • surface zonemap indexes in SHOW INDEXES and scan planning, and fix numeric zonemap pruning across mixed numeric types
  • clarify and test that multi-column zonemap is not yet supported, with a clearer Spark-side error and updated docs
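The pruning fix mentioned above can be illustrated with a small sketch. It assumes a zonemap keeps one min/max pair per fragment and that the query literal may have a different numeric type than the column (for example an integer column filtered with a floating-point literal). The names `Zone`, `to_decimal`, and `fragments_for_equality` are hypothetical and are not taken from this PR; normalizing to `Decimal` is just one way to get a common comparison type, not necessarily what the Spark-side pruner does.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Union

Number = Union[int, float, Decimal]


def to_decimal(value: Number) -> Decimal:
    """Convert ints and floats to Decimal so every comparison uses one type."""
    # Decimal(int) and Decimal(float) are exact conversions, so no value is rounded.
    return value if isinstance(value, Decimal) else Decimal(value)


@dataclass(frozen=True)
class Zone:
    """Hypothetical zonemap entry: min/max of the indexed column in one fragment."""
    fragment_id: int
    min_value: Number
    max_value: Number


def fragments_for_equality(zones: List[Zone], literal: Number) -> List[int]:
    """Return ids of fragments whose [min, max] range could contain `literal`."""
    target = to_decimal(literal)
    kept = []
    for zone in zones:
        # A fragment can be skipped only when the literal lies outside its range.
        if to_decimal(zone.min_value) <= target <= to_decimal(zone.max_value):
            kept.append(zone.fragment_id)
    return kept


# Integer column statistics, floating-point literal in the predicate.
zones = [Zone(0, 1, 10), Zone(1, 11, 20), Zone(2, 21, 30)]
matching = fragments_for_equality(zones, 15.0)
```

With these statistics only fragment 1 survives pruning for the literal `15.0`, and a literal such as `10.5` that falls between two ranges prunes every fragment.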

Testing

  • ./mvnw -pl lance-spark-4.0_2.13,lance-spark-4.1_2.13 -Dtest=AddIndexTest,ShowIndexesTest,ZonemapFragmentPrunerTest,CreateIndexStandardSyntaxTest -Dsurefire.failIfNoSpecifiedTests=false test
  • ./mvnw -pl lance-spark-4.0_2.13,lance-spark-4.1_2.13 -Dtest=AddIndexTest,ShowIndexesTest -Dsurefire.failIfNoSpecifiedTests=false test

Notes

  • generic Spark CREATE INDEX syntax accepts a column list, but current Lance core rejects multi-column zonemap creation with `LanceError(Index): Only support building index on 1 column at the moment`
  • Spark now fails earlier with `Zonemap index currently supports a single column only`
  • Python integration tests were updated but not run locally because this environment does not have pytest or the expected /home/lance/data fixture path
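The fail-early behavior described in these notes amounts to validating the column list before any index work is handed to Lance core. A minimal sketch follows; only the error message text comes from this PR, while the function name `validate_zonemap_columns` and the exception class `UnsupportedIndexError` are hypothetical stand-ins for whatever the Spark-side code actually uses.

```python
from typing import Sequence


class UnsupportedIndexError(ValueError):
    """Hypothetical error type for index requests the connector cannot serve."""


def validate_zonemap_columns(columns: Sequence[str]) -> str:
    """Return the single indexed column, or fail before any index build starts."""
    if len(columns) != 1:
        # Message text as quoted in the PR notes.
        raise UnsupportedIndexError(
            "Zonemap index currently supports a single column only"
        )
    return columns[0]
```

Checking on the Spark side means the user sees a zonemap-specific message instead of the generic Lance core error about building an index on one column.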

@github-actions github-actions Bot added the enhancement New feature or request label Apr 21, 2026
@beinan beinan marked this pull request as ready for review April 21, 2026 22:34
beinan commented May 11, 2026

Contributor Author

Closing in favor of #473 which contains all these changes plus the distributed build work.

@beinan beinan closed this May 11, 2026
hamersaw pushed a commit that referenced this pull request Jun 4, 2026
…516)

## Summary

- Add zonemap as a new index type in `CREATE INDEX` DDL with distributed
build support
- Batch fragments into configurable segments via `num_segments` option
(defaults to `spark.default.parallelism`)
- Each segment is built in parallel on Spark executors and committed as
a logical index on the driver
- Zonemap indexes currently support a single column only
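The batching step in this summary can be sketched as follows. The commit message does not say how fragments are assigned to segments, so the contiguous, near-equal split below is an assumption, and `batch_fragments` is a hypothetical helper rather than a function from `AddIndexExec.scala`. In the real job the `num_segments` default would come from `spark.default.parallelism`; here it is simply a required argument.

```python
from typing import List, Sequence


def batch_fragments(fragment_ids: Sequence[int], num_segments: int) -> List[List[int]]:
    """Split fragment ids into at most `num_segments` contiguous batches.

    Batch sizes differ by at most one, and no empty batch is produced.
    """
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    if not fragment_ids:
        return []
    # Never create more segments than there are fragments.
    count = min(num_segments, len(fragment_ids))
    base, extra = divmod(len(fragment_ids), count)
    batches: List[List[int]] = []
    start = 0
    for i in range(count):
        # The first `extra` batches take one additional fragment.
        size = base + (1 if i < extra else 0)
        batches.append(list(fragment_ids[start:start + size]))
        start += size
    return batches


segments = batch_fragments(list(range(10)), 4)
```

Each resulting batch would correspond to one segment built by an executor task, with the driver committing all segments together as one logical index, as the summary describes.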

## What Changed

- `AddIndexExec.scala`: Zonemap-specific path with
`ZonemapIndexJob`/`ZonemapIndexTask` and `commitIndexSegments`
- `create-index.md`: Document zonemap index type, options, and usage
- Tests: unit tests for segment creation/validation and integration test

## Notes

- Rebased cleanly onto current `main`
- Depends on lance-core `7.0.0-beta.10` or newer which includes zonemap
segment support
- Supersedes PR #473 and closed PR #466

## Test plan

- [x] CI passes (lint, unit tests, integration tests across all
Spark/Scala versions)
- [x] Zonemap index creation with default segment count
- [x] Zonemap index creation with explicit `num_segments`
- [x] Repeated zonemap index creation replaces existing segments
- [x] Query correctness after zonemap index creation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Beinan Wang <beinanwang@microsoft.com>
ivscheianu pushed a commit to ivscheianu/lance-spark that referenced this pull request Jun 12, 2026
…ance-format#516)
