Remove misleading LanceDB naming #34

Open
FANNG1 wants to merge 2 commits into daft-engine:main from FANNG1:remove-lancedb-naming

FANNG1 commented Jun 19, 2026

Summary

  • Rename the internal Lance scan operator and Python factory helpers away from LanceDB terminology.
  • Update docstrings and examples to describe Lance datasets instead of LanceDB tables/client usage.
  • Move tests from tests/io/lancedb to tests/io/lance and rename affected test files, classes, functions, imports, and explain-output assertions.

Closes #33.

Checks

  • make check-format passed.
  • make lint passed.
  • uv run pytest tests/io/lance -v reported 278 passed, 5 skipped, 2 xfailed, 2 xpassed, but the process exited with code 139 after printing the summary.
  • uv run pytest -p no:benchmark tests/io/lance/test_lance_count_pushdown_coverage.py tests/io/lance/test_lance_factory_function.py tests/io/lance/test_lance_point_lookup.py tests/io/lance/test_lance_reads.py -v reported 68 passed, 2 skipped, 2 xpassed, but also exited with code 139 after printing the summary.
  • make typecheck failed on existing mypy issues across tests and a few existing Lance typing mismatches; no failures appeared specific to the naming change.
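
For context on the two pytest runs above: by POSIX shell convention, an exit status above 128 usually means the process was terminated by signal (status - 128), so 139 would correspond to signal 11, which is SIGSEGV on Linux. A minimal sketch of that decoding, where the helper name `describe_exit_status` is hypothetical and not part of this PR:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Map a shell exit status to a signal name when it follows the 128 + N convention."""
    if status > 128:
        try:
            # signal.Signals raises ValueError for numbers that are not known signals.
            return signal.Signals(status - 128).name
        except ValueError:
            return f"unknown signal {status - 128}"
    return f"exit code {status}"
```

Calling `describe_exit_status(139)` on Linux yields the signal name for signal 11, which is consistent with a crash after the test summary was printed rather than a test failure.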

Notes

  • The only remaining lancedb text from rg -i "lancedb|lance db" is the factual upstream URL https://github.com/lancedb/lance/....

FANNG1 commented Jun 23, 2026

Author

@rchowell, could you help review this PR? Thanks!

@rchowell

Contributor

Hey @FANNG1 - thanks for the PR!

Renaming classes would be a breaking change. Could you please re-export the existing public class names with a deprecation notice?

Thank you.

FANNG1 commented Jun 25, 2026

Author

Thanks for the review. I updated the PR to keep the existing LanceDB-named scan symbols as deprecated compatibility aliases, while the new Lance names remain the preferred ones.

Specifically, I kept these old names, which appear in https://github.com/Eventual-Inc/Daft/blob/62cb100858c43a5a92f3bac33d6ecdcefbfa8f8d/daft/io/lance/lance_scan.py#L10-L12, as aliases for the new ones:

  • LanceDBScanOperator -> LanceScanOperator
  • _lancedb_table_factory_function -> _lance_table_factory_function
  • _lancedb_count_result_function -> _lance_count_result_function

Each alias now emits a DeprecationWarning and forwards to the new Lance-named implementation. I also added regression coverage for the deprecated aliases.

Development

Successfully merging this pull request may close these issues.

Remove misleading LanceDB naming from Lance integration code