@Gurukiran20
Contributor

Gurukiran20 commented Jul 20, 2025

This sets up the documentation using MkDocs for the LanceDB + Ray integration project.

What is included:

  • Created index.md with an overview and guide links
  • Added quick-start.md for basic setup and usage
  • Added installation.md, usage.md, and contributing.md
  • Organized docs under docs/getting-started/
  • Added mkdocs.yml configuration for site structure

This setup will help contributors and users get started quickly and understand how to use and contribute to the project.

@github-actions bot added the documentation label Jul 20, 2025
@github-actions
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.
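For contributors who want to check a title locally before pushing, the rule can be approximated with a small script. This is a minimal sketch, not the actual "PR Title Check" action: the list of allowed types and the helper name `parse_title` are assumptions for illustration, based on commonly used Conventional Commits types.

```python
import re

# Simplified Conventional Commits header: type, optional (scope), optional "!",
# then ": " and a non-empty description.
# ASSUMPTION: this type list is illustrative; the real check may allow others.
TITLE_PATTERN = re.compile(
    r"^(?P<type>build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def parse_title(title: str):
    """Return the parsed parts of a PR title, or None if it does not match."""
    match = TITLE_PATTERN.match(title)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for candidate in (
        "docs(getting-started): add quick-start guide and restructure docs",
        "Add quick-start guide",
    ):
        print(candidate, "->", parse_title(candidate))
```

With this pattern, a title such as docs(getting-started): add quick-start guide and restructure docs parses with type docs and scope getting-started, while a title without a type prefix is rejected.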

@Gurukiran20 changed the title from "docs: Add quick-start guide and restructure documentation with MkDocs" to "docs(getting-started): add quick-start guide and restructure documentation with MkDocs" Jul 20, 2025
@Gurukiran20 changed the title from "docs(getting-started): add quick-start guide and restructure documentation with MkDocs" to "docs(getting-started): add quick-start guide and restructure documentation with mkdocs" Jul 20, 2025
@Gurukiran20 changed the title from "docs(getting-started): add quick-start guide and restructure documentation with mkdocs" to "docs(getting-started): add quick-start guide and restructure docs" Jul 20, 2025
## Installation

```bash
pip install lancedb ray
```
Collaborator

maybe pylance should be enough instead of lancedb

Contributor

+1. We want to limit the integration to be with Lance as a format, instead of LanceDB as another retrieval engine. It makes much more sense for Lance to integrate with Ray instead of LanceDB.

@chenghao-guo
Collaborator

Hi @Gurukiran20, thank you for your contribution!

I believe the project focuses on the Lance format + Ray, so pylance should be the requirement; lancedb is currently not required by the code. cc @jiaoew1991 to review as well.

@chenghao-guo
Collaborator

By the way, I have accepted your previous PR #24 (installing uv), but the markdown had some small formatting issues, including duplicated content, that hurt readability. I have reworked the markdown from that PR to make the installation guide clearer.

Please rebase this PR onto the new README.md; this docs PR can then be merged later.

@jiaoew1991
Collaborator

Yes, lancedb is not required; the project depends on pylance 😁

@Gurukiran20
Contributor Author

Resolved merge conflicts and finalized the dev installation section. Ready for review!

@chenghao-guo
Collaborator

Hi @Gurukiran20 , Thank you so much for your contribution! We really appreciate you helping improve our documentation.

Regarding the description in your docs:

  • How to create a vector store using LanceDB
  • Create and manage vector stores using LanceDB
  • Use Ray to scale embedding generation and queries

I was a bit concerned, because the project (lance-ray) is for scalable data processing with Ray and Lance; it is not ready for managing LanceDB vector stores or embeddings at this time. Could we refine the description to better reflect Lance-Ray's core functionality? To be more specific, you can take inspiration from @jiaoew1991's README documentation:

  • Distributed Lance Operations: Leverage Ray's distributed computing for Lance dataset operations
  • Seamless Data Conversion: Easy conversion between Ray datasets and Lance datasets
  • Optimized I/O: Efficient reading and writing of Lance datasets with Ray integration
  • Schema Validation: Automatic schema compatibility checking between Ray and Lance
  • Flexible Filtering: Support for complex filtering operations on distributed Lance data

Would you be open to updating the PR with something along these lines?

@Gurukiran20
Contributor Author

Thank you so much, sir, for your helpful feedback and suggestions, @chenghao-guo. I’ll update the PR to better reflect the project’s focus as you described, and I can refer to @jiaoew1991's README for guidance. If I have any questions while updating, I’ll let you know.
Thanks again!

docs/usage.md Outdated
Contributor

this doc is empty?

mkdocs.yml Outdated
@@ -0,0 +1,9 @@
site_name: Lance-Ray Documentation
Contributor

as we discussed, please look at other subprojects like lance-spark for standard mkdocs-material setup

Looks like this is the referenced documentation? https://github.com/lancedb/lance-spark/blob/main/docs/mkdocs.yml

@Gurukiran20 Let me know if you'd like some help on this, would be great to pair on it.

docs/LICENSE.md Outdated
@@ -0,0 +1,9 @@
# License
Contributor

this page is not necessary please remove.


1. **Clone the repository:**

```bash
...
```
Contributor

nit: unnecessary spaces for the code blocks in this file

@@ -0,0 +1,23 @@
# Development
Contributor

development can be a part of contributing

docs/index.md Outdated
# Read the dataset back as a Ray Dataset
ds = read_lance("example.lance")

print(ds.take(3))
Contributor

missing end backquotes

@@ -0,0 +1,45 @@
# Examples

## Basic Usage
Contributor

Make sure the examples in the README are all covered here. This seems to be missing the advanced example, and the basic one is also less detailed. For example, the filter example does not include print(f"Filtered count: {filtered_ds.count()}") as the README does.
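One way to catch this kind of drift is to compare the README's code blocks against the docs page line by line. The following is a rough sketch under stated assumptions: the helper names `extract_code_blocks` and `missing_lines` are hypothetical, and plain substring matching is a deliberately simple stand-in for a real comparison.

```python
import re

# Three backticks, built indirectly so this snippet can itself live in a fenced block.
FENCE = "`" * 3

# Matches a fenced block: opening fence with optional info string, body, closing fence.
BLOCK_PATTERN = re.compile(
    rf"^{FENCE}[^\n]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_code_blocks(markdown: str) -> list[str]:
    """Return the body of every fenced code block in a markdown string."""
    return [body.strip() for body in BLOCK_PATTERN.findall(markdown)]


def missing_lines(readme: str, docs: str) -> list[str]:
    """Return non-blank README code lines that never appear in the docs text."""
    missing = []
    for block in extract_code_blocks(readme):
        for line in block.splitlines():
            stripped = line.strip()
            if stripped and stripped not in docs:
                missing.append(stripped)
    return missing
```

Feeding it the text of README.md and of the examples page would list every README code line, such as the print call quoted above, that the docs page does not contain.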

@@ -0,0 +1,13 @@
# API Reference
Contributor

Let's remove this page. We should publish the API reference on Read the Docs instead.

@@ -0,0 +1,25 @@
# Installation

## Basic Installation
Contributor

Installation should be split: put

pip install lance-ray

directly in the quick start, and move the rest to the contributing guide.

Collaborator

Enabling pip install lance-ray requires a release on PyPI first; there is a to-do issue for that: #28

@Gurukiran20
Contributor Author

Gurukiran20 commented Jul 29, 2025

✅ Finalized MkDocs-based documentation setup!

  • Replaced outdated references in mkdocs.yml with the correct files
  • Confirmed local build using mkdocs serve working perfectly with zero warnings
  • Included: quick-start.md, examples.md, contributing.md, and index.md

Tested locally, everything looks great! Ready for final review.

cc: @chenghao-guo @jackye1995

@Gurukiran20
Contributor Author

A special thanks to @jackye1995: your guidance, patience, and kindness throughout this contribution meant a lot to me. 🙏
You genuinely treated me with a mentor’s care (almost like a parent teaching a child!), and I learned so much from your feedback and support.
Thank you for making my major open-source documentation task such a great experience! 😊

@Gurukiran20
Contributor Author

✅ Reverted the README.md changes as requested.

Thank you @chenghao-guo sir for the clear feedback and continued support!

cc: @chenghao-guo @jackye1995


We welcome contributions from the community!

## How to Contribute
Contributor

We are currently working to move all the sub-project docs so that they show up on the main Lance website, https://lancedb.github.io/lance. That means you can still run this website independently here, but when we publish, CI can copy the contents from here into the Lance website to build the final site. So I think it is better to align the contents with that layout.

For the contributing page, that means we only need what is specific to this project. You can imagine that this page will show up under https://lancedb.github.io/lance/community/lance-ray, so it assumes people have already read https://lancedb.github.io/lance/community/, and there is no need to repeat whatever is already there.
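The copy step described here could look roughly like the sketch below. It is only an illustration of the idea, not the actual CI job: the function name `stage_subproject_docs`, the directory layout, and the `community/lance-ray` sub-path (inferred from the URL above) are all assumptions.

```python
import shutil
from pathlib import Path


def stage_subproject_docs(source_docs: Path, site_docs: Path, subpath: str) -> Path:
    """Copy a sub-project's docs folder into the main site's docs tree.

    ASSUMPTION: the main site is built from a docs directory and sub-project
    pages live under a sub-path of it; the real publishing job may differ.
    """
    target = site_docs / subpath
    if target.exists():
        # Start from a clean folder so pages deleted upstream do not linger.
        shutil.rmtree(target)
    shutil.copytree(source_docs, target)
    return target


if __name__ == "__main__":
    # Hypothetical checkout locations, used only for illustration.
    source = Path("lance-ray/docs")
    if source.is_dir():
        print(stage_subproject_docs(source, Path("lance/docs"), "community/lance-ray"))
```

Because the pages are copied into a larger site, anything already covered by the shared community pages would appear twice if it were repeated here, which is the reason for keeping the contributing page project-specific.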

Contributor

So with that, I think we can remove everything before development setup. A document like this is sufficient:

# Contributing to lance-ray

## Development setup

Install the latest development version with all dependencies:

...

# Requirements

- Python >= 3.8

- Ray >= 2.0.0

- Lance >= 0.2.0

# Running Tests

To run all tests using [pytest](https://docs.pytest.org/):

...

To run all tests using [pytest](https://docs.pytest.org/):

```bash
pytest
```
Contributor

should use uv run pytest

```bash
git clone https://github.com/<your-username>/lance-ray.git
cd lance-ray
pip install -e .[dev]
```
Contributor

uv pip install -e .[dev]

Contributor

this page can be removed

@Gurukiran20
Contributor Author

Hi sir, I've updated the contributing guide with uv commands and removed the quick-start.md file as suggested.

Thanks @jackye1995 for the clear and detailed review!

@Gurukiran20
Contributor Author

Hi sir @jackye1995 👋 Just checking in to see if there's anything else you'd like me to change on this PR. Happy to help if needed.

@@ -1,207 +1,5 @@
# Byte-compiled / optimized / DLL files
Contributor

should not remove gitignore contents

@justinrmiller left a comment

Looks like a great start, added a couple of comments.


- Python >= 3.8

- Ray >= 2.0.0

I believe the project requires Ray 2.40.0 or greater? May want to tighten this and other dependencies?

Reference from docs:

Ray >= 2.40.0
PyLance >= 0.30.0
lance-namespace >= 0.0.5
PyArrow >= 17.0.0
Pandas >= 2.2.0
NumPy >= 2.0.0
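To see whether a local environment satisfies minimums like these, a short check with the standard library is enough. This is a sketch under assumptions: the distribution names in `MINIMUMS` are the usual PyPI spellings of the packages listed above, and `parse_version` is a simplified numeric parser written for this example (the `packaging` library is the more robust choice for real version handling).

```python
import re
from importlib import metadata

# Minimums quoted in the review comment above.
# ASSUMPTION: keys are the PyPI distribution names for those packages.
MINIMUMS = {
    "ray": "2.40.0",
    "pylance": "0.30.0",
    "lance-namespace": "0.0.5",
    "pyarrow": "17.0.0",
    "pandas": "2.2.0",
    "numpy": "2.0.0",
}


def parse_version(text: str) -> tuple[int, ...]:
    """Parse the leading numeric components of a version string."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions numerically, so that 2.40.0 counts as newer than 2.9.1."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment(minimums: dict[str, str] = MINIMUMS) -> dict[str, str]:
    """Return a mapping of package name to problem for unmet minimums."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems
```

Comparing versions as integer tuples rather than strings matters here: as plain strings, "2.9.1" would sort after "2.40.0", which would make the Ray >= 2.40.0 check pass for an older release.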

@Gurukiran20
Contributor Author

Hi @jackye1995 and @justinrmiller thanks for pointing that out and offering to help! I’ve restored the .gitignore and tightened up the dependency section in contributing.md. Everything is now aligned with the existing structure.

@Gurukiran20
Contributor Author

Hi sir @jackye1995 and @chenghao-guo Just checking in to see if there's anything else you'd like me to change on this PR

@chenghao-guo
Collaborator

chenghao-guo commented Aug 13, 2025

Hi @Gurukiran20, thank you for your contribution. I recently got back to the office and have reviewed the context. There is still something to be done regarding mkdocs.yml.

I cloned your code, set it up with mkdocs, and compared it with the lance-spark project that @jackye1995 mentioned. There are some style differences, so please refer to lance-spark's mkdocs.yml, as @justinrmiller mentioned, for the color style and logo. For example:

Current mkdocs.yml:
The color is light blue, not the deeper "Lance blue" used by the other projects.

The quick start has a 404.

Based on the existing lance-spark project, consider a lance-ray theme like:

theme:
  name: material
  palette:
    - scheme: default
      primary: indigo
      accent: indigo
      toggle:
        icon: material/brightness-7
        name: Switch to dark mode
    - scheme: slate
      primary: indigo
      accent: indigo
      toggle:
        icon: material/brightness-4
        name: Switch to light mode

It would be much better if we could sync the style and fix the link errors; Jack should be fine with it then.
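A 404 like the quick start one usually means mkdocs.yml references a page that no longer exists under the docs directory. The sketch below looks for such dangling references. It rests on assumptions: it scans the config text for anything that looks like a `.md` path instead of parsing YAML, and the helper name `missing_nav_targets` is made up for this example.

```python
import re
from pathlib import Path

# Anything that looks like a relative markdown path, e.g. "quick-start.md"
# or "getting-started/quick-start.md".
# ASSUMPTION: a text scan is good enough; a real check would parse the YAML.
MD_REF = re.compile(r"([\w./-]+\.md)\b")


def missing_nav_targets(mkdocs_yml: Path, docs_dir: Path) -> list[str]:
    """Return .md paths mentioned in mkdocs.yml that do not exist in docs_dir."""
    text = mkdocs_yml.read_text(encoding="utf-8")
    return [ref for ref in MD_REF.findall(text) if not (docs_dir / ref).is_file()]


if __name__ == "__main__":
    config = Path("mkdocs.yml")
    if config.is_file():
        for ref in missing_nav_targets(config, Path("docs")):
            print(f"nav entry points to a missing file: {ref}")
```

Running mkdocs build with the --strict flag is a complementary check, since it turns build warnings into errors.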

@Gurukiran20
Contributor Author

Hi sir @chenghao-guo and @jackye1995

Thank you both for your guidance and support throughout this documentation update journey — I truly appreciate the help. 🙏

I’ve now updated mkdocs.yml to better match the Lance blue theme and design consistency across projects, as suggested:

  • Changed the color palette from light blue to the deeper Lance blue.

  • Synced theme settings with the lance-ray style (using indigo for primary and accent in both light and dark modes).

  • Fixed the Quick Start 404 link issue.

  • Added the GitHub button in the header.

  • Included social links (Discord, Twitter, GitHub).

  • Updated the website logo and browser tab icon.

Please let me know if there’s anything else you’d like me to add or adjust

@chenghao-guo
Collaborator

  1. Is a requirements.txt missing, e.g. one listing mkdocs-material?
  2. The logo seems to be missing on my side.
@jackye1995
Contributor

Thanks for your work! I incorporated your change in #32 and marked you as coauthor.

@jackye1995 closed this Aug 25, 2025
@Gurukiran20
Contributor Author

Thank you so much, sir 🙏 It truly means a lot to be acknowledged as a co-author on the commit.
I’m a final-year B.Tech student in AI & ML and am also interning at Cydra Tech. Over the past two weeks I had exams, internship work presentations, and assignments, so I couldn’t contribute actively. From next week onwards, I’ll continue contributing to LanceDB with full focus.

Since July (past 1.5 months), this journey with LanceDB has been an incredible learning experience for me as a fresher. @chenghao-guo and @jackye1995 your guidance has helped me a lot 🙌 and I sincerely thank you both from the bottom of my heart🙏 .

@Gurukiran20
Contributor Author

Hi @jackye1995, @chenghao-guo , and @justinrmiller Good morning

It’s been a while since I last contributed, and I’m excited to be active again in the Lance-Ray project. I’m grateful that my documentation work from #25 was incorporated into #32, especially commits like 2e37d65 (“Add mkdocs integration”) and c181e6e (“use uv for doc”), which reflected the feedback we iterated on together.

However, I noticed that while @jackye1995 mentioned “marked you as coauthor” when closing #25, my name/email (Gurukiran20 gurusm1443@gmail.com) doesn’t appear in those commits and instead shows up on 4e5d773 (PyPI workflow), which I didn’t work on. Since the PR was closed (not merged), my contribution also isn’t visible under my profile’s contribution history.

Would you be open to amending 2e37d65 or c181e6e with:
Co-authored-by: Gurukiran20 <gurusm1443@gmail.com>

This would ensure proper attribution in Git history and visibility in my open-source profile.
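For reference, a co-author trailer is a line of the form Co-authored-by: Name <email> at the end of the commit message. The sketch below builds such a line and finds well-formed ones in a message; the helper names and the example identity are placeholders invented for illustration.

```python
import re

# A trailer line: "Co-authored-by: Name <email>", with the email in angle brackets.
TRAILER = re.compile(
    r"^Co-authored-by: (?P<name>[^<>]+) <(?P<email>[^<>\s]+@[^<>\s]+)>$"
)


def format_trailer(name: str, email: str) -> str:
    """Build a co-author trailer line for a commit message."""
    return f"Co-authored-by: {name} <{email}>"


def find_coauthors(message: str) -> list[tuple[str, str]]:
    """Return (name, email) pairs for every well-formed trailer in a message."""
    coauthors = []
    for line in message.splitlines():
        match = TRAILER.match(line.strip())
        if match:
            coauthors.append((match["name"].strip(), match["email"]))
    return coauthors


if __name__ == "__main__":
    # Placeholder identity, not a real contributor.
    message = "Add mkdocs integration\n\n" + format_trailer(
        "Example Contributor", "contributor@example.com"
    )
    print(find_coauthors(message))
```

A trailer without the angle brackets around the email does not match this pattern, so it is worth checking the exact form before amending a commit.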
