Description
I checked the docs but don't see any types listed for 1-bit vectors. Are there any plans to support binary vector types? Working directly with 1-bit vectors reduces the memory footprint (especially helpful for mobile/edge), and using binary vectors for a brute-force search (very fast with Hamming distance via XOR), then rescoring the candidates with higher-quality vectors, is a great technique for getting both speed and accuracy (see the related Hugging Face article).
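To make the technique concrete, here is a minimal NumPy sketch of the search-then-rescore flow. It is not this library's API: the function names, the sign-based thresholding, and the `oversample` factor are all assumptions chosen for illustration.

```python
import numpy as np

# Lookup table: number of set bits for every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def binarize(vectors):
    """Quantize to 1 bit per dimension (assumed rule: bit = value > 0), packed 8 per byte."""
    return np.packbits(np.asarray(vectors) > 0, axis=-1)


def hamming(query_bits, corpus_bits):
    """Hamming distance from one packed query to every packed corpus row (XOR + popcount)."""
    diff = np.bitwise_xor(corpus_bits, query_bits)
    return POPCOUNT[diff].sum(axis=1, dtype=np.int64)


def search(query, corpus, corpus_bits, k=10, oversample=4):
    """Brute-force a Hamming shortlist, then rescore it with the full-precision vectors."""
    n_candidates = min(len(corpus), k * oversample)  # hypothetical oversampling factor
    dists = hamming(binarize(query), corpus_bits)
    shortlist = np.argpartition(dists, n_candidates - 1)[:n_candidates]
    scores = corpus[shortlist] @ query  # dot product; assumes normalized vectors
    return shortlist[np.argsort(-scores)[:k]]
```

With 1 bit per dimension, a 1024-dimensional vector takes 128 bytes instead of 4096 bytes as float32, which is the source of the 32x memory saving. The byte lookup table here only stands in for a hardware popcount; a SIMD implementation would do the same XOR and bit count over much wider registers.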
I'm currently using an embedding model optimized for binary quantization via Turso (using float1bit for the neighbor graph), as binary vectors are the only way to keep a DiskANN index at a reasonable size at scale. However, as noted in this Substack article, the indexing process for DiskANN breaks down at larger scales: insert speeds become unviable after about 250k records. Brute-forcing binary vectors instead of using an index works, but unfortunately the libSQL Hamming function doesn't appear to be SIMD-optimized like sqlite-vec's, and this lack of optimization leaves an order-of-magnitude performance improvement on the table.
BTW, great work on this library – happy to see more options for vectors with SQLite!