Affected Component
Python API
Current Behavior
Hi. I'm not sure whether this is an enhancement or a feature request. I have searched the documentation repeatedly and still can't find a way to export a whole DB into something like JSON Lines.
Why? I run separate processes on different machines to calculate statistics and embeddings, and when they finish I want to join all the Zvec DBs into one. So maybe the main feature I'm asking for is merging multiple DBs?
Exporting as JSON would also give better compatibility with tools written in any other programming language.
This is the error I get when I try to export a whole DB by querying with a very large topk:
File ".venv/lib/python3.14/site-packages/zvec/executor/query_executor.py", line 236, in execute
docs = self._do_execute(query_vectors, collection)
File ".venv/lib/python3.14/site-packages/zvec/executor/query_executor.py", line 189, in _do_execute
docs = collection.Query(query)
ValueError: query validate failed: topk[10000000] is too large, max is 1024
I don't know which entries are in there, so I can't fetch them one by one either. The DBs are big, with tens of millions of entries.
I know you are using RocksDB somewhere in there, so I assume it would be easy to expose a lazy iter() function over keys and values, which would solve the problem.
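To make the request concrete, here is a minimal sketch of how I would use such an iterator if it existed. Everything Zvec-related here is an assumption: `fake_lazy_iter` is a hypothetical stand-in for the requested lazy iterator (it is not an existing Zvec function), and the `(key, value)` shape of each entry is my guess. Only the JSON Lines writer uses real, standard-library calls.

```python
import json
from typing import Any, Iterable, Iterator, TextIO, Tuple

Entry = Tuple[str, dict]


def export_jsonl(entries: Iterable[Entry], out: TextIO) -> int:
    """Stream (key, value) pairs to `out`, one JSON object per line.

    Entries are consumed one at a time, so memory use stays flat even
    for tens of millions of entries. Returns the number of lines written.
    """
    count = 0
    for key, value in entries:
        out.write(json.dumps({"id": key, "value": value}, ensure_ascii=False))
        out.write("\n")
        count += 1
    return count


def fake_lazy_iter(n: int) -> Iterator[Entry]:
    """HYPOTHETICAL stand-in for the requested lazy iterator.

    A real implementation would walk the underlying store; this one just
    generates placeholder entries so the sketch can run on its own.
    """
    for i in range(n):
        payload: dict[str, Any] = {"embedding": [i, i + 1], "fields": {"n": i}}
        yield f"doc-{i}", payload


if __name__ == "__main__":
    with open("export.jsonl", "w", encoding="utf-8") as fh:
        written = export_jsonl(fake_lazy_iter(3), fh)
    print(f"wrote {written} entries")
```

With an export like this from each machine, merging would just mean reading the files line by line and inserting the entries into one target DB, without ever running a topk query.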
Desired Improvement
An export feature, a way to merge DBs, or a lazy iter() over all entries.
Impact