Skip to content

fix(store): preserve embedding on metadata-only updates#25

Merged
yhyyz merged 1 commit into
ourmem:mainfrom
doctatortot:pr/metadata-preserve
May 22, 2026
Merged

fix(store): preserve embedding on metadata-only updates#25
yhyyz merged 1 commit into
ourmem:mainfrom
doctatortot:pr/metadata-preserve

Conversation

@doctatortot

@doctatortot doctatortot commented May 21, 2026

Copy link
Copy Markdown
Contributor

Problem

LanceStore::update builds its row via memory_to_batch(memory, vector), and memory_to_batch writes a zero vector when vector is None:

None => vec![0.0; self.vector_dim as usize],

The update handler passes vector = None for any edit that does not change content (tags, state, supersede, …). Combined with merge_insert(...).when_matched_update_all(None), this silently overwrites the row's real embedding with zeros. The memory still exists and reads back fine over the API, but its vector is gone — so it is invisible to vector search, with no error and an HTTP 200.

In practice this wipes embeddings on exactly the memories edited most often (tag/state churn), and the degradation is invisible until you notice recall has quietly rotted.

Fix

When vector is None, reuse the row's existing embedding via get_vector_by_id instead of letting memory_to_batch zero it.

Test

test_metadata_only_update_preserves_vector: create with a vector → metadata-only update (vector = None) → assert the stored vector is unchanged (and non-zero).

update() passed vector=None for metadata-only changes (state, tags, supersede-state). memory_to_batch then wrote a zero vector which merge_insert.when_matched_update_all() committed, silently wiping the row's embedding and making refined memories invisible to vector search. Reuse the stored vector via get_vector_by_id when no new embedding is supplied. Adds a regression test (test_metadata_only_update_preserves_vector).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@doctatortot

Copy link
Copy Markdown
Contributor Author

CI here is red only because of pre-existing rustfmt drift on main (the fmt step fails before build/clippy/test can run). The files changed in this PR are fmt-clean. #27 fixes that drift — once it merges, a rebase onto main turns this green.

@yhyyz yhyyz merged commit 398e8a2 into ourmem:main May 22, 2026
1 check failed
@yhyyz

yhyyz commented May 22, 2026

Copy link
Copy Markdown
Contributor

Merged and deployed — great catch! This was a serious silent data corruption bug introduced by the merge_insert migration. Metadata-only updates were quietly wiping embeddings. Now fixed. 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants