[upstream PR 228] feat(vector): auto-index observations into vector store | 向量索引自动集成 #795

@wbugitlab1

Source: pull request 228 in rohitg00/agentmemory (URL omitted to avoid GitHub cross-reference)
Title: feat(vector): auto-index observations into vector store | 向量索引自动集成
Author: mechanic-Q
State: open
Draft: no
Merged: no
Head: mechanic-Q/agentmemory:feature/vector-auto-index @ e6a34d0
Base: main @ 1c8713f
Labels: (none)
Changed files: 0
Commits: 0
Created: 2026-05-02T07:26:41Z
Updated: 2026-05-17T09:42:42Z
Closed: (not closed)
Merged at: (not merged)

Original PR body:

Summary

Automatically add compressed observations to the VectorIndex during the observe lifecycle, enabling hybrid BM25+Vector search without manual index building.

Motivation

The VectorIndex and EmbeddingProvider infrastructure already exists, but observations were only added to the BM25 index, so the vector index stayed empty unless it was populated manually. As a result, hybrid search (BM25 + vector, fused with RRF) could not use semantic similarity; only BM25 keyword matching was active. For non-English content (Chinese, multilingual), BM25 alone is insufficient because the tokenizer does not handle CJK text well. Vector search bridges this gap.
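To make the effect of an empty vector index concrete, here is a minimal sketch of reciprocal rank fusion (RRF) over two ranked ID lists. It is not agentmemory's implementation: the function name `rrfFuse`, the constant `k = 60` (a commonly used default), and the observation IDs are all assumptions for illustration.

```typescript
// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank) per ID,
// with rank starting at 1. Scores from all lists are summed per ID.
// NOTE: rrfFuse and k = 60 are illustrative assumptions, not agentmemory's API.
function rrfFuse(rankings: string[][], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical observation IDs.
const bm25Hits = ["obs_1", "obs_3"];

// Empty vector index: fusion degenerates to the BM25 ordering.
const fusedBefore = rrfFuse([bm25Hits, []]);

// Populated vector index: an ID found by both retrievers rises to the top,
// and a semantic-only match (obs_9) becomes reachable at all.
const fusedAfter = rrfFuse([bm25Hits, ["obs_3", "obs_9"]]);
```

With an empty second list, the fused order is exactly the BM25 order, which is the situation the PR describes. Once the vector list is populated, `obs_3` (found by both retrievers) outranks `obs_1`, and `obs_9` appears even though keyword matching never found it.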

Changes

  • src/functions/observe.ts: After synthetic compression and BM25 indexing, auto-embed the narrative and add to vector index
  • src/index.ts: Pass vectorIndex and embeddingProvider to registerObserveFunction
  • Auto-indexing runs inside a try-catch: embedding failures are logged but don't block observation capture
  • Only activates when an embedding provider is configured (opt-in by setting AGENTMEMORY_LOCAL_EMBEDDING_MODEL or cloud API keys)
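The changes listed above can be sketched roughly as follows. This is a hedged illustration, not the code in src/functions/observe.ts: the interfaces, the method names `embed` and `add`, the helper `autoIndexObservation`, and its boolean return value are all hypothetical stand-ins for whatever the real VectorIndex and EmbeddingProvider expose.

```typescript
// Hypothetical shapes: the real agentmemory types may differ.
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
}
interface VectorIndex {
  add(id: string, vector: number[]): void;
}
interface CompressedObservation {
  id: string;
  narrative: string;
}

// Intended to run after compression and BM25 indexing.
// Returns true only if the observation was added to the vector index.
async function autoIndexObservation(
  obs: CompressedObservation,
  vectorIndex?: VectorIndex | null,
  embeddingProvider?: EmbeddingProvider | null,
  log: (message: string) => void = (message) => console.warn(message),
): Promise<boolean> {
  // Opt-in: with no index or no provider configured, do nothing.
  if (!vectorIndex || !embeddingProvider) return false;
  try {
    const vector = await embeddingProvider.embed(obs.narrative);
    vectorIndex.add(obs.id, vector);
    return true;
  } catch (err) {
    // Log and swallow the error so observation capture is never blocked.
    log(`vector auto-index failed for ${obs.id}: ${String(err)}`);
    return false;
  }
}
```

Keeping both dependencies optional means a caller that passes neither gets the old behavior unchanged, and the try-catch keeps a slow or failing embedding backend from turning into a lost observation.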

Combined with PR #223 (configurable embedding)

# Enable multilingual hybrid search (BM25 + Vector)
AGENTMEMORY_LOCAL_EMBEDDING_MODEL=Xenova/bge-m3

This combination gives agentmemory semantic search over Chinese and other multilingual content: BM25 handles exact terms, and vector search handles meaning.

Backwards Compatibility

The new parameters are optional. When vectorIndex or embeddingProvider is null or undefined (the default), auto-indexing is skipped, so existing deployments see no behavior change.

Summary by CodeRabbit

Release Notes

  • New Features
    • Added optional vector indexing and embedding support for observations, enabling automatic metadata indexing when configured
    • Enhanced with external control over vector auto-indexing functionality
    • Improved resilience through graceful error handling that prevents vector operation failures from disrupting observation workflows

Local branch:
Fork PR:
Fork decision:
Verification:
Notes:

Issue labels:

  • decision-candidate: Fork decision has not been made
  • upstream-open: Upstream pull request is open
  • upstream-pr: Tracks an upstream pull request
