[upstream PR 228] feat(vector): auto-index observations into vector store | 向量索引自动集成

Source: pull request number 228 in rohitg00/agentmemory (URL omitted to avoid GitHub cross-reference)
Title: feat(vector): auto-index observations into vector store | 向量索引自动集成
Author: mechanic-Q
State: open
Draft: no
Merged: no
Head: mechanic-Q/agentmemory:feature/vector-auto-index @ e6a34d0d8902f18d5dce525678ee862bc1089a68
Base: main @ 1c8713f97813ea17b2dd4517f078d5eec8b243b1
Labels: (none)
Changed files: 0
Commits: 0
Created: 2026-05-02T07:26:41Z
Updated: 2026-05-17T09:42:42Z
Closed: (not closed)
Merged at: (not merged)

Original PR body:

## Summary

Automatically add compressed observations to the `VectorIndex` during the observe lifecycle, enabling hybrid BM25 + vector search without building the index manually.

## Motivation

The `VectorIndex` and `EmbeddingProvider` infrastructure already exists, but observations were only added to the BM25 index, so the vector index stayed empty unless it was populated manually. As a result, hybrid search (BM25 + vector with RRF fusion) could not use semantic similarity, and only BM25 keyword matching was active. For non-English content (Chinese, multilingual), BM25 alone is insufficient because the tokenizer handles CJK text poorly. Vector search bridges this gap.
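To make the fusion step in the motivation concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. The function name, the observation ids, and the constant `k = 60` are illustrative assumptions, not taken from agentmemory's implementation.

```typescript
// Reciprocal Rank Fusion (RRF): each ranked list contributes 1 / (k + rank)
// to a document's fused score, with rank starting at 1.
// The function name and the default k = 60 are illustrative assumptions,
// not values taken from agentmemory.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Hypothetical observation ids, ranked independently by each retriever.
const bm25Ranking = ["obs-a", "obs-b", "obs-c"];
const vectorRanking = ["obs-b", "obs-d", "obs-a"];

// With both indexes populated, items ranked well by both lists rise to the top.
const fused = reciprocalRankFusion([bm25Ranking, vectorRanking]);

// With an empty vector index, the fused order is just the BM25 order.
const bm25Only = reciprocalRankFusion([bm25Ranking, []]);
```

In this sketch an empty vector ranking contributes nothing, which mirrors the situation described above: fusion degenerates to plain BM25 ordering until the vector index is populated.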

## Changes

- `src/functions/observe.ts`: After synthetic compression and BM25 indexing, auto-embed the narrative and add it to the vector index
- `src/index.ts`: Pass `vectorIndex` and `embeddingProvider` to `registerObserveFunction`
- Auto-indexing runs inside a try/catch block: embedding failures are logged but do not block observation capture
- Auto-indexing only activates when an embedding provider is configured (opt in by setting `AGENTMEMORY_LOCAL_EMBEDDING_MODEL` or cloud API keys)
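The changes listed above can be sketched roughly as follows. This is a sketch under stated assumptions, not the code from the PR: the interfaces, the `autoIndexObservation` helper, and its boolean return value are all hypothetical, and the real `VectorIndex` and `EmbeddingProvider` signatures in agentmemory may differ.

```typescript
// All interfaces and names below are hypothetical stand-ins for illustration;
// the real VectorIndex and EmbeddingProvider in agentmemory may differ.
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
}

interface VectorIndex {
  add(id: string, vector: number[]): void;
}

interface CompressedObservation {
  id: string;
  narrative: string;
}

// Returns true when the observation was added to the vector index.
async function autoIndexObservation(
  observation: CompressedObservation,
  vectorIndex?: VectorIndex | null,
  embeddingProvider?: EmbeddingProvider | null,
): Promise<boolean> {
  // Opt-in: skip silently when either dependency is missing.
  if (!vectorIndex || !embeddingProvider) return false;
  try {
    const vector = await embeddingProvider.embed(observation.narrative);
    vectorIndex.add(observation.id, vector);
    return true;
  } catch (error) {
    // Log and continue so an embedding failure never blocks observation capture.
    console.warn(`vector auto-index failed for ${observation.id}:`, error);
    return false;
  }
}
```

In this sketch the helper would be called after the BM25 indexing step. Returning a boolean instead of throwing keeps the caller's control flow identical whether or not vector indexing is enabled.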

## Combined with PR #223 (configurable embedding)

```bash
# Enable multilingual hybrid search (BM25 + Vector)
export AGENTMEMORY_LOCAL_EMBEDDING_MODEL=Xenova/bge-m3
```

Together, the two PRs give agentmemory semantic search over Chinese and other multilingual content: BM25 handles exact terms, and the vector index handles meaning.
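As a rough illustration of why keyword matching alone struggles with CJK text, consider a naive tokenizer that splits on whitespace and punctuation. This tokenizer is an assumption made for the example; agentmemory's actual BM25 tokenizer may behave differently.

```typescript
// A naive whitespace/punctuation tokenizer, used here only as an assumed
// stand-in; agentmemory's actual BM25 tokenizer may behave differently.
function naiveTokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s.,;:!?()]+/)
    .filter((token) => token.length > 0);
}

// English: words are separated by spaces, so a keyword query can match a token.
const englishTokens = naiveTokenize("auto-index observations into the vector store");
const englishHit = englishTokens.indexOf("vector") !== -1;

// Chinese: there are no spaces between words, so the whole phrase becomes a
// single token and a query for "向量" (vector) has no exact token to match.
const chineseTokens = naiveTokenize("向量索引自动集成");
const chineseHit = chineseTokens.indexOf("向量") !== -1;
```

Under this assumption the English query finds its term while the Chinese query does not, which is the gap an embedding-based index is meant to cover.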

## Backwards Compatibility

The new parameters are optional. When `vectorIndex` or `embeddingProvider` is null or undefined (the default), auto-indexing is skipped, so existing deployments see no behavior change.



## Summary by CodeRabbit

## Release Notes

* **New Features**
  * Added optional vector indexing and embedding support for observations, enabling automatic metadata indexing when configured
  * Enhanced with external control over vector auto-indexing functionality
  * Improved resilience through graceful error handling that prevents vector operation failures from disrupting observation workflows





Local branch:
Fork PR:
Fork decision:
Verification:
Notes:
