Feat/valkey 4 storage #5703

Open
MatthiasHowellYopp wants to merge 3 commits into crewAIInc:main from MatthiasHowellYopp:feat/valkey-4-storage

@MatthiasHowellYopp
Contributor

Title: feat(valkey): ValkeyStorage vector memory backend

Description:

Part 4/4 of adding Valkey as a storage backend for CrewAI. This PR adds the core vector storage implementation and wires it into the memory system. Depends on parts 1 and 3.

What changed:

valkey_storage.py (new) — Full StorageBackend implementation using Valkey-GLIDE. Supports vector similarity search via Valkey Search module (FLAT and HNSW indexes), scope/category/metadata filtering through tag and numeric indexes, and both sync and async interfaces. Handles lazy client and index initialization, automatic index creation, and batch save operations with JSON + binary embedding serialization.
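
For readers unfamiliar with the "JSON + binary embedding serialization" mentioned above, here is a minimal sketch of one way it can work. It assumes the vector index stores little-endian float32 values, which is the usual layout for FLOAT32 vector fields. The function names and the hash field names (`content`, `metadata`, `embedding`) are hypothetical and are not taken from valkey_storage.py.

```python
import json
import struct


def encode_embedding(vector: list[float]) -> bytes:
    """Pack floats as little-endian float32, 4 bytes per dimension."""
    return struct.pack(f"<{len(vector)}f", *vector)


def decode_embedding(raw: bytes) -> list[float]:
    """Inverse of encode_embedding; the length must be a multiple of 4."""
    if len(raw) % 4 != 0:
        raise ValueError("embedding byte length must be a multiple of 4")
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def to_hash_fields(content: str, metadata: dict, vector: list[float]) -> dict[str, bytes]:
    # Hypothetical field names; the real schema is defined in valkey_storage.py.
    return {
        "content": content.encode("utf-8"),
        "metadata": json.dumps(metadata).encode("utf-8"),
        "embedding": encode_embedding(vector),
    }
```

Text and metadata stay human-readable as JSON, while the embedding takes 4 bytes per dimension and can be indexed directly. Values that float32 cannot represent exactly are rounded on the round trip.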

unified_memory.py — Added "valkey" as a recognized storage option. When selected, reads connection details from VALKEY_URL/REDIS_URL via the cache_config utility from part 1 and instantiates ValkeyStorage.
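
A rough sketch of the environment lookup described above, using only the standard library. It is not the cache_config utility from part 1. The function name, the precedence of VALKEY_URL over REDIS_URL, the localhost fallback, and the treatment of `rediss`/`valkeys` as TLS schemes are all assumptions made for illustration.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse


def resolve_valkey_connection(env: Optional[Mapping[str, str]] = None) -> dict:
    """Turn VALKEY_URL (or REDIS_URL) into plain connection settings."""
    env = os.environ if env is None else env
    # Assumed precedence: VALKEY_URL first, then REDIS_URL, then a local default.
    url = env.get("VALKEY_URL") or env.get("REDIS_URL") or "redis://localhost:6379/0"
    parsed = urlparse(url)
    db_text = parsed.path.lstrip("/")
    return {
        "host": parsed.hostname or "localhost",
        "port": parsed.port or 6379,
        "db": int(db_text) if db_text.isdigit() else 0,
        "password": parsed.password,
        # Assumption: a trailing "s" on the scheme signals TLS.
        "use_tls": parsed.scheme in ("rediss", "valkeys"),
    }
```

Passing the environment as a mapping keeps the helper easy to test without touching `os.environ`.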

pyproject.toml (root) — Pinned scrapegraph-py>=1.46.0,<2 to fix an unrelated upstream breakage where 2.x removed the Client class.

Testing:

test_valkey_storage.py — Core CRUD operations, batch saves, record retrieval, deletion, flushing, TTL, metadata handling, connection management
test_valkey_storage_errors.py — Connection failures, index creation errors, malformed data, graceful degradation
test_valkey_storage_scope.py — Hierarchical scope queries, scope listing, child scope enumeration, cross-scope isolation
test_valkey_storage_search.py — Vector similarity search, composite scoring, category filtering, limit/offset, empty results
All tests use mocked Valkey clients and do not require a running Valkey instance.
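
To illustrate the mocked-client approach, here is a self-contained sketch built on `unittest.mock.AsyncMock`. `TinyStorage` is a hypothetical stand-in, not ValkeyStorage, and the `hset`/`hgetall` calls on the mock are illustrative rather than a claim about which client methods the real tests patch.

```python
import asyncio
from unittest.mock import AsyncMock


class TinyStorage:
    """Hypothetical stand-in for a storage class that wraps an async client."""

    def __init__(self, client):
        self._client = client

    async def save(self, key: str, fields: dict) -> None:
        await self._client.hset(key, fields)

    async def get(self, key: str) -> dict:
        return await self._client.hgetall(key)


async def exercise_storage() -> dict:
    # AsyncMock makes every attribute an awaitable mock, so no server is needed.
    client = AsyncMock()
    client.hgetall.return_value = {"content": "hello"}

    storage = TinyStorage(client)
    await storage.save("memory:1", {"content": "hello"})
    record = await storage.get("memory:1")

    # Verify the storage layer issued exactly the expected write.
    client.hset.assert_awaited_once_with("memory:1", {"content": "hello"})
    return record


result = asyncio.run(exercise_storage())
```

The trade-off of this style is that it checks what the storage layer sends to the client, not how a real Valkey Search index would respond.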

Extract duplicated Redis URL parsing into a shared cache_config utility.
Introduce ValkeyCache as a lightweight async key/value cache using
valkey-glide. Wire it into A2A task handling, agent card caching, and
file upload caching.

Part 1/4 of Valkey storage implementation.
Add bytes→float validators on MemoryRecord and ItemState to handle
Valkey returning embeddings as raw bytes. Make embed_texts() safe when
called from an async context by using a thread pool. Improve
drain_writes() with per-save timeouts and error logging instead of
raising on failure.

Part 3/4 of Valkey storage implementation.
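
As a sketch of the drain_writes() behaviour described in this commit (per-save timeouts, logging instead of raising), something along these lines would fit. The signature, the default timeout, and the return value are assumptions; the actual function in the PR may differ.

```python
import logging
from concurrent.futures import Future
from concurrent.futures import TimeoutError as FutureTimeout

logger = logging.getLogger(__name__)


def drain_writes(pending: list[Future], per_save_timeout: float = 5.0) -> int:
    """Wait for each pending save; log failures and return how many there were."""
    failures = 0
    for future in pending:
        try:
            # The timeout applies to each save separately, not to the whole batch.
            future.result(timeout=per_save_timeout)
        except FutureTimeout:
            failures += 1
            logger.error("save did not finish within %.1fs", per_save_timeout)
        except Exception:
            failures += 1
            logger.exception("save failed")
    pending.clear()
    return failures
```

With this shape, one slow or failing save cannot block or abort the rest of the drain, and the caller can still inspect the failure count.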
Add ValkeyStorage, a distributed StorageBackend implementation using
Valkey-GLIDE with Valkey Search for vector similarity. Wire it into
Memory as the 'valkey' storage option. Pin scrapegraph-py<2 to fix
unrelated upstream breakage.

Part 4/4 of Valkey storage implementation.
