High-performance vector database and search engine in Rust for semantic search, document indexing, and AI applications. Ships as a Cargo workspace (5 crates) with binary RPC + HTTP transports, a React dashboard, and native SDKs for Rust, Python, TypeScript, Go, and C#.
- VectorizerRPC (default, port
15503) — binary MessagePack over TCP, multiplexed connection pool. See wire spec. - REST API (port
15002) — universal HTTP fallback, powers the dashboard and any caller that doesn't speak raw TCP. - gRPC — Qdrant-compatible service.
- GraphQL — full REST parity with async-graphql + GraphiQL playground.
- MCP — 31 focused tools for AI model integration (Cursor, Claude Desktop, etc.).
- UMICP Protocol — native JSON types + tool discovery endpoint.
- SIMD acceleration — AVX2-optimized vector ops with runtime CPU detection (5-10x faster).
- Metal GPU — macOS Apple Silicon via
hive-gpu0.2; logs render real device name, driver, VRAM. - Sub-3ms search (CPU) / <1ms (GPU) via HNSW indexing.
- 4-5x faster than Qdrant in head-to-head benchmarks (0.16-0.23ms vs 0.80-0.87ms avg latency).
.vecdbunified format — 20-30% space savings, automatic snapshots.- Memory-mapped storage — datasets larger than RAM, efficient OS paging.
- Product Quantization — 64x memory reduction with minimal accuracy loss.
- Scalar Quantization + cache hit ratio metrics.
- Raft consensus via openraft (pinned
=0.10.0-alpha.17) — automatic leader election in 1-5s, write-redirect via HTTP 307, WAL-backed durable replication, DNS discovery for Kubernetes headless services. - Master-Replica — TCP streaming replication with full/partial sync, exponential reconnect backoff (5s→60s).
- Distributed sharding — horizontal scaling with automatic routing; distributed hybrid search via
RemoteHybridSearchRPC with dense-only fallback for mixed-version clusters. - HiveHub cluster mode — multi-tenant with quotas, usage tracking, tenant isolation, mandatory MMap storage, 1GB cache cap.
- Semantic similarity — Cosine, Euclidean, Dot Product.
- Hybrid search — Dense + Sparse with Reciprocal Rank Fusion (RRF).
- Intelligent search — query expansion, semantic reranking.
- Multi-collection search across projects.
- Graph relationships — automatic edge discovery, neighbor exploration, shortest-path finding.
- Built-in providers — TF-IDF, BM25, FastEmbed, BERT, MiniLM, custom models.
- Document conversion — PDF, DOCX, XLSX, PPTX, HTML, XML, images (14 formats).
- Qdrant API compatibility — Snapshots, Sharding, Cluster Management, Query (with prefetch), Search Groups, Matrix, Named Vectors (partial), PQ/Binary quantization config.
- Summarization — extractive, keyword, sentence, abstractive (OpenAI GPT).
- JWT + API Key authentication with RBAC, scoped API keys (per-collection permissions), atomic key rotation with grace window, RFC 7662 token introspection, admin audit log (in-memory ring + daily-rotated JSONL).
- API key usage metrics — per-key
usage_count(atomic, lock-free) plus a 30-day per-day ring buffer surfaced viaGET /auth/keys/{id}/usage. - Permission update without rotation —
PUT /auth/keys/{id}/permissionsswapspermissions/scopeswhile keepingkey_hash/id/user_id/created_atimmutable. - Hardened dashboard cookies + CSRF — login/refresh emit
vectorizer_session(HttpOnly; Secure; SameSite=Strict) plus a siblingXSRF-TOKENcookie;require_csrf_middlewareguards every mutating/auth/*and/admin/*request.auth.cookies.insecure_devopt-out for plain-HTTP loopback dev (boot rejects it on0.0.0.0). - Loopback dev-mode auth bypass —
auth.dev_mode_skip_loopbackshort-circuits credential validation aslocal-dev-adminand stampsX-Vectorizer-Dev-Mode: trueon every response. Boot fails on any non-loopback host. - JWT secret is mandatory — boot refuses to start with empty / default / <32 char secrets when auth is enabled.
- First-run root credentials written to
{data_dir}/.root_credentials(0o600), never logged. - Payload encryption — optional ECC-P256 + AES-256-GCM, zero-knowledge, per-collection policies (docs).
- TLS 1.2/1.3 with mTLS, configurable cipher suites, ALPN.
- Per-API-key rate limiting with tiers + overrides.
- Path-traversal guard on file discovery; canonicalized base, symlink-escape refusal.
- Web Dashboard — React + TypeScript; JWT login, graph CRUD (edges, neighbors, paths), collection management, API sandbox, setup wizard with glassmorphism design. Embedded in the binary (~26MB, no external assets needed).
- Desktop GUI — Electron + vis-network for visual database management.
Highlights — see CHANGELOG.md for the full breakdown.
Security — Hardened dashboard cookies + CSRF
POST /auth/loginandPOST /auth/refreshnow set a hardenedvectorizer_sessioncookie (HttpOnly; Secure; SameSite=Strict; Path=/; Max-Age=<jwt_exp>) carrying the JWT, plus a siblingXSRF-TOKENcookie carrying a 32-byte random CSRF token (non-HttpOnlyso the SPA can echo it).- New
require_csrf_middlewarerejectsPOST/PUT/PATCH/DELETErequests under/auth/*and/admin/*with HTTP 403 unless theX-CSRF-Tokenheader matches the token bound to the caller's session JWT.GET/HEAD/OPTIONS,/auth/login,/auth/validate-password, andX-API-Keyrequests bypass the gate. POST /auth/logoutemits expiredSet-Cookieheaders for both cookies and drops the CSRF binding.- New
auth.cookies.insecure_devconfig flag (defaultfalse) drops onlySecurefor plain-HTTP127.0.0.1development. Boot rejects the flag on any non-loopback host. dashboard/src/lib/api-middleware.tsadds acsrfMiddlewarethat reads theXSRF-TOKENcookie and echoes it on every mutating request. Legacyaccess_tokenJSON body preserved for SDK /Authorization: Bearercallers.
Security — Loopback dev-mode auth bypass
- New
auth.dev_mode_skip_loopbackconfig flag (defaultfalse) lets local devs run the dashboard / SDK against127.0.0.1without minting a JWT or echoing tokens on every cURL. - When on: middleware short-circuits with a synthetic
local-dev-adminprincipal (Role::Admin) and every response carriesX-Vectorizer-Dev-Mode: true. CSRF middleware no-ops in the same mode. - Boot fails fast when the flag is on and the bind host is anything other than
127.0.0.1,::1, orlocalhost. Multi-lineWARNbanner emitted at boot when engaged on a loopback bind. - See
docs/users/api/AUTHENTICATION.md→ "Local Development".
Added — API key usage metrics + permission update
- Per-key
usage_count(atomic, lock-free on the validation hot path) plus a 30-day per-key per-day ring buffer. Every successfulvalidate_api_keybumps both. Counter persists across restarts; old payloads deserialize withusage_count: 0. - New
PUT /auth/keys/{id}/permissions(admin-gated) replaces a key's permissions/scopes without rotating the credential.key_hash/id/user_id/created_atstay immutable. - New
GET /auth/keys/{id}/usage?window=<n>(admin-gated) returns the per-day counter ring (default 7, max 30) plus live key view + window total. Zero-count days included so consumers render gap-free sparklines. GET /auth/keyslist response now carriesusage_count+usage_24hper row — dashboard rendersLast 24h+Totalcolumns without an N+1 fetch.- SDK parity (Rust / TypeScript / Python):
update_api_key_permissions,get_api_key_usage, plusApiKeyView,ApiKeyUsageReport,ApiKeyUsageBucket,ApiKeyScope,UpdateApiKeyPermissionsRequest.ApiKeygainsusage_count(additive). - Dashboard
ApiKeysPage.tsxadds the new columns and aUsagebutton that opens a modal with a 14-day SVG sparkline (new dependency-freeSparkline.tsx) + per-day bucket table.
Added — Cluster admin endpoints
POST /cluster/failover— promote a replica with a pre-flight WAL-lag check (HTTP 409 when lag exceedsmax_lag_segments).POST /cluster/replicas/{id}/resync— force a full snapshot + WAL replay on a lagging replica.POST /cluster/peers— add a member or observer peer.POST /cluster/rebalance+GET /cluster/rebalance/status— async shard rebalance using insert-before-delete invariant; poll the job by id.
Added — Auth / RBAC admin endpoints
POST /auth/keys/{id}/rotate— atomic key rotation with a 300 s grace window. Returns{old_key_id, new_key_id, new_token, grace_until}.POST /auth/keys(extended) — optionalscopes: [{collection, permissions}]for collection-scoped keys; empty list = default-deny on scope-aware routes. Existing global-key callers unaffected.POST /auth/introspect— RFC 7662 token introspection for any JWT or API key.GET /auth/audit— admin-only audit log query (in-memory ring + daily-rotated JSONL files), filterable byfrom,to,actor,action.- New
AuditLoggerships record events through an unboundedmpscchannel — never blocks the handler hot-path.
Added — Tier-demotion API (#265)
POST /collections/{src}/vectors/move— cross-collection move with insert-before-delete invariant; per-id status (ok | missing_in_src | dst_insert_failed | src_delete_failed) so a mid-batch crash leaves a recoverable duplicate, never data loss. Seedocs/users/api/API_REFERENCE.md.- All three SDKs (Rust / TypeScript / Python):
delete_vector,delete_vectors(batch with per-idDeleteReport),move_to_collection(cross-collection withMoveReport). TypeScript / Pythondelete_vectorsnow return the canonicalDeleteReportshape — callers asserting on the old{deleted: number}/boolshapes need to readreport.deleted/report.results.
Build — Docker pipeline overhaul
- Cuts cold local multi-arch build from ~30–45 min to ~15–20 min, warm to under 10 min via four changes:
- Buildx registry cache on
hivehub/vectorizer-cache:buildx(opt out with-NoCache). - Dedicated
release-dockerCargo profile (lto = false,codegen-units = 16,incremental = false) — ~30 % faster compile, ~50 % lower peak rustc memory; ~10–15 % lower throughput on hot paths vs the hostreleaseprofile, which is unchanged. - Drop in-image
cargo sbom— SBOM now exclusively from BuildKit's--sbom=truesyft attestation. - Native arm64 in CI + Docker Hub publish — per-arch matrix, manifest-list stitched and pushed to both
ghcr.io/hivellm/vectorizerandhivehub/vectorizer. Eliminates QEMU emulation from the arm64 build.
- Buildx registry cache on
- Operator runbook:
docs/development/docker-builds.md.
For prior releases see CHANGELOG.md.
curl -fsSL https://raw.githubusercontent.com/hivellm/vectorizer/main/scripts/install.sh | bashInstalls CLI + systemd service. Commands: sudo systemctl {status|restart|stop} vectorizer, sudo journalctl -u vectorizer -f.
powershell -c "irm https://raw.githubusercontent.com/hivellm/vectorizer/main/scripts/install.ps1 | iex"Installs CLI + Windows Service (requires Admin). Commands: Get-Service Vectorizer, {Start|Stop|Restart}-Service Vectorizer.
docker run -d \
--name vectorizer \
-p 15002:15002 -p 15503:15503 \
-v $(pwd)/vectorizer-data:/vectorizer/data \
-e VECTORIZER_AUTH_ENABLED=true \
-e VECTORIZER_ADMIN_USERNAME=admin \
-e VECTORIZER_ADMIN_PASSWORD=your-secure-password \
-e VECTORIZER_JWT_SECRET=$(openssl rand -hex 64) \
--restart unless-stopped \
hivehub/vectorizer:latestDocker Compose with profiles:
cp .env.example .env
# Edit .env with your credentials
docker compose --profile default up -d # standalone
docker compose --profile dev up -d # dev overlay
docker compose --profile ha up -d # Raft cluster
docker compose --profile hub up -d # multi-tenantProfiles are mutually exclusive on host port 15002.
Images: Docker Hub · GHCR
git clone https://github.com/hivellm/vectorizer.git
cd vectorizer
cargo build --release # Basic
cargo build --release --features hive-gpu # macOS Metal
cargo build --release --features full # All features
./target/release/vectorizer| Surface | URL | Notes |
|---|---|---|
| VectorizerRPC (primary) | vectorizer://localhost:15503 |
Binary MessagePack over TCP — see operator guide |
| REST API | http://localhost:15002 |
Universal HTTP fallback |
| Web Dashboard | http://localhost:15002/dashboard/ |
React UI, embedded in binary |
| MCP Server | http://localhost:15002/mcp |
31 tools for AI agents |
| GraphQL | http://localhost:15002/graphql |
GraphiQL at /graphql |
| UMICP Discovery | http://localhost:15002/umicp/discover |
|
| Health Check | http://localhost:15002/health |
Upgrading from v2.x? RPC is now on by default on port
15503. REST is unchanged. If you can't expose the new port, setrpc.enabled: false. See v3.x migration guide.
Configs live under config/:
config/
├── config.yml # Base config (your deployment)
├── config.example.yml # Reference
├── modes/
│ ├── dev.yml # Layered override: verbose logs, loopback, watcher on
│ └── production.yml # Layered override: warn logs, larger threads/cache, zstd, scheduled snapshots
└── presets/ # Standalone full configs (legacy style)
├── production.yml
├── cluster.yml
├── hub.yml
└── development.yml
Layered loader (recommended):
VECTORIZER_MODE=production ./target/release/vectorizerMerges config/modes/production.yml over config/config.yml. Typos in the mode override fail fast at boot.
Auth is enabled by default in Docker. Default creds — change in production.
# Login
curl -X POST http://localhost:15002/auth/login \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"admin"}'
# JWT in requests
curl http://localhost:15002/collections \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
# Create API key (JWT required)
curl -X POST http://localhost:15002/auth/keys \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{"name":"Production","permissions":["read","write"],"expires_in_days":90}'
# API key in requests (NO Bearer prefix)
curl http://localhost:15002/collections \
-H "Authorization: YOUR_API_KEY"| Method | Header | Use case |
|---|---|---|
| JWT | Authorization: Bearer <token> |
Dashboard, short-lived sessions |
| API Key | Authorization: <key> |
MCP, CLI, long-lived integrations |
Production must set:
VECTORIZER_JWT_SECRET— ≥32 chars, not the historical default. Boot aborts otherwise.VECTORIZER_ADMIN_PASSWORD— strong, ≥32 chars.
First-run root credentials are written to {data_dir}/.root_credentials (0o600), never printed to stdout. Read and delete after first login.
See Docker Authentication Guide and Security Policy.
| Metric | Value |
|---|---|
| Search latency (CPU) | < 3ms |
| Search latency (Metal GPU) | < 1ms |
| Throughput | 4,400-6,000 QPS (vs Qdrant 1,100-1,300) |
| Storage reduction | 20-30% (.vecdb) + PQ 64x |
| MCP tools | 31 |
| Document formats | 14 |
- Search: 4-5x faster (0.16-0.23ms vs 0.80-0.87ms avg latency).
- Insert: Fire-and-forget pattern, configurable batch / body limits, background processing.
- Scenarios: Small (1K) / Medium (5K) / Large (10K) vectors × dimensions 384 / 512 / 768.
| Feature | Vectorizer | Qdrant | pgvector | Pinecone | Weaviate | Milvus | Chroma |
|---|---|---|---|---|---|---|---|
| Core | |||||||
| Language | Rust | Rust | C | C++/Go | Go | C++/Go | Python |
| License | Apache 2.0 | Apache 2.0 | PostgreSQL | Proprietary | BSD | Apache 2.0 | Apache 2.0 |
| APIs | |||||||
| REST | ✅ | ✅ | via PG | ✅ | ✅ | ✅ | ✅ |
| gRPC (Qdrant-compat) | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ |
| GraphQL | ✅ + GraphiQL | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ |
| MCP | ✅ 31 tools | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Binary RPC | ✅ MessagePack | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| SDKs | Rust, Python, TS, Go, C# | All | All | Most | Most | Most | Python |
| Performance | |||||||
| Search latency | < 3ms CPU / < 1ms GPU | 1-5ms | 5-50ms | 50-100ms | 10-50ms | 5-20ms | 10-100ms |
| SIMD | ✅ AVX2 | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ |
| GPU | ✅ Metal | ✅ CUDA | ❌ | ✅ Cloud | ❌ | ✅ CUDA | ❌ |
| Storage | |||||||
| HNSW | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| PQ (64x) | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ |
| Scalar Quantization | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ |
| MMap | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ |
| Advanced | |||||||
| Graph Relationships | ✅ auto + GUI | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ |
| Document Processing | ✅ 14 formats | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ |
| Hybrid Search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Query Expansion | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Qdrant API compat | ✅ + migration | N/A | ❌ | ❌ | ❌ | ❌ | ❌ |
| Scaling | |||||||
| Sharding | ✅ | ✅ | via PG | ✅ Cloud | ✅ | ✅ | ❌ |
| Replication | ✅ Raft + Master-Replica | ✅ | via PG | ✅ Cloud | ✅ | ✅ | ❌ |
| Management | |||||||
| Dashboard | ✅ React + graph GUI | ✅ basic | pgAdmin | ✅ Cloud | ✅ | ✅ | ✅ basic |
| Desktop GUI | ✅ Electron | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Security | |||||||
| JWT + API Keys | ✅ | ✅ | via PG | ✅ Cloud | ✅ | ✅ | ✅ |
| Payload Encryption | ✅ ECC-P256 + AES-GCM | ❌ | via PG | ✅ Cloud | ❌ | ❌ | ❌ |
- MCP integration (31 tools) — native AI-agent protocol.
- Graph relationships — auto-discovery + full GUI (edges, path-finding, neighbor exploration).
- GraphQL — full REST parity + GraphiQL.
- Document processing — 14 formats built in.
- Qdrant compatibility — full API + migration tools.
- Performance — 4-5x faster than Qdrant in benchmarks.
- Binary RPC default — MessagePack over TCP on port 15503 for low-overhead client traffic.
- Complete SDK coverage — Rust, Python, TypeScript (+JS), Go, C# — all on v3.0.0.
Best fit: AI apps needing MCP, document ingestion, graph relationships, and sub-ms search with an embedded dashboard.
- RAG systems — semantic search with automatic document conversion.
- Document search — PDFs, Office, web content.
- Code analysis — semantic code navigation.
- Knowledge bases — enterprise multi-format search.
Cursor / Claude Desktop config:
{
"mcpServers": {
"vectorizer": {
"url": "http://localhost:15002/mcp",
"type": "streamablehttp"
}
}
}Core operations (9)
list_collections · create_collection · get_collection_info · insert_text · get_vector · update_vector · delete_vector · search · multi_collection_search
Advanced search (4)
search_intelligent (query expansion) · search_semantic (reranking) · search_extra (combined) · search_hybrid (dense + sparse RRF)
Discovery & files (7)
filter_collections · expand_queries · get_file_content · list_files · get_file_chunks · get_project_outline · get_related_files
Graph (8)
graph_list_nodes · graph_get_neighbors · graph_find_related · graph_find_path · graph_create_edge · graph_delete_edge · graph_discover_edges · graph_discover_status
Maintenance (3)
list_empty_collections · cleanup_empty_collections · get_collection_stats
Cluster-management operations are REST-only for security.
Server-side at v3.1.0. The Rust SDK tracks server versioning and is also at v3.1.0; the TypeScript, Python, Go, and C# SDKs are on v3.0.x and bump when they need a breaking server contract. The TypeScript SDK ships compiled CJS + ESM — usable from plain JavaScript, no separate JS package needed.
| SDK | Install |
|---|---|
| Python | pip install vectorizer-sdk |
| TypeScript / JS | npm install @hivehub/vectorizer-sdk |
| Rust | cargo add vectorizer-sdk |
| C# | dotnet add package Vectorizer.Sdk (REST) · Vectorizer.Sdk.Rpc (RPC) |
| Go | go get github.com/hivellm/vectorizer-sdk-go |
Every SDK accepts both vectorizer://host[:port] (RPC, default port 15503) and http(s)://host[:port] (REST) URLs through the same endpoint parser.
- Config migration — parse Qdrant YAML/JSON → Vectorizer format.
- Data migration — export from Qdrant, import into Vectorizer.
- Validation — integrity + compatibility checks.
- REST compatibility — full Qdrant API at
/qdrant/*.
use vectorizer::migration::qdrant::{QdrantDataExporter, QdrantDataImporter};
let exported = QdrantDataExporter::export_collection(
"http://localhost:6333",
"my_collection"
).await?;
let result = QdrantDataImporter::import_collection(&store, &exported).await?;Multi-tenant cluster mode integration with HiveHub.Cloud.
- Tenant isolation — owner-scoped collections.
- Quota enforcement — collections / vectors / storage per tenant.
- Usage tracking — automatic reporting.
- User-scoped backups.
hub:
enabled: true
api_url: "https://api.hivehub.cloud"
tenant_isolation: "collection"
usage_report_interval: 300export HIVEHUB_SERVICE_API_KEY="your-service-api-key"Cluster-mode requirements (enforced at boot):
| Requirement | Default |
|---|---|
| MMap storage (Memory storage rejected) | Enforced |
| Max cache memory across all caches | 1 GB |
| File watcher | Disabled |
| Strict config validation | Enabled |
cluster:
enabled: true
node_id: "node-1"
memory:
max_cache_memory_bytes: 1073741824
enforce_mmap_storage: true
disable_file_watcher: true
strict_validation: trueSee HiveHub Integration and Cluster Memory Limits.
crates/
├── vectorizer-core/ # Foundation: error, codec, quantization, simd, compression, paths
├── vectorizer-protocol/ # RPC wire types + tonic-generated gRPC
├── vectorizer/ # Engine (umbrella): db, embedding, models, cache, persistence, search, ...
├── vectorizer-server/ # Transport: HTTP / gRPC / MCP / RPC + binary
└── vectorizer-cli/ # CLI binaries
sdks/rust/ # Rust SDK — re-exports vectorizer-protocol wire types
Runtime directories resolve to platform-standard locations (~/.local/share/vectorizer/ on Linux, ~/Library/Application Support/vectorizer/ on macOS, %APPDATA%\vectorizer\ on Windows), overridable via VECTORIZER_DATA_DIR / VECTORIZER_LOGS_DIR.
- User Documentation — install + tutorials
- API Reference — REST
- VectorizerRPC Spec — wire protocol
- RPC Operator Guide
- Configuration — layered loader
- Bulk-upsert Backpressure Runbook —
429/Retry-After, vocab-build cap, ops metrics (#263) - v3.x Migration — RPC-default rollout
- Dashboard Integration
- Qdrant Compatibility
- HiveHub Integration
- Cluster Memory Limits
- MCP Guide
- Encryption
- Technical Specs — architecture, performance, implementation
Apache License 2.0 — see LICENSE.
See CONTRIBUTING.md.