Describe the bug
CosmosMongoCollection._get_index_definitions (python/semantic_kernel/connectors/azure_cosmos_db.py, line 401) sets
cosmosSearchOptions["kind"] from DISTANCE_FUNCTION_MAP_MONGODB — a similarity code ("COS"/"IP"/"L2") —
instead of INDEX_KIND_MAP_MONGODB (the index kind: "vector-ivf"/"vector-hnsw"/"vector-diskann").
This causes three problems:
- The createIndexes command sends an invalid kind. Cosmos DB for MongoDB vCore requires kind to be one of vector-ivf/vector-hnsw/vector-diskann, so vector-index creation fails against a live account.
- kind ends up equal to similarity.
- The match index_kind block (line 411) can never match a vector-* case, so the HNSW/IVF/DiskANN tuning options (m, efConstruction, numList, maxDegree, lBuild) are silently dropped.
The tell: the mapped value of INDEX_KIND_MAP_MONGODB is never read (the map is only used for a membership check at line 392). The sibling NoSQL path does it correctly at line 149: "type": INDEX_KIND_MAP_NOSQL[field.index_kind].
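For illustration, here is a minimal standalone model of the failure mode. It is not the connector's code: build_search_options and its fixed flag are hypothetical, the two maps contain only the entries this issue confirms, and a plain if stands in for the match block.

```python
# Stand-in maps with only the entries confirmed in this issue (assumption:
# the real maps in azure_cosmos_db.py contain more keys).
DISTANCE_FUNCTION_MAP_MONGODB = {"cosine_similarity": "COS"}
INDEX_KIND_MAP_MONGODB = {"hnsw": "vector-hnsw"}


def build_search_options(index_kind, distance_function, dimensions, fixed, **kwargs):
    """Hypothetical, simplified model of how cosmosSearchOptions is built."""
    similarity = DISTANCE_FUNCTION_MAP_MONGODB[distance_function]
    # Buggy path: kind is read from the distance map, so it equals similarity.
    # Fixed path: kind is read from the index-kind map.
    kind = INDEX_KIND_MAP_MONGODB[index_kind] if fixed else similarity
    options = {"kind": kind, "similarity": similarity, "dimensions": dimensions}
    # Tuning options are selected by the mapped kind, so a wrong kind drops them.
    if kind == "vector-hnsw":
        for name in ("m", "efConstruction"):
            if name in kwargs:
                options[name] = kwargs[name]
    return options


buggy = build_search_options("hnsw", "cosine_similarity", 5, fixed=False, m=16, efConstruction=64)
fixed = build_search_options("hnsw", "cosine_similarity", 5, fixed=True, m=16, efConstruction=64)
print(buggy)  # {'kind': 'COS', 'similarity': 'COS', 'dimensions': 5}
print(fixed)  # kind is 'vector-hnsw' and the HNSW options are kept
```

The single changed lookup accounts for all three symptoms: the invalid kind, kind equal to similarity, and the dropped tuning options.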
To Reproduce
Deterministic unit-level repro (no live account needed):
import asyncio
from unittest.mock import AsyncMock, MagicMock
from pymongo import AsyncMongoClient
from semantic_kernel.connectors.azure_cosmos_db import CosmosMongoCollection
from semantic_kernel.data.vector import VectorStoreCollectionDefinition, VectorStoreField
definition = VectorStoreCollectionDefinition(fields=[
    VectorStoreField("key", name="id"),
    VectorStoreField("data", name="content"),
    VectorStoreField("vector", name="vector", dimensions=5,
                     index_kind="hnsw", distance_function="cosine_similarity"),
])
# Mock the database and client so no live account is needed
db = AsyncMock()
db.create_collection = AsyncMock()
db.command = AsyncMock()
client = AsyncMock(spec=AsyncMongoClient)
client.get_database = MagicMock(return_value=db)
col = CosmosMongoCollection(collection_name="c", record_type=dict,
                            definition=definition, mongo_client=client, database_name="d")
asyncio.run(col.ensure_collection_exists(m=16, efConstruction=64))
# Inspect the vector index definition passed to the createIndexes command
opts = db.command.call_args.kwargs["command"]["indexes"][1]["cosmosSearchOptions"]
print(opts)
# Actual: {'kind': 'COS', 'similarity': 'COS', 'dimensions': 5}
# -> invalid kind, and m / efConstruction were silently dropped
Against a live Cosmos DB for MongoDB vCore account, ensure_collection_exists() fails because kind="COS" is not a valid vector index kind.
Expected behavior
cosmosSearchOptions["kind"] == "vector-hnsw" and cosmosSearchOptions["similarity"] == "COS" (the two must be distinct), and the HNSW tuning options (m, efConstruction) appear in cosmosSearchOptions.
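The expectations above could be checked with a small helper along these lines. This is only a sketch: find_problems is a hypothetical name, and the expected dict is assembled from the values in this report (dimensions=5, m=16, efConstruction=64), not taken from a real run.

```python
# Valid vector index kinds, as listed in this issue.
VALID_KINDS = {"vector-ivf", "vector-hnsw", "vector-diskann"}


def find_problems(opts, expected_tuning=("m", "efConstruction")):
    """Return a list of problems found in opts; an empty list means it looks right."""
    problems = []
    kind = opts.get("kind")
    if kind not in VALID_KINDS:
        problems.append(f"invalid kind: {kind!r}")
    if kind == opts.get("similarity"):
        problems.append("kind equals similarity")
    missing = [name for name in expected_tuning if name not in opts]
    if missing:
        problems.append(f"missing tuning options: {missing}")
    return problems


# The output currently observed in the repro, and the shape the fix should produce.
actual = {"kind": "COS", "similarity": "COS", "dimensions": 5}
expected = {"kind": "vector-hnsw", "similarity": "COS", "dimensions": 5,
            "m": 16, "efConstruction": 64}
print(find_problems(actual))    # three problems, one per symptom
print(find_problems(expected))  # []
```

Passing the opts dict from the repro to find_problems would report each symptom separately, which may be easier to read in a test failure than a single dict comparison.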
Screenshots
N/A
Platform
- Language: Python
- Source: main branch of repository (also affects current pip releases)
- AI model: N/A
- IDE: VS Code
- OS: Windows
Additional context
Root cause is a single wrong map lookup at line 401 (it should be INDEX_KIND_MAP_MONGODB[field.index_kind]). I have a fix and tests ready and will open a PR shortly.