feat(search): Search V3 'Project Brain' with Cohere Reranking #224

Merged
DevanshuNEU merged 16 commits into OpenCodeIntel:main from DevanshuNEU:feature/search-v3-project-brain on Jan 26, 2026


@DevanshuNEU DevanshuNEU commented Jan 26, 2026

Collaborator

🧠 Search V3 - Project Brain

What's New

Complete rewrite of the search system with intelligent query understanding and Cohere reranking.

Key Improvements

| Metric | V2 | V3 | Improvement |
|---|---|---|---|
| Test Files in Top 3 | 36 | 3 | 92% reduction |
| Query Wins | 0 | 33 | 79% win rate |
| Cohere Reranking | | | +169% relevance |

Changes

Search Engine (search_engine.py)

  • YAML document formatting for optimal Cohere performance
  • Post-rerank test file filtering (fixes test pollution leakage)
  • Relevance threshold filtering (score >= 0.01)
  • Fixed metrics calls (increment/timing)

Integration (integration.py)

  • Fixed Cohere API key passing

Validation

  • Tested across Starlette + Flask repositories
  • 42 queries tested, 33 V3 wins
  • Edge cases validated (empty queries, special chars)

Test Scripts Added

  • validate_cohere_rerank.py - Cohere ON vs OFF comparison
  • final_v3_test.py - Comprehensive V2 vs V3 testing
  • cross_repo_test.py - Multi-repository validation

Environment Variables Required

  • COHERE_API_KEY - For reranking (optional, graceful fallback)

Summary by CodeRabbit

  • New Features
    • V3 search enabled by default with intent-aware query expansion, code-graph boosting, optional test-file inclusion, and a new search_version field in responses.
  • Chores
    • Added Voyage embeddings support and example environment keys; added voyageai dependency and a new point-in-time gauge metric.
  • Tests
    • Added extensive benchmarking and evaluation scripts plus integration tests validating V3 components and end-to-end behavior.

🎯 Major Features:
- Voyage AI code-specific embeddings (voyage-code-3)
  - 13.8% better accuracy than OpenAI for code search
  - Auto-fallback to OpenAI if Voyage unavailable

- Query Understanding & Intent Classification
  - Detects: FIND, EXPLAIN, USAGE, DEBUG intents
  - Smart query expansion with code synonyms
  - 'json' → 'JSONResponse, json_response, application/json'

- Code Graph Ranking (PageRank-style)
  - Integrates with existing dependency analyzer
  - Importance scoring based on usage/references
  - Files depended on by many others rank higher

- Test File Filtering (CEO request!)
  - Auto-detection of test files
  - User-configurable via include_tests parameter
  - Default: exclude tests (-70% penalty when included)
  - Core files boost (+30%): main.py, index.js, etc.

📁 New Files:
- services/search_v3/embedding_provider.py - Voyage/OpenAI abstraction
- services/search_v3/query_understanding.py - Intent classification
- services/search_v3/code_graph_ranker.py - Importance ranking
- services/search_v3/search_engine.py - Main orchestrator
- services/search_v3/integration.py - Bridge to indexer
- tests/test_search_v3.py - 14 passing unit tests
- scripts/benchmark_search_v3.py - V2 vs V3 comparison

📝 Modified:
- routes/playground.py - Added use_v3 and include_tests params
- services/indexer_optimized.py - Added search_v3() method
- requirements.txt - Added voyageai>=0.3.0
- .env.example - Added VOYAGE_API_KEY

🔧 API Changes:
- Search now uses V3 by default (use_v3=True)
- New response field: search_version ('v2' or 'v3')
- New result field: is_test_file (boolean)

Tests: 14/14 passing
- Fix Cohere API key passing in integration layer
- Add YAML document formatting for optimal Cohere performance
- Add post-rerank test file filtering (92% test pollution reduction)
- Fix metrics calls (increment/timing instead of gauge/counter)
- Add relevance threshold filtering (score >= 0.01)

Performance improvements:
- V3 wins 79% of queries vs V2
- Test pollution: 36 → 3 (92% reduction)
- 33/42 query wins across Starlette + Flask repos

Test scripts added for validation:
- validate_cohere_rerank.py
- final_v3_test.py
- cross_repo_test.py
- extended_v3_test.py

vercel Bot commented Jan 26, 2026

@DevanshuNEU is attempting to deploy a commit to the Dev's projects Team on Vercel.

A member of the Team first needs to authorize it.


coderabbitai Bot commented Jan 26, 2026

Walkthrough

Adds a Search V3 semantic code-search pipeline (embeddings, query understanding, code‑graph ranking, optional Cohere reranking), integrates it into the optimized indexer and playground with V2 fallback, and introduces embedding providers, test-detection utilities, observability gauges, V2 tokenization improvements, tests, and many evaluation scripts.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration & Dependencies: backend/.env.example, backend/requirements.txt | Adds COHERE_API_KEY and VOYAGE_API_KEY to env example and voyageai>=0.3.0 to requirements. |
| Playground Route: backend/routes/playground.py | Adds use_v3 and include_tests flags to request model, incorporates them into cache key, conditionally calls indexer.search_v3 or V2, reformats/caches V3-style results, and returns search_version. |
| Indexer Integration: backend/services/indexer_optimized.py | Adds search_v3(...), refactors embedding wiring, integrates Search V3 via SearchV3Integration with metrics and V2 fallback; applies test-file filtering when falling back. |
| Search V3 Package: backend/services/search_v3/* | New package: embedding provider (embedding_provider.py), query understanding (query_understanding.py), code-graph ranker (code_graph_ranker.py), search engine (search_engine.py), integration singleton (integration.py), and re-exporting __init__.py. |
| Search V2 Enhancements: backend/services/search_v2/hybrid_searcher.py | Adds camelCase-aware tokenization, richer BM25 corpus fields, and updated doc generation/fallbacks for reranking. |
| Test Detection Utilities: backend/utils/test_detection.py | New centralized test-file detection, filtering, and top-N checks used across V2/V3 and scripts. |
| Observability: backend/services/observability.py | Adds in-memory gauges API (Metrics.gauge, _gauges) and includes gauges in get_stats(); reset clears gauges. |
| Benchmarking & Validation Scripts: backend/scripts/* | Adds many new standalone scripts for evaluation and validation: benchmark_search_v3.py, cross_repo_test.py, edge_case_test.py, extended_query_test.py, extended_v3_test.py, final_v3_test.py, human_query_test.py, validate_cohere_rerank.py. |
| Tests: backend/tests/* | Adds test_search_v3.py integration tests and updates test_anonymous_indexing.py to use search_v3 and new result schema. |
| Miscellaneous: backend/utils/__init__.py | Adds a minimal module comment. |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant Playground
    participant IndexerOptimized
    participant SearchV3Integration
    participant QueryUnderstanding
    participant EmbeddingProvider
    participant Pinecone
    participant CodeGraphRanker
    participant Cohere

    Client->>Playground: search(query, use_v3, include_tests)
    Playground->>IndexerOptimized: search_v3 or search_v2(query,...)
    IndexerOptimized->>SearchV3Integration: search(query, repo_id, config)
    SearchV3Integration->>QueryUnderstanding: analyze(query)
    QueryUnderstanding-->>SearchV3Integration: QueryAnalysis
    SearchV3Integration->>EmbeddingProvider: embed_query(expanded_query)
    EmbeddingProvider-->>SearchV3Integration: embedding
    SearchV3Integration->>Pinecone: vector_search(embedding, top_k)
    Pinecone-->>SearchV3Integration: raw_matches
    SearchV3Integration->>CodeGraphRanker: calculate_importance(repo_id, deps)
    CodeGraphRanker-->>SearchV3Integration: importance_map
    SearchV3Integration->>CodeGraphRanker: boost_and_filter_results(matches, include_tests)
    CodeGraphRanker-->>SearchV3Integration: boosted_results
    alt use_reranking
        SearchV3Integration->>Cohere: rerank(query, results_yaml)
        Cohere-->>SearchV3Integration: reranked_results
    end
    SearchV3Integration-->>IndexerOptimized: final_results
    IndexerOptimized-->>Playground: normalized_results (search_version)
    Playground-->>Client: response

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 I hop through lines of code and light,

Embeddings glow, intent takes flight,
Graphs nudge the best, tests tucked away,
Reranks polish answers for the day,
V3 sings soft — a clever rabbit's delight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 76.19% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title directly reflects the main feature being introduced: a new Search V3 system called 'Project Brain' with Cohere Reranking integration, which aligns with the substantial changeset adding new search pipelines, embedding providers, query understanding, and validation scripts. |


- Add pro_user parameter to search methods
- Only use Cohere reranking when pro_user=True
- Free users get base V3 search (still good, just no Cohere)
- Add reranking_used to logs for observability

Cost control: Cohere charges per rerank call

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 17

🤖 Fix all issues with AI agents
In `@backend/routes/playground.py`:
- Around line 52-54: The cache key currently only uses query + repo and
therefore returns incorrect results when use_v3 or include_tests change; update
the cache key construction where results are cached (the block that builds the
key for the playground search — the same area referenced around lines 435-472)
to incorporate the flags use_v3 and include_tests (e.g., append or include their
boolean values) so cached entries are distinct per combination, or alternatively
bypass caching when either flag is non-default; ensure you update every caching
path in the route handler that currently keys by query+repo so both use_v3 and
include_tests are considered.
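
A minimal sketch of the cache-key fix described above. The real key-building code in playground.py is not shown in this thread, so `build_cache_key` and its `search:` prefix are hypothetical; the point is only that every input that changes the results becomes part of the key.

```python
import hashlib
import json


def build_cache_key(query: str, repo_id: str, use_v3: bool = True, include_tests: bool = False) -> str:
    """Hypothetical helper: one cache entry per (repo, query, flags) combination."""
    # Serialize every input that can change the results.
    # A JSON list avoids collisions that a plain delimiter-joined string could produce.
    payload = json.dumps([repo_id, query, use_v3, include_tests])
    return "search:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With this shape, a V2 request and a V3 request for the same query and repo can no longer share a cached entry.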

In `@backend/scripts/benchmark_search_v3.py`:
- Around line 226-235: The current code treats the method is_voyage_enabled as a
truthy attribute; change the conditional to call the method: use
v3.is_voyage_enabled() instead of v3.is_voyage_enabled, i.e., update the branch
in the get_search_v3() block to call the method; also replace the bare except
with a specific Exception capture (e.g., except Exception as e) so errors can be
logged or printed for visibility while preserving the same success/fallback
behavior.

In `@backend/scripts/cross_repo_test.py`:
- Around line 9-10: Remove the hard-coded VOYAGE_API_KEY assignment in
cross_repo_test.py (the os.environ["VOYAGE_API_KEY"] = "...") and instead load
the key from the environment or a local .env file; update the script to read
os.environ.get("VOYAGE_API_KEY") (or use python-dotenv to load a .env) and fail
fast with a clear message if not present, and ensure the leaked key is
rotated/removed from history and CI uses secure secret storage rather than
embedding the value in source.
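
One possible shape for the "read from the environment and fail fast" pattern requested here and in the other script comments. `require_env` is a hypothetical helper name, not something from the PR.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error if it is unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it in your shell or add it to a local .env file"
        )
    return value
```

A script would then call `require_env("VOYAGE_API_KEY")` once at startup instead of assigning a literal key to `os.environ`.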

In `@backend/scripts/edge_case_test.py`:
- Around line 9-10: The file currently sets os.environ["VOYAGE_API_KEY"] to a
hard-coded secret; remove that assignment and instead load the key from the
environment (or via a local .env loader like python-dotenv) so the code reads
os.environ.get("VOYAGE_API_KEY") where needed; rotate the leaked key immediately
and ensure no other commits contain the old value (search for VOYAGE_API_KEY),
and add a note in README or env.example showing the expected env var name rather
than embedding the secret in backend/scripts/edge_case_test.py (references:
sys.path.insert and os.environ["VOYAGE_API_KEY"]).

In `@backend/scripts/extended_query_test.py`:
- Around line 10-12: Remove the hardcoded VOYAGE_API_KEY assignment: delete the
line that sets os.environ["VOYAGE_API_KEY"] to the literal key, and instead read
the key from the environment (os.environ.get("VOYAGE_API_KEY")) or from a local
secrets loader (e.g., python-dotenv) if needed; update any test setup in
extended_query_test.py to fail loudly or skip tests when VOYAGE_API_KEY is
missing so CI/devs know to provide credentials, and ensure the leaked key is
rotated/invalidated outside this repo.

In `@backend/scripts/extended_v3_test.py`:
- Around line 62-71: Replace the two bare except: clauses that wrap calls to
indexer.search_v2 and indexer.search_v3 with except Exception as e: so
SystemExit/KeyboardInterrupt can propagate; capture the exception into a
variable (e) and either log it (e.g., logger.exception or print) or ignore after
setting v2_results = [] / v3_results = [] to preserve current behavior; locate
the try/except blocks around indexer.search_v2(query, repo_id, top_k=3) and
indexer.search_v3(query, repo_id, top_k=3, include_tests=False) and make the
change for both.
- Around line 10-11: Remove the hardcoded production API key assignment in
extended_v3_test.py (the os.environ["VOYAGE_API_KEY"] = "...") and replace it by
reading the key from a secure source at runtime (e.g.,
os.getenv("VOYAGE_API_KEY") or a config/secret manager) so no secret is
committed; update any test harness that depends on this to allow injecting the
key via environment or a CI secret, and ensure the exposed key is rotated
immediately since it may be compromised.

In `@backend/scripts/final_v3_test.py`:
- Around line 10-12: Remove the hardcoded secret assignment in final_v3_test.py
by deleting the line that sets os.environ["VOYAGE_API_KEY"] = "...", instead
read the API key from the environment or test fixture (e.g., expect
os.environ.get("VOYAGE_API_KEY") in tests) and update any test setup to inject a
mock/stub key when needed; also ensure the real key is rotated and stored in
CI/secret manager rather than in code and update any documentation or test
harness that relied on the hardcoded value.

In `@backend/scripts/human_query_test.py`:
- Around line 13-15: Remove the hardcoded secret assignment to
os.environ["VOYAGE_API_KEY"] in backend/scripts/human_query_test.py (delete the
line that sets the live key), rely on the existing environment value or load
from a secure config, and add a check that raises a clear error if
VOYAGE_API_KEY is not present; rotate the exposed key immediately since it was
leaked.

In `@backend/services/indexer_optimized.py`:
- Around line 541-550: Remove the redundant manual assignment of the embedding
function: delete the line that sets searcher.embed = embed_query because the
HybridSearcher constructor already assigns embedding_fn to self.embed; leave the
async embed_query(q: str) -> List[float] helper and the HybridSearcher(...)
construction intact so the embedding_fn passed to the constructor is used.

In `@backend/services/search_v3/code_graph_ranker.py`:
- Around line 67-134: The calculate_importance function currently iterates only
over file_dependencies.keys(), which skips files that only appear as dependency
values; change the iteration to use the union of all files (e.g., all_files =
set(file_dependencies.keys()) | set(dependent_counts.keys())) so files that are
depended-upon but not present as keys get scored and added to importance_map;
update references to file_path iteration and keep dep_count =
dependent_counts.get(file_path, 0) and the rest of the scoring logic unchanged
(symbols: calculate_importance, file_dependencies, dependent_counts,
importance_map).
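
A sketch of the union-of-files iteration suggested above. The scoring formula used by the real calculate_importance is not shown in this thread, so the normalization below (dependents divided by the maximum dependent count) is an illustrative assumption only.

```python
from collections import Counter
from typing import Dict, List


def calculate_importance(file_dependencies: Dict[str, List[str]]) -> Dict[str, float]:
    """Score each file by how many other files depend on it (illustrative scoring)."""
    # Count how many files list each path as a dependency.
    dependent_counts = Counter(
        dep for deps in file_dependencies.values() for dep in set(deps)
    )
    # Union of keys and dependency values, so files that are only depended upon still get scored.
    all_files = set(file_dependencies) | set(dependent_counts)
    max_count = max(dependent_counts.values(), default=0)

    importance_map: Dict[str, float] = {}
    for file_path in all_files:
        dep_count = dependent_counts.get(file_path, 0)
        importance_map[file_path] = dep_count / max_count if max_count else 0.0
    return importance_map
```

Here a file such as `utils.py` that never appears as a key, but is imported by several others, still receives the highest score.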

In `@backend/services/search_v3/integration.py`:
- Around line 51-55: Replace the bare except in the Voyage embedding init with a
guarded exception handler that catches only Exception (not BaseException) and
logs the failure before falling back; specifically update the try/except around
get_embedding_provider("voyage") in integration.py (the block that sets
self._voyage_embedding_provider) to catch Exception and call logger.exception or
logger.error with the exception info and a clear message like "Failed to
initialize Voyage embedding provider, falling back to index provider", then set
self._voyage_embedding_provider = self._index_embedding_provider as the
fallback.
- Around line 26-33: The singleton _instance in SearchV3Integration is not
thread-safe; add a class-level lock (e.g., _lock = threading.Lock()) and
implement double-checked locking in get_instance to prevent concurrent
initializations: check cls._instance, acquire cls._lock, check again, then
assign cls._instance = cls() if still None, and finally return cls._instance;
reference the class name SearchV3Integration and methods/vars _instance,
get_instance, and _lock when making the change.
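
The double-checked locking pattern described above can be sketched as follows. The class body is a stand-in that shows only the singleton mechanics; the real SearchV3Integration has more state.

```python
import threading


class SearchV3Integration:
    """Stand-in class: only the thread-safe singleton accessor is shown."""

    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get_instance(cls) -> "SearchV3Integration":
        # Fast path: skip the lock entirely once the instance exists.
        if cls._instance is None:
            with cls._lock:
                # Second check: another thread may have created it while we waited for the lock.
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance
```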

In `@backend/services/search_v3/query_understanding.py`:
- Around line 117-133: The analyze method lowercases the query before calling
_extract_code_terms, which prevents CamelCase detection; change the call site so
_extract_code_terms receives the original query (e.g., pass the raw query
variable) while keeping query_lower for intent detection, _extract_keywords, and
_expand_query (synonym) logic; do the same pattern in the other
occurrence that mirrors this logic (the block around the second analyze-like
flow) so CamelCase matches in _extract_code_terms remain intact.
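
To illustrate why the original query must reach the term extractor: the pattern below is a hypothetical CamelCase regex, not the one in query_understanding.py, and it deliberately ignores acronym-led names such as JSONResponse.

```python
import re
from typing import List

# Hypothetical pattern: two or more capitalized parts, e.g. HybridSearcher.
CAMEL_CASE = re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b")


def extract_code_terms(query: str) -> List[str]:
    """Return CamelCase identifiers found in the query as typed by the user."""
    return CAMEL_CASE.findall(query)


query = "where is HybridSearcher created"
query_lower = query.lower()  # still fine for intent detection and keyword extraction

print(extract_code_terms(query))        # ['HybridSearcher']
print(extract_code_terms(query_lower))  # []
```

Lowercasing first erases exactly the signal the pattern depends on, which is the bug the comment describes.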

In `@backend/services/search_v3/search_engine.py`:
- Around line 294-303: The code is using deprecated asyncio.get_event_loop()
inside an async function; replace it with asyncio.get_running_loop() in the
block that calls self.cohere_client.rerank so the executor call remains the
same. Locate the lambda invoking self.cohere_client.rerank (producing
rerank_response from query, documents, top_k) and change loop =
asyncio.get_event_loop() to loop = asyncio.get_running_loop() to avoid
deprecation warnings and ensure correct behavior at runtime.
- Around line 323-326: The code records an average relevance score using
metrics.timing (which is for durations); change the metric call in the reranking
block so it records a value-type metric instead (e.g., use metrics.gauge or
metrics.histogram) for avg_score rather than metrics.timing: locate the
rerank-related variables (reranked, avg_score) in search_engine.py and replace
the metrics.timing("search.rerank.avg_score", ...) call with a gauge/histogram
call (e.g., metrics.gauge("search.rerank.avg_score", avg_score * 100)) while
keeping the metrics.increment("search.rerank.success") line intact.
🧹 Nitpick comments (8)
backend/tests/test_search_v3.py (1)

44-50: Minor: Prefer direct boolean assertions.

Using assert analysis.should_include_tests instead of == True is more Pythonic and reads better. This is purely stylistic.

Suggested style improvement
     def test_include_tests_detection(self):
         """Should detect when tests should be included"""
         analysis = self.qu.analyze("show me test examples for auth")
-        assert analysis.should_include_tests == True
+        assert analysis.should_include_tests
         
         analysis = self.qu.analyze("find auth handler")
-        assert analysis.should_include_tests == False
+        assert not analysis.should_include_tests
backend/services/search_v3/embedding_provider.py (1)

71-124: Prefer get_running_loop() in async context.

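Inside a coroutine, asyncio.get_running_loop() is the preferred call: it returns the loop that is running the coroutine and raises RuntimeError if none is running. asyncio.get_event_loop() has more complex behavior and, in newer Python versions, emits a DeprecationWarning when it is called without an available event loop.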

♻️ Suggested change
-                loop = asyncio.get_event_loop()
+                loop = asyncio.get_running_loop()
...
-            loop = asyncio.get_event_loop()
+            loop = asyncio.get_running_loop()
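
For context, a self-contained sketch of the pattern the suggestion targets: offloading a blocking SDK call to the default executor from inside a coroutine. `blocking_embed` is a stand-in for the synchronous provider call, not the project's code.

```python
import asyncio
import time


def blocking_embed(text: str) -> int:
    """Stand-in for a synchronous embedding SDK call."""
    time.sleep(0.01)
    return len(text)


async def embed_query(text: str) -> int:
    # get_running_loop() raises RuntimeError if called outside a running loop,
    # which makes misuse obvious instead of silently creating a new loop.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_embed, text)
```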
backend/services/search_v3/integration.py (2)

112-115: Accessing private method _is_test_file breaks encapsulation.

Consider using the public filter_test_files method or requesting that CodeGraphRanker expose a public is_test_file method.


146-147: Redundant parameter in conditional branch.

When this branch executes, include_tests is always False. Consider passing the literal for clarity:

         if not include_tests:
-            boosted = self._code_graph_ranker.filter_test_files(boosted, include_tests)
+            boosted = self._code_graph_ranker.filter_test_files(boosted, False)
backend/services/search_v3/search_engine.py (4)

33-48: SearchResult dataclass is defined but unused.

All methods return List[Dict] instead of List[SearchResult]. Consider either using this dataclass for type safety or removing it to avoid confusion.


148-156: Post-rerank test filtering accesses private method.

Line 153 directly accesses self.code_graph_ranker._is_test_file(). Consider using the public filter_test_files method for consistency:

                 if not include_tests:
-                    results = [r for r in results if not self.code_graph_ranker._is_test_file(r.get('file_path', ''))]
+                    results = self.code_graph_ranker.filter_test_files(results, include_tests=False)

225-244: Simplified BM25 may produce false-positive matches.

The substring check term in text matches partial words (e.g., query term "get" matches "together"). Consider using word boundaries:

import re
matches = sum(1 for term in query_terms if re.search(rf'\b{re.escape(term)}\b', text))

This is a minor issue given the "simplified" comment, but worth noting for future improvement.
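
A small runnable comparison of the two matching strategies. The function names are hypothetical; the text field is a plain string here.

```python
import re
from typing import List


def count_substring_matches(query_terms: List[str], text: str) -> int:
    """Current behavior: a term counts if it appears anywhere, even inside another word."""
    return sum(1 for term in query_terms if term in text)


def count_word_matches(query_terms: List[str], text: str) -> int:
    """Suggested behavior: a term counts only when it appears as a whole word."""
    return sum(1 for term in query_terms if re.search(rf"\b{re.escape(term)}\b", text))


print(count_substring_matches(["get"], "put it together"))  # 1
print(count_word_matches(["get"], "put it together"))       # 0
print(count_word_matches(["get"], "get user"))              # 1
```

One caveat for code search: `\b` treats the underscore as a word character, so `get` does not match inside `get_user`. Whether that is desirable depends on how identifiers are tokenized before this step.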


343-376: Convenience function creates new SearchEngineV3 on every call.

Each invocation instantiates new embedding provider, Cohere client, and ranker objects. For production use, prefer the singleton via get_search_v3() from integration.py. Consider adding a note or deprecation warning:

 async def search_v3(...) -> List[Dict]:
     """
     Convenience function for Search V3
+    
+    Note: Creates a new engine instance per call. For production use,
+    prefer get_search_v3().search() which uses a singleton.
     ...
     """

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@backend/services/search_v3/search_engine.py`:
- Around line 230-249: The inline comment in _apply_bm25_boost is inaccurate: it
says "check for keyword matches in name and code" but the constructed text only
uses name, qualified_name, and summary (the variable text and query_terms).
Either include the code field in the text aggregation (add result.get('code',
'') to the text) if you intend to boost on code, or change the comment to
accurately state that boosts are applied based on name, qualified_name, and
summary; update the comment above the text variable accordingly and keep the
rest of the function unchanged.
- Around line 348-381: The convenience function search_v3 instantiates a new
SearchEngineV3 on every call and omits the pro_user flag; change it to reuse a
cached/shared SearchEngineV3 instance (e.g., a module-level singleton named
engine or by calling the integration layer's factory) to avoid reinitializing
embeddings/rerankers per request, and add a pro_user: bool = False parameter to
search_v3 that is passed through into the SearchConfig (or into engine.search)
so Cohere reranking can be enabled when use_reranking=True and pro_user is true;
update references to SearchEngineV3, SearchConfig, and the search call
accordingly.
- Around line 251-267: The _format_doc_as_yaml method interpolates fields (name,
type, qualified_name, signature, summary, code) directly into a YAML string
which can break when values contain special YAML characters; update
_format_doc_as_yaml to build a Python dict with those keys and then serialize it
with a YAML library (e.g., yaml.safe_dump) to ensure proper quoting/escaping and
preserve the code block as a literal block (|) or use the library's block style
option for the code field; reference the function name _format_doc_as_yaml and
the local variables file_name and code_snippet when making the change so you
replace the manual f-string construction with proper yaml.safe_dump of { "name":
..., "type": ..., "file": file_name, "qualified_name": ..., "signature": ...,
"summary": ..., "code": code_snippet }.
🧹 Nitpick comments (2)
backend/services/search_v3/integration.py (2)

112-115: Exposing internal _is_test_file method through public API.

The is_test_file method delegates to self._code_graph_ranker._is_test_file(), which is a private method (prefixed with _). This creates a coupling to an internal implementation detail that could break if CodeGraphRanker refactors its internals.

Consider either making _is_test_file a public method on CodeGraphRanker, or using the public filter_test_files method instead.


151-184: Potentially confusing interaction between use_reranking and pro_user parameters.

The search method accepts both use_reranking (default True) and pro_user (default False). In search_engine.py, reranking only happens when both are true. This means:

  • use_reranking=True, pro_user=False → no reranking (but caller might expect it)
  • The docstring only documents pro_user but not use_reranking

Consider either:

  1. Documenting both parameters and their interaction clearly
  2. Removing use_reranking and relying solely on pro_user for reranking control
  3. Renaming to clarify intent (e.g., enable_reranking_if_available)
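
The interaction can be made explicit in one place. The helper below is a hypothetical sketch; `reranker_available` is an assumed extra input standing in for "a Cohere client was configured", based on the PR's note that COHERE_API_KEY is optional with a graceful fallback.

```python
def should_rerank(
    use_reranking: bool = True,
    pro_user: bool = False,
    reranker_available: bool = True,
) -> bool:
    """Reranking runs only when the caller allows it, the user is pro, and a reranker exists."""
    return use_reranking and pro_user and reranker_available


# Truth table for the two documented flags (reranker assumed available).
for use_reranking in (True, False):
    for pro_user in (True, False):
        print(use_reranking, pro_user, should_rerank(use_reranking, pro_user))
```

With the defaults (`use_reranking=True`, `pro_user=False`) the result is False, which is the case a caller might not expect.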
📝 Suggested documentation improvement
     async def search(
         ...
     ) -> List[Dict]:
         """
         Full Search V3 pipeline
         
         Args:
             pro_user: Enable Cohere reranking (costs money, pro tier only)
+            use_reranking: Allow reranking if pro_user is True (default True).
+                          Reranking only occurs when both use_reranking=True AND pro_user=True.
         """

Pass original query to _extract_code_terms instead of lowercased version
- Added gauge() method to Metrics class for point-in-time values
- Use gauge instead of timing for avg_score (semantic correctness)
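
A minimal sketch of what an in-memory gauge alongside counters and timings might look like. This is not the project's Metrics class; only the method names `increment`, `timing`, `gauge`, `get_stats`, and `reset` come from this thread, and the storage layout is an assumption.

```python
from collections import defaultdict
from typing import Dict, List


class Metrics:
    """Illustrative in-memory metrics store."""

    def __init__(self) -> None:
        self._counters: Dict[str, int] = defaultdict(int)
        self._timings: Dict[str, List[float]] = defaultdict(list)
        self._gauges: Dict[str, float] = {}

    def increment(self, name: str, value: int = 1) -> None:
        # Counters accumulate.
        self._counters[name] += value

    def timing(self, name: str, duration_ms: float) -> None:
        # Timings keep every observed duration.
        self._timings[name].append(duration_ms)

    def gauge(self, name: str, value: float) -> None:
        # A gauge is a point-in-time value: the latest write wins.
        self._gauges[name] = value

    def get_stats(self) -> dict:
        return {
            "counters": dict(self._counters),
            "timings": {name: list(values) for name, values in self._timings.items()},
            "gauges": dict(self._gauges),
        }

    def reset(self) -> None:
        self._counters.clear()
        self._timings.clear()
        self._gauges.clear()
```

Recording an average relevance score with `gauge` keeps it out of duration statistics, which is the semantic mismatch the review flagged.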

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@backend/scripts/extended_query_test.py`:
- Around line 103-118: The loop currently uses bare excepts for
indexer.search_v2 and indexer.search_v3 which swallow all errors; change each
bare except to except Exception as e and log the exception (e.g.,
logging.exception or logger.exception) including context like the query and
whether it was V2 or V3, then set v2_results/v3_results = [] and continue so
failures remain visible in logs while preserving the fallback behavior;
reference the calls indexer.search_v2 and indexer.search_v3 and the
evaluate_results usage when adding the logging.

In `@backend/scripts/final_v3_test.py`:
- Around line 76-91: The bare except blocks around calls to
indexer.search_v2(...) and indexer.search_v3(...) hide errors and should be
changed to catch Exception as e, record/log the exception, and then set
v2_results/v3_results = [] so benchmarking continues; specifically, replace the
anonymous except in the V2 block (surrounding indexer.search_v2(query,
repo["id"], top_k=3) that currently assigns v2_results = []) and the V3 block
(surrounding indexer.search_v3(query, repo["id"], top_k=3, include_tests=False)
that assigns v3_results = []) with except Exception as e: and call an
appropriate logger (or print) to include e and contextual info (query,
repo["id"], function name) before falling back to an empty list, preserving the
timing calculations.
- Around line 10-16: The script checks os.environ.get("VOYAGE_API_KEY") but
never loads .env; call load_dotenv() from python-dotenv near the top of
backend/scripts/final_v3_test.py (after imports and before the os.environ.get
check) so environment variables from .env are loaded; ensure you import
load_dotenv if not already and place the call before the VOYAGE_API_KEY
existence check.

In `@backend/scripts/human_query_test.py`:
- Around line 11-16: The script checks for VOYAGE_API_KEY but never loads .env,
so call load_dotenv() from python-dotenv before the environment check; import
load_dotenv and invoke load_dotenv() (e.g., immediately after the existing
sys.path.insert(...) line) so that VOYAGE_API_KEY from the .env file is
available to the subsequent if not os.environ.get("VOYAGE_API_KEY") check.

In `@backend/services/indexer_optimized.py`:
- Around line 570-631: The search_v3 entrypoint currently ignores the pro_user
flag so Cohere reranking never gets enabled; update the
OptimizedCodeIndexer.search_v3 signature to accept a pro_user: bool (default
False) and forward that argument into the call to v3.search (the
SearchV3Integration returned by get_search_v3), i.e., pass pro_user=pro_user
when calling v3.search; ensure any callers are updated or the default preserves
behavior.
🧹 Nitpick comments (2)
backend/services/search_v3/code_graph_ranker.py (1)

67-135: Consider cache invalidation to avoid stale importance.
The cache key only uses repo_id, so dependency graph updates won’t refresh scores until process restart. A lightweight invalidation hook or hash-based cache key would prevent stale ranking.
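
One way the hash-based cache key could look. Both function names are hypothetical, and the sketch assumes the dependency graph is available as a mapping from file path to a list of dependency paths.

```python
import hashlib
import json
from typing import Dict, List


def graph_fingerprint(file_dependencies: Dict[str, List[str]]) -> str:
    """Stable short hash of the dependency graph, independent of dict and list ordering."""
    canonical = json.dumps(
        {path: sorted(deps) for path, deps in file_dependencies.items()},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def importance_cache_key(repo_id: str, file_dependencies: Dict[str, List[str]]) -> str:
    # The key changes whenever the graph changes, so stale scores are never reused.
    return f"{repo_id}:{graph_fingerprint(file_dependencies)}"
```

The trade-off is that the graph must be serialized on every lookup; for very large repositories an explicit invalidation hook on re-index may be cheaper.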

backend/services/search_v3/integration.py (1)

39-77: Guard lazy initialization against concurrent calls.
Multiple threads can enter _ensure_initialized simultaneously, duplicating heavy setup. Consider an instance-level init lock with double-checking.

♻️ Suggested locking for initialization
 class SearchV3Integration:
@@
     def __init__(self):
         self._initialized = False
         self._index_embedding_provider = None
         self._voyage_embedding_provider = None
         self._search_engine = None
         self._query_understanding = None
         self._code_graph_ranker = None
+        self._init_lock = threading.Lock()
@@
     def _ensure_initialized(self):
         """Lazy initialization"""
-        if not self._initialized:
-            try:
+        if self._initialized:
+            return
+        with self._init_lock:
+            if self._initialized:
+                return
+            try:
                 # for SEARCH: use OpenAI to match existing index (1536 dim)
                 self._index_embedding_provider = get_embedding_provider("openai")
@@
-                self._initialized = True
+                self._initialized = True
                 logger.info("SearchV3Integration initialized",
                            query_model=self._index_embedding_provider.model_name,
                            index_model=self._voyage_embedding_provider.model_name)
             except Exception as e:
                 logger.error("Failed to initialize SearchV3Integration", error=str(e))
                 raise
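
The double-checked locking pattern in the suggestion can be exercised in isolation. The sketch below uses a hypothetical `LazyService` class with a counter standing in for the heavy setup; it is not the `SearchV3Integration` code itself.

```python
import threading


class LazyService:
    """Hypothetical stand-in for a class with expensive one-time setup."""

    def __init__(self):
        self._initialized = False
        self._init_lock = threading.Lock()
        self.setup_calls = 0

    def _ensure_initialized(self):
        if self._initialized:       # fast path: no lock once setup is done
            return
        with self._init_lock:
            if self._initialized:   # re-check: another thread may have finished first
                return
            self.setup_calls += 1   # placeholder for the heavy setup work
            self._initialized = True


service = LazyService()
threads = [threading.Thread(target=service._ensure_initialized) for _ in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(service.setup_calls)  # prints 1: setup ran exactly once
```

The second check inside the lock is what prevents duplicate setup: threads that were waiting on the lock see `_initialized` already set and return immediately.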


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@backend/scripts/final_v3_test.py`:
- Around line 46-51: The has_test_file function is too broad because checking
"test" anywhere in fp matches names like latest.py; update has_test_file to use
path/filename pattern checks instead: split the path (use the existing fp
string) and detect test directories ('/test/' or '/tests/' segments) or
filenames where the basename starts with "test" or ends with "_test" or has
".spec" before the extension (e.g., test_*.py, *_test.py, *.spec.py); keep using
lowercase fp and only return True when one of these specific patterns matches,
otherwise return False.

In `@backend/services/indexer_optimized.py`:
- Around line 630-636: The fallback path in the except block for search_v3
currently returns await self.search_v2(...) without respecting the include_tests
flag, which can reintroduce test files; update the except handler to call
self.search_v2(...) and then, if include_tests is False, apply the same
test-filter heuristic used in search_v3 (reuse the test-detection helper or the
filtering logic from search_v3) to the V2 results before returning, ensuring the
returned results conform to include_tests; keep the capture_exception, logger,
and metrics lines as-is and perform filtering on the result object returned by
search_v2 (e.g., results = await self.search_v2(...); if not include_tests:
results = filter_out_tests(results)).
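
A minimal sketch of the fallback behaviour described for `search_v3`, assuming results are dictionaries with a `file_path` key. The `Indexer` class, `_run_v3`, and the simplified `filter_out_tests` heuristic are hypothetical placeholders; the real fix would reuse the project's own test-detection helper.

```python
import asyncio
from typing import Dict, List

Result = Dict[str, str]


def filter_out_tests(results: List[Result]) -> List[Result]:
    """Placeholder heuristic: drop results that live under a tests/ directory."""
    return [r for r in results if "/tests/" not in "/" + r.get("file_path", "")]


class Indexer:
    async def search_v2(self, query: str) -> List[Result]:
        # Canned V2 results, one of which is a test file.
        return [
            {"file_path": "app/routing.py"},
            {"file_path": "tests/test_routing.py"},
        ]

    async def _run_v3(self, query: str, include_tests: bool) -> List[Result]:
        raise RuntimeError("simulated V3 failure")

    async def search_v3(self, query: str, include_tests: bool = False) -> List[Result]:
        try:
            return await self._run_v3(query, include_tests)
        except Exception:
            # capture_exception, logging, and metrics would stay here unchanged.
            results = await self.search_v2(query)
            if not include_tests:
                results = filter_out_tests(results)
            return results


print(asyncio.run(Indexer().search_v3("routing")))
```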
🧹 Nitpick comments (1)
backend/scripts/human_query_test.py (1)

131-134: Clamping scores to 0 hides negative signals.
Negative totals convey “worse than no results”; clamping can inflate ties and understate test-pollution impact. Consider returning the raw score (or exposing both).

♻️ Proposed tweak
-        "score": max(0, total_score),
+        "score": total_score,
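
To illustrate the concern, here is a small sketch with made-up totals (the numbers are illustrative only, not taken from the benchmark): once clamped, a query with no results and a query polluted by test files become indistinguishable.

```python
def clamped(total_score: float) -> float:
    """The behaviour being questioned: negative totals are floored at 0."""
    return max(0, total_score)


# Illustrative raw totals, assuming test files carry a negative weight.
raw_scores = {"no_results": 0, "one_test_file": -2, "three_test_files": -6}

clamped_scores = {name: clamped(score) for name, score in raw_scores.items()}
print(clamped_scores)  # every entry collapses to the same value
print(raw_scores)      # raw totals keep the ordering between the three cases
```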

Prevent false positives like 'latest.py' matching 'test'
Added _is_test_file helper for consistent test detection

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/scripts/final_v3_test.py`:
- Around line 154-168: Check for total_queries == 0 before doing divisions: if
total_queries is zero, set avg_v2 and avg_v3 to 0 (and set winner to "TIE" or an
appropriate label) and set the V3 win rate output to 0 (or "N/A"); otherwise
compute avg_v2 = total_v2_time / total_queries, avg_v3 = total_v3_time /
total_queries and V3 win rate = total_v3_wins/total_queries*100 as currently
done. Update the code around the avg_v2/avg_v3/winner calculation and the V3 win
rate print (referencing variables avg_v2, avg_v3, winner, total_queries,
total_v3_wins) to use this guarded logic.
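
A minimal sketch of the guarded summary, using the variable names from the comment above. The `summarize` function and the rule that the lower average time wins are assumptions made for illustration; the script's actual winner logic is not shown in this thread.

```python
from typing import Dict, Union


def summarize(
    total_queries: int,
    total_v2_time: float,
    total_v3_time: float,
    total_v3_wins: int,
) -> Dict[str, Union[float, str]]:
    """Compute averages and win rate without dividing by zero."""
    if total_queries == 0:
        # Nothing ran: report neutral values instead of raising ZeroDivisionError.
        return {"avg_v2": 0.0, "avg_v3": 0.0, "winner": "TIE", "v3_win_rate": 0.0}

    avg_v2 = total_v2_time / total_queries
    avg_v3 = total_v3_time / total_queries
    # Assumption: the faster average is reported as the winner.
    if avg_v3 < avg_v2:
        winner = "V3"
    elif avg_v2 < avg_v3:
        winner = "V2"
    else:
        winner = "TIE"
    return {
        "avg_v2": avg_v2,
        "avg_v3": avg_v3,
        "winner": winner,
        "v3_win_rate": total_v3_wins / total_queries * 100,
    }
```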
🧹 Nitpick comments (2)
backend/services/indexer_optimized.py (1)

642-657: Consider consolidating test-file detection logic.

  1. Redundant import: os is already imported at the module level (line 11).

  2. Code duplication: CodeGraphRanker._is_test_file in backend/services/search_v3/code_graph_ranker.py uses regex-based TEST_PATTERNS. This implementation uses hardcoded string checks, which could lead to inconsistent behavior between V3 search and V2 fallback filtering.

♻️ Proposed fix: Remove redundant import and consider reusing patterns
     def _is_test_file(self, file_path: str) -> bool:
         """Check if file is a test file (stricter pattern matching)"""
-        import os
         fp = file_path.lower()
         # test directories
         if "/test/" in fp or "/tests/" in fp:

For consistency, consider extracting the test patterns to a shared utility or reusing CodeGraphRanker.TEST_PATTERNS to ensure uniform test-file detection across V2 and V3 paths.

backend/scripts/final_v3_test.py (1)

46-62: Duplicate test-file detection logic.

This function mirrors OptimizedCodeIndexer._is_test_file in indexer_optimized.py. Consider extracting to a shared utility module to avoid maintaining identical logic in multiple places.

♻️ Example shared utility

Create a shared helper in a utility module (e.g., backend/utils/test_detection.py):

import os

def is_test_file(file_path: str) -> bool:
    """Check if file is a test file (stricter pattern matching)"""
    fp = file_path.lower()
    if "/test/" in fp or "/tests/" in fp:
        return True
    basename = os.path.basename(fp)
    if basename.startswith("test_") or basename.startswith("test."):
        return True
    if "_test." in basename:  # also covers names ending in "_test.py"
        return True
    if ".spec." in basename:
        return True
    return False

Then import and use in both locations.

- Create utils/test_detection.py as single source of truth
- Reuse in indexer_optimized.py, final_v3_test.py, code_graph_ranker.py
- Remove duplicate is_test_file implementations
- All patterns now use regex for consistency

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/utils/test_detection.py`:
- Around line 10-37: The TEST_PATTERNS and is_test_file currently miss Windows
separators and allow false matches like "contest"; update is_test_file to
normalize separators (e.g., replace backslashes with forward slashes into
file_path_normalized) and use anchored regexes in TEST_PATTERNS that bind to
path boundaries (e.g., use non-capturing groups like (?:^|/) and (?:/|_|$)
around "test[s]?" and similar tokens for
spec/test/conftest/__tests__/fixtures/mocks) so they only match whole path
segments; ensure is_test_file iterates over the anchored TEST_PATTERNS (or
compiles them once) and returns False if no pattern matches.
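
The anchored, separator-normalized matching described above might be sketched as follows. The specific patterns in `_TEST_PATTERNS` are assumptions chosen to illustrate path-boundary anchoring; they are not the contents of `backend/utils/test_detection.py`.

```python
import re
from typing import List, Pattern

# Compiled once at import time. Each pattern is anchored to a path boundary
# so that "test" only matches as a whole segment or a filename prefix/suffix.
_TEST_PATTERNS: List[Pattern[str]] = [
    re.compile(r"(?:^|/)(?:tests?|__tests__|fixtures|mocks)(?:/|$)"),  # directory segment
    re.compile(r"(?:^|/)test_[^/]*$"),                   # test_foo.py
    re.compile(r"(?:^|/)[^/]*_test\.[^/]+$"),            # foo_test.py
    re.compile(r"(?:^|/)[^/]*\.(?:spec|test)\.[^/]+$"),  # foo.spec.ts, foo.test.js
    re.compile(r"(?:^|/)conftest\.py$"),                 # pytest configuration
]


def is_test_file(file_path: str) -> bool:
    """Return True only when a whole path segment or filename looks test-related."""
    # Normalize Windows separators so the same patterns apply on every platform.
    normalized = file_path.replace("\\", "/").lower()
    return any(pattern.search(normalized) for pattern in _TEST_PATTERNS)
```

With these patterns, `src/contest.py` and `src/latest.py` are not flagged, while `C:\repo\tests\test_app.py` and `app/routing_test.py` are.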
🧹 Nitpick comments (1)
backend/services/search_v3/code_graph_ranker.py (1)

190-215: Consider delegating test filtering to the shared utility to avoid drift.
filter_test_files duplicates utils.test_detection.filter_test_files; reuse it and keep the logging here.

♻️ Suggested refactor
-from utils.test_detection import is_test_file as shared_is_test_file
+from utils.test_detection import is_test_file as shared_is_test_file, filter_test_files as shared_filter_test_files
@@
         if include_tests:
             return results
-        
-        filtered = []
-        for result in results:
-            file_path = result.get('file_path', '')
-            if not self._is_test_file(file_path):
-                filtered.append(result)
+        filtered = shared_filter_test_files(results, include_tests=False)
         
         logger.debug("Filtered test files", 
                     original_count=len(results), 
                     filtered_count=len(filtered))

- Normalize Windows separators (backslash to forward slash)
- Use path boundary anchors (?:^|/) and (?:/|$) to prevent false matches
- Pre-compile patterns for performance
- Fixes false positives like 'contest.py' matching 'test'
@vercel

vercel Bot commented Jan 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Review Updated (UTC)
opencodeintel Ignored Ignored Preview Jan 26, 2026 10:06pm

@DevanshuNEU DevanshuNEU merged commit b6acf5d into OpenCodeIntel:main Jan 26, 2026
7 checks passed