refactor: consolidate observability -- merge sentry.py + performance_metrics.py (OPE-87)#256
Conversation
…bility.py Three separate observability systems merged into one: BEFORE: services/sentry.py (175 lines) -- Sentry init, user context, error capture services/performance_metrics.py (79 lines) -- search/indexing stats services/observability.py (366 lines) -- logger, metrics, tracing Two 'metrics' objects with naming collision. 4 duplicate functions across sentry.py and observability.py. 3 files still importing deprecated functions from sentry.py. AFTER: services/observability.py (532 lines) -- everything, one import path Single import: from services.observability import logger, metrics, ... Changes: - Metrics class now includes record_search(), record_indexing(), get_metrics() (absorbed from PerformanceMetrics) - Added init_sentry(), _filter_events(), set_user_context(), capture_http_exception() (moved from sentry.py) - Updated 7 files to import from observability instead of sentry - dependencies.py: removed PerformanceMetrics, uses observability.metrics - Deleted services/sentry.py (175 lines) - Deleted services/performance_metrics.py (79 lines) Net: -92 lines, 2 files deleted, 1 naming collision eliminated. 284 tests pass. Closes OPE-87
|
@DevanshuNEU is attempting to deploy a commit to the Dev's projects Team on Vercel. A member of the Team first needs to authorize it. |
📝 WalkthroughWalkthroughConsolidates observability into a new Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/main.py (1)
143-150: 🛠️ Refactor suggestion | 🟠 MajorOffload synchronous Sentry capture to threadpool in async handler.
The
capture_http_exceptioncall blocks the event loop with synchronous Sentry operations; this violates the coding guideline requiring async/await for I/O in backend Python files. Userun_in_threadpoolto defer the blocking call:⚡ Non-blocking capture
+from starlette.concurrency import run_in_threadpool ... async def generic_exception_handler(request: Request, exc: Exception): """ Catch-all handler for unhandled exceptions. Captures to Sentry and returns 500. """ from services.observability import capture_http_exception - capture_http_exception(request, exc, 500) + await run_in_threadpool(capture_http_exception, request, exc, 500)Also add return type hint
-> JSONResponseto the function signature.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/main.py` around lines 143 - 150, The generic_exception_handler is currently calling the synchronous capture_http_exception directly and blocking the event loop; change generic_exception_handler to declare a return type -> JSONResponse and invoke capture_http_exception inside starlette.concurrency.run_in_threadpool (or equivalent run_in_threadpool import) so the Sentry capture runs off the event loop, then return the appropriate JSONResponse; reference generic_exception_handler and capture_http_exception when making the swap.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/services/observability.py`:
- Around line 531-532: The module currently instantiates a global metrics =
Metrics() directly which violates the established singleton/getter pattern used
in dependency_analyzer.py; change this to a lazy singleton: introduce a private
_metrics = None and a get_metrics() function that creates Metrics() on first
call and returns the same instance thereafter, update callers to use
get_metrics() instead of the global metrics variable so the service follows the
common singleton accessor pattern (referencing the Metrics class, the current
metrics symbol, and the new get_metrics function).
- Around line 446-528: The Sentry helper functions are missing explicit type
annotations; add parameter and return type hints for _filter_events,
set_user_context, and capture_http_exception. Specifically, annotate
_filter_events(event: Dict[str, Any], hint: Optional[Dict[str, Any]]) ->
Optional[Dict[str, Any]]; annotate set_user_context(user_id: Optional[str] =
None, email: Optional[str] = None) -> None; and annotate
capture_http_exception(request: starlette.requests.Request, exc: Exception,
status_code: int) -> None (or use fastapi.Request if you already import that),
and import the required typing symbols (Optional, Dict, Any) at top of the
module. Ensure existing init_sentry retains its -> bool return type.
- Around line 306-442: Add explicit return type annotations to all Metrics
methods to satisfy backend typing rules: annotate __init__ as -> None;
increment, timing, gauge, record_indexing, record_search, and reset as -> None;
ensure get_metrics and get_stats have concrete return types (e.g., -> Dict[str,
Any] or -> Dict) and import Any if needed. Update the signatures for __init__,
increment, timing, gauge, record_indexing, record_search, get_metrics,
get_stats, and reset in the Metrics class accordingly and run the type checker.
- Around line 461-475: Change the hardcoded Sentry privacy flags in the
sentry_sdk.init call so send_default_pii and include_local_variables are driven
by environment variables (SENTRY_SEND_PII, SENTRY_INCLUDE_LOCAL_VARIABLES) that
default to false; update the sentry_sdk.init invocation (the call that currently
sets send_default_pii=True and include_local_variables=True) to read and parse
these env vars (e.g., interpret "true"/"1" as opt-in) and ensure production
defaults remain false, while leaving _filter_events intact for event filtering.
---
Outside diff comments:
In `@backend/main.py`:
- Around line 143-150: The generic_exception_handler is currently calling the
synchronous capture_http_exception directly and blocking the event loop; change
generic_exception_handler to declare a return type -> JSONResponse and invoke
capture_http_exception inside starlette.concurrency.run_in_threadpool (or
equivalent run_in_threadpool import) so the Sentry capture runs off the event
loop, then return the appropriate JSONResponse; reference
generic_exception_handler and capture_http_exception when making the swap.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
backend/dependencies.pybackend/main.pybackend/middleware/auth.pybackend/services/observability.pybackend/services/performance_metrics.pybackend/services/playground_limiter.pybackend/services/repo_validator.pybackend/services/sentry.pybackend/services/user_limits.py
💤 Files with no reviewable changes (2)
- backend/services/sentry.py
- backend/services/performance_metrics.py
…ion cost capture_exception() was calling logger.error() internally, but every caller (50 call sites) also calls logger.error() before/after it. Result: every error printed to stdout twice, doubling log volume for errors in production log aggregators. Now capture_exception() only sends to Sentry. Callers handle their own logging. Fallback logger.error() only fires when Sentry SDK is not installed (so errors aren't silently lost in local dev). 284 tests pass.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
backend/services/observability.py (4)
87-93:⚠️ Potential issue | 🟠 MajorShared mutable
_contextwill bleed across concurrent requests.
loggeris a module-level singleton, so_contextis shared across all concurrent requests in FastAPI. If one request callslogger.set_context(user_id="alice"), every other concurrent request's log output will includeuser_id="alice"until it's overwritten or cleared. This is a data-leak / incorrect-attribution bug in any concurrent setting.Consider using
contextvars.ContextVarto make the context request-scoped:🛡️ Sketch using contextvars
+import contextvars + +_log_context: contextvars.ContextVar[Dict[str, Any]] = contextvars.ContextVar('_log_context', default={}) + class StructuredLogger: ... - def __init__(self, name: str = "codeintel"): + def __init__(self, name: str = "codeintel") -> None: self.name = name self.level = getattr(logging, LOG_LEVEL.upper(), logging.INFO) - self._context: Dict[str, Any] = {} def _format_message(self, level: str, message: str, **kwargs) -> str: data = { "timestamp": datetime.now(timezone.utc).isoformat(), "level": level, "service": self.name, "message": message, - **self._context, + **_log_context.get(), **kwargs } ... - def set_context(self, **kwargs): + def set_context(self, **kwargs: Any) -> None: - self._context.update(kwargs) + ctx = _log_context.get().copy() + ctx.update(kwargs) + _log_context.set(ctx) - def clear_context(self): + def clear_context(self) -> None: - self._context = {} + _log_context.set({})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 87 - 93, The module-level mutable _context on the logger singleton (used by set_context and clear_context) can leak data across concurrent requests; replace the shared dict with a contextvars.ContextVar (e.g., contextvar named _context_var) so context is request-scoped: update set_context to set/merge values on the ContextVar and clear_context to reset or set an empty mapping; ensure any code reading _context uses the ContextVar's current value instead of the old _context dict.
117-183: 🛠️ Refactor suggestion | 🟠 MajorAdd return type hints to all Sentry helper functions.
set_operation_context,add_breadcrumb,capture_exception, andcapture_messageare all missing-> Nonereturn annotations.🧩 Suggested typing updates
-def set_operation_context(operation: str, **kwargs): +def set_operation_context(operation: str, **kwargs: Any) -> None: -def add_breadcrumb(message: str, category: str = "custom", level: str = "info", **data): +def add_breadcrumb(message: str, category: str = "custom", level: str = "info", **data: Any) -> None: -def capture_exception(error: Exception, **context): +def capture_exception(error: Exception, **context: Any) -> None: -def capture_message(message: str, level: str = "info", **context): +def capture_message(message: str, level: str = "info", **context: Any) -> None:As per coding guidelines, "backend/**/*.py: Type hints on all function signatures in Python backend".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 117 - 183, The four Sentry helper functions are missing explicit return type annotations; update the signatures of set_operation_context, add_breadcrumb, capture_exception, and capture_message to include "-> None" (e.g., def set_operation_context(... ) -> None:) so they follow the backend typing guideline; keep existing logic and exception handling unchanged and run type checks to verify no other signatures were missed.
54-54:⚠️ Potential issue | 🟡 MinorInconsistent and deprecated datetime usage.
Line 54 uses
datetime.utcnow()(deprecated since Python 3.12), while lines 350 and 358 usedatetime.now()(local time, no timezone). This produces timestamps in different timezones within the same module, and both are naive (no tzinfo).Standardize on
datetime.now(timezone.utc)throughout:-from datetime import datetime +from datetime import datetime, timezoneThen replace all three call sites:
- "timestamp": datetime.utcnow().isoformat(), + "timestamp": datetime.now(timezone.utc).isoformat(),- "timestamp": datetime.now().isoformat(), + "timestamp": datetime.now(timezone.utc).isoformat(),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` at line 54, Replace all naive/deprecated datetime calls in backend/services/observability.py with timezone-aware UTC timestamps: change occurrences of datetime.utcnow() and datetime.now() to datetime.now(timezone.utc), and add/update the import to include timezone (from datetime import datetime, timezone). Update every call site that constructs the "timestamp" field and any other datetime.now()/utcnow() uses in this module so they consistently produce timezone-aware UTC times.
46-108: 🛠️ Refactor suggestion | 🟠 MajorAdd return type hints to all
StructuredLoggermethods.Several methods (
__init__,_log,set_context,clear_context,debug,info,warning,error,critical) are missing return type annotations.🧩 Suggested typing updates
- def __init__(self, name: str = "codeintel"): + def __init__(self, name: str = "codeintel") -> None: ... - def _log(self, level: str, level_num: int, message: str, **kwargs): + def _log(self, level: str, level_num: int, message: str, **kwargs: Any) -> None: ... - def set_context(self, **kwargs): + def set_context(self, **kwargs: Any) -> None: ... - def clear_context(self): + def clear_context(self) -> None: ... - def debug(self, message: str, **kwargs): + def debug(self, message: str, **kwargs: Any) -> None: ... - def info(self, message: str, **kwargs): + def info(self, message: str, **kwargs: Any) -> None: ... - def warning(self, message: str, **kwargs): + def warning(self, message: str, **kwargs: Any) -> None: ... - def error(self, message: str, **kwargs): + def error(self, message: str, **kwargs: Any) -> None: ... - def critical(self, message: str, **kwargs): + def critical(self, message: str, **kwargs: Any) -> None: ...As per coding guidelines, "backend/**/*.py: Type hints on all function signatures in Python backend".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 46 - 108, The listed StructuredLogger methods lack return type annotations; update the signatures for __init__, _log, set_context, clear_context, debug, info, warning, error, and critical to include explicit return types (use -> None for these methods) so every function signature in the class has type hints (e.g., def __init__(self, name: str = "codeintel") -> None:, def _log(self, level: str, level_num: int, message: str, **kwargs) -> None:, and similarly add -> None to set_context, clear_context, debug, info, warning, error, critical).
♻️ Duplicate comments (3)
backend/services/observability.py (3)
525-526: Singleton pattern formetricswas flagged in a prior review.The bare
metrics = Metrics()instantiation (instead of a lazyget_metrics()accessor perdependency_analyzer.pyconventions) was already raised. Noting as a duplicate.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 525 - 526, Replace the eager singleton instantiation "metrics = Metrics()" with a lazy accessor following the project's convention: remove the global instantiation and add or use a function named get_metrics() that creates and returns a Metrics instance on first call and reuses it thereafter; update any code that imports or references the global "metrics" symbol to call get_metrics() instead so the Metrics object is created lazily per dependency_analyzer.py conventions.
440-479: PII and type-hint issues on Sentry init and helpers were flagged in a prior review.The hardcoded
send_default_pii=True/include_local_variables=True(line 460, 468) and missing type annotations on_filter_events,set_user_context, andcapture_http_exceptionwere already raised. Noting as duplicates.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 440 - 479, init_sentry currently hardcodes send_default_pii=True and include_local_variables=True and helper functions lack type hints; change both PII-related flags to be configurable via environment variables (e.g., SENTRY_SEND_PII, SENTRY_INCLUDE_LOCAL_VARS) with safe defaults (False) and use those env-derived booleans in sentry_sdk.init, and add proper type annotations to the helper functions _filter_events, set_user_context, and capture_http_exception (use correct Sentry SDK types / typing.Optional, dict, Any where appropriate) so static typing and privacy controls are explicit; reference symbols to change: init_sentry, _filter_events, set_user_context, capture_http_exception.
300-326: Return type hints onMetricsmethods are still missing (from prior review).This was flagged in a previous review and appears unaddressed. Noting as a duplicate — please see the earlier comment for the full set of methods needing
-> None/-> Dict[str, Any]annotations.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 300 - 326, Add missing return type annotations on the Metrics class methods: annotate __init__ as -> None and add explicit return types for public methods mentioned in the docstring such as increment, timing, record_search, record_indexing (use -> None for mutators) and get_metrics, get_stats (use -> Dict[str, Any] or the appropriate mapping type); update the method signatures in class Metrics to include these type hints so linters and prior review requirements are satisfied.
🧹 Nitpick comments (4)
backend/services/observability.py (4)
331-337: Considerdeque(maxlen=1000)for timings instead of manual list trimming.The timing method manually slices the list when it exceeds 1000 entries (O(n) copy). Since you already use
deque(maxlen=...)for_indexing_timesand_search_times, apply the same pattern here for consistency and O(1) bounded appends.- self._timings: Dict[str, list] = {} + self._timings: Dict[str, deque] = {} ... def timing(self, name: str, value_ms: float): if name not in self._timings: - self._timings[name] = [] + self._timings[name] = deque(maxlen=1000) self._timings[name].append(value_ms) - if len(self._timings[name]) > 1000: - self._timings[name] = self._timings[name][-1000:]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 331 - 337, Replace the manual list management in timing(...) with a bounded deque to avoid O(n) slicing: initialize self._timings entries as collections.deque(maxlen=1000) (similar to _indexing_times and _search_times) and in timing(self, name: str, value_ms: float) simply append to that deque; update any code that assumes list behavior to handle deque if necessary. Ensure imports include collections.deque and that the symbol names _timings and timing are used exactly as in the diff.
243-296: Reduce duplication betweenasync_wrapperandsync_wrapper.These two wrappers share identical context-extraction, timing, logging, and exception-handling logic (~20 lines each). Extract the shared pre/post logic into a helper to keep them DRY.
♻️ Sketch
def _extract_trace_context(kwargs: dict) -> dict: return {k: v for k, v in kwargs.items() if k in ('repo_id', 'user_id', 'query', 'file_path')} def _on_success(operation: str, start: float, context: dict) -> None: duration = time.perf_counter() - start logger.info(f"{operation} completed successfully", duration_s=round(duration, 2), **context) def _on_error(operation: str, start: float, context: dict, e: Exception) -> None: duration = time.perf_counter() - start capture_exception(e, operation=operation, duration_s=round(duration, 2), **context)Then both wrappers become thin shells calling these helpers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 243 - 296, The async_wrapper and sync_wrapper in the decorator share identical context extraction, timing, logging and error-handling logic; factor that shared logic into small helpers (e.g., _extract_trace_context(kwargs: dict) -> dict, _on_success(operation, start, context), _on_error(operation, start, context, e)) and call them from both async_wrapper and sync_wrapper inside the decorator function. Keep existing calls to set_operation_context and add_breadcrumb but use _extract_trace_context to build context, call set_operation_context/add_breadcrumb before invoking func, record start with time.perf_counter(), then on success call _on_success(operation, start, context) and return result, and on exception call _on_error(operation, start, context, e) then re-raise; ensure function identity (wraps) and asyncio.iscoroutinefunction(func) behavior remain unchanged.
366-407: Duplicated average-speed computation and always-"healthy"status.Two issues in
get_metrics():
Duplicated calculation — The average indexing speed is computed on lines 379-380, then recomputed identically on lines 402-403. Extract it into a local variable.
"health": "healthy"is unconditional — This always returns"healthy"regardless of actual system state, which is misleading for a dashboard endpoint. Either derive it from real conditions or remove it.♻️ Proposed fix for the duplication
def get_metrics(self) -> Dict: indexing_speeds = [m["speed"] for m in self._indexing_times] search_durations = [m["duration"] for m in self._search_times] + avg_indexing_speed = ( + sum(indexing_speeds) / len(indexing_speeds) + if indexing_speeds else 0 + ) cache_hit_rate = ( ... ) return { "indexing": { "total_operations": len(self._indexing_times), - "avg_speed_functions_per_sec": ( - sum(indexing_speeds) / len(indexing_speeds) - if indexing_speeds else 0 - ), + "avg_speed_functions_per_sec": avg_indexing_speed, ... }, ... "summary": { - "health": "healthy", "cache_working": cache_hit_rate > 0, - "indexing_performance": ( - "good" if ( - sum(indexing_speeds) / len(indexing_speeds) - if indexing_speeds else 0 - ) > 10 else "needs_improvement" - ), + "indexing_performance": ( + "good" if avg_indexing_speed > 10 + else "needs_improvement" + ), }, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 366 - 407, The get_metrics() function duplicates the average indexing-speed computation and returns a hard-coded "healthy" status; fix by computing avg_indexing_speed once from indexing_speeds (e.g., avg_indexing_speed = sum(indexing_speeds)/len(indexing_speeds) if indexing_speeds else 0) and reuse that variable in the "indexing" and "summary.indexing_performance" entries, and replace the unconditional "health": "healthy" with a derived value (for example use "healthy" only if avg_indexing_speed exceeds a threshold and cache_hit_rate is above 0, otherwise "degraded") so the dashboard reflects real conditions; update references to self._indexing_times, indexing_speeds, cache_hit_rate and avg_indexing_speed inside get_metrics().
488-498: Merge the twoif exception_values:blocks.Lines 489 and 495 perform the same guard. Combine them into a single block for clarity.
♻️ Suggested merge
exception_values = event.get("exception", {}).get("values", []) if exception_values: exception_value = str(exception_values[0].get("value", "")) bot_paths = ["/wp-admin", "/wp-login", "/.env", "/config", "/admin", "/phpmyadmin", "/.git"] if any(path in exception_value for path in bot_paths): return None - if exception_values: exception_type = exception_values[0].get("type", "") if exception_type in ("RequestValidationError", "ValidationError"): return None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 488 - 498, The two identical guards using exception_values should be merged into a single if block: replace the two separate "if exception_values:" blocks with one that first computes exception_value and checks bot_paths (bot_paths, exception_value) and then computes exception_type to check ("RequestValidationError","ValidationError"), returning None in either case; keep the same variables exception_values, exception_value, bot_paths, and exception_type and preserve the early returns so behavior is unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/services/observability.py`:
- Around line 353-364: record_search currently duplicates cache-hit/miss
accounting that cache.get_search_results() already updates via _counters; remove
the duplicate fields by stopping updates to _cache_hits/_cache_misses in
record_search and change get_metrics (and any code exposing dashboard values) to
return the canonical counts from _counters["cache_hits"] and
_counters["cache_misses"] instead; update references to
_cache_hits/_cache_misses (in record_search and get_metrics) and ensure only the
existing metrics.increment("cache_hits"/"cache_misses") calls in
get_search_results remain responsible for those counters.
- Around line 163-168: Replace deprecated sentry_sdk.push_scope() usages with
sentry_sdk.new_scope() to be compatible with sentry-sdk 2.x: locate each
occurrence of "sentry_sdk.push_scope()" (three places in observability.py,
including the block that sets scope extras and captures exceptions) and change
the context manager to use "sentry_sdk.new_scope()" (i.e., with
sentry_sdk.new_scope() as scope:), keeping the existing scope.set_extra(...) and
sentry_sdk.capture_exception(...) logic intact.
---
Outside diff comments:
In `@backend/services/observability.py`:
- Around line 87-93: The module-level mutable _context on the logger singleton
(used by set_context and clear_context) can leak data across concurrent
requests; replace the shared dict with a contextvars.ContextVar (e.g.,
contextvar named _context_var) so context is request-scoped: update set_context
to set/merge values on the ContextVar and clear_context to reset or set an empty
mapping; ensure any code reading _context uses the ContextVar's current value
instead of the old _context dict.
- Around line 117-183: The four Sentry helper functions are missing explicit
return type annotations; update the signatures of set_operation_context,
add_breadcrumb, capture_exception, and capture_message to include "-> None"
(e.g., def set_operation_context(... ) -> None:) so they follow the backend
typing guideline; keep existing logic and exception handling unchanged and run
type checks to verify no other signatures were missed.
- Line 54: Replace all naive/deprecated datetime calls in
backend/services/observability.py with timezone-aware UTC timestamps: change
occurrences of datetime.utcnow() and datetime.now() to
datetime.now(timezone.utc), and add/update the import to include timezone (from
datetime import datetime, timezone). Update every call site that constructs the
"timestamp" field and any other datetime.now()/utcnow() uses in this module so
they consistently produce timezone-aware UTC times.
- Around line 46-108: The listed StructuredLogger methods lack return type
annotations; update the signatures for __init__, _log, set_context,
clear_context, debug, info, warning, error, and critical to include explicit
return types (use -> None for these methods) so every function signature in the
class has type hints (e.g., def __init__(self, name: str = "codeintel") ->
None:, def _log(self, level: str, level_num: int, message: str, **kwargs) ->
None:, and similarly add -> None to set_context, clear_context, debug, info,
warning, error, critical).
---
Duplicate comments:
In `@backend/services/observability.py`:
- Around line 525-526: Replace the eager singleton instantiation "metrics =
Metrics()" with a lazy accessor following the project's convention: remove the
global instantiation and add or use a function named get_metrics() that creates
and returns a Metrics instance on first call and reuses it thereafter; update
any code that imports or references the global "metrics" symbol to call
get_metrics() instead so the Metrics object is created lazily per
dependency_analyzer.py conventions.
- Around line 440-479: init_sentry currently hardcodes send_default_pii=True and
include_local_variables=True and helper functions lack type hints; change both
PII-related flags to be configurable via environment variables (e.g.,
SENTRY_SEND_PII, SENTRY_INCLUDE_LOCAL_VARS) with safe defaults (False) and use
those env-derived booleans in sentry_sdk.init, and add proper type annotations
to the helper functions _filter_events, set_user_context, and
capture_http_exception (use correct Sentry SDK types / typing.Optional, dict,
Any where appropriate) so static typing and privacy controls are explicit;
reference symbols to change: init_sentry, _filter_events, set_user_context,
capture_http_exception.
- Around line 300-326: Add missing return type annotations on the Metrics class
methods: annotate __init__ as -> None and add explicit return types for public
methods mentioned in the docstring such as increment, timing, record_search,
record_indexing (use -> None for mutators) and get_metrics, get_stats (use ->
Dict[str, Any] or the appropriate mapping type); update the method signatures in
class Metrics to include these type hints so linters and prior review
requirements are satisfied.
---
Nitpick comments:
In `@backend/services/observability.py`:
- Around line 331-337: Replace the manual list management in timing(...) with a
bounded deque to avoid O(n) slicing: initialize self._timings entries as
collections.deque(maxlen=1000) (similar to _indexing_times and _search_times)
and in timing(self, name: str, value_ms: float) simply append to that deque;
update any code that assumes list behavior to handle deque if necessary. Ensure
imports include collections.deque and that the symbol names _timings and timing
are used exactly as in the diff.
- Around line 243-296: The async_wrapper and sync_wrapper in the decorator share
identical context extraction, timing, logging and error-handling logic; factor
that shared logic into small helpers (e.g., _extract_trace_context(kwargs: dict)
-> dict, _on_success(operation, start, context), _on_error(operation, start,
context, e)) and call them from both async_wrapper and sync_wrapper inside the
decorator function. Keep existing calls to set_operation_context and
add_breadcrumb but use _extract_trace_context to build context, call
set_operation_context/add_breadcrumb before invoking func, record start with
time.perf_counter(), then on success call _on_success(operation, start, context)
and return result, and on exception call _on_error(operation, start, context, e)
then re-raise; ensure function identity (wraps) and
asyncio.iscoroutinefunction(func) behavior remain unchanged.
- Around line 366-407: The get_metrics() function duplicates the average
indexing-speed computation and returns a hard-coded "healthy" status; fix by
computing avg_indexing_speed once from indexing_speeds (e.g., avg_indexing_speed
= sum(indexing_speeds)/len(indexing_speeds) if indexing_speeds else 0) and reuse
that variable in the "indexing" and "summary.indexing_performance" entries, and
replace the unconditional "health": "healthy" with a derived value (for example
use "healthy" only if avg_indexing_speed exceeds a threshold and cache_hit_rate
is above 0, otherwise "degraded") so the dashboard reflects real conditions;
update references to self._indexing_times, indexing_speeds, cache_hit_rate and
avg_indexing_speed inside get_metrics().
- Around line 488-498: The two identical guards using exception_values should be
merged into a single if block: replace the two separate "if exception_values:"
blocks with one that first computes exception_value and checks bot_paths
(bot_paths, exception_value) and then computes exception_type to check
("RequestValidationError","ValidationError"), returning None in either case;
keep the same variables exception_values, exception_value, bot_paths, and
exception_type and preserve the early returns so behavior is unchanged.
1. Added return type annotations to all Metrics methods and Sentry helper functions (CLAUDE.md: type hints on all Python functions). 2. send_default_pii and include_local_variables now driven by env vars (SENTRY_SEND_PII, SENTRY_INCLUDE_LOCAL_VARS), defaulting to false. Deployers must explicitly opt in to send user emails to Sentry. Skipped: lazy singleton for metrics (same pattern as logger, pure in-memory, no side effects), async capture_http_exception (Sentry SDK already enqueues to background thread, non-blocking by design). 284 tests pass.
1. Double cache counting: record_search no longer tracks cache_hits/ cache_misses (cache.py already does via metrics.increment). get_metrics now reads from _counters. Eliminates double counting. 2. push_scope -> new_scope: 3 occurrences updated for sentry-sdk 2.x compatibility. Test warnings dropped from 54 to 44. 3. datetime.utcnow() -> datetime.now(timezone.utc): 3 occurrences. utcnow() deprecated in Python 3.12+. 4. Type annotations: added -> None to 13 functions (StructuredLogger methods + Sentry helpers) per CLAUDE.md. 5. deque for _timings: O(1) append instead of O(n) slice for cap. 6. get_metrics: avg_indexing_speed computed once, reused in summary. 7. _filter_events: merged duplicate if blocks into one. Skipped (3rd+ time): lazy singleton, contextvars, async/sync wrapper factoring. 284 tests pass. 10 fewer warnings.
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (3)
backend/services/observability.py (3)
518-519: Globalmetricsstill bypasses the established singleton-getter pattern.Per the prior review at lines 518–519 (still open), the
dependency_analyzer.pypattern requires aget_metrics()lazy accessor rather than direct module-level instantiation.As per coding guidelines: "All new services should follow the singleton pattern used in
dependency_analyzer.py".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 518 - 519, Replace the module-level instantiation "metrics = Metrics()" with a lazy singleton accessor like get_metrics() used in dependency_analyzer.py: remove the global Metrics() creation, add a module-scoped private variable (e.g., _metrics = None) and implement get_metrics() that constructs Metrics() on first call and returns the same instance thereafter; update any callers that reference the module-level metrics to call get_metrics() instead to preserve the established singleton-getter pattern.
401-416:get_statsreturn type is still rawDict.Per the prior review at lines 300–426 (still open), this should be
-> Dict[str, Any]for concrete type information.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 401 - 416, The get_stats method currently uses a raw Dict return type; update its signature from def get_stats(self) -> Dict: to def get_stats(self) -> Dict[str, Any]: and add/ensure an import for Any from typing at the top of the file; keep the implementation unchanged but confirm the returned structure (using self._counters, self._gauges, and self._timings) matches the Dict[str, Any] annotation so static checkers accept it.
360-399:⚠️ Potential issue | 🟡 Minor
-> Dictreturn type (outstanding from prior review) +total_operationsis misleadingly capped.Two issues here:
Type annotation (pre-existing flag):
get_metricsreturns rawDict; should beDict[str, Any].
total_operationscaps at 100:len(self._indexing_times)returns at mostdeque.maxlen(100), so after the 100th indexing run the dashboard permanently readstotal_operations: 100. Compare with_total_searches, which correctly counts every call. Add a parallel_total_indexing_operationscounter.🛠️ Suggested fix for total_operations
def __init__(self) -> None: ... + self._total_indexing_operations: int = 0 self._total_searches: int = 0 def record_indexing(self, repo_id: str, duration: float, function_count: int) -> None: self._indexing_times.append({...}) + self._total_indexing_operations += 1 def get_metrics(self) -> Dict[str, Any]: ... return { "indexing": { - "total_operations": len(self._indexing_times), + "total_operations": self._total_indexing_operations, ...Also update
reset():def reset(self) -> None: ... self._indexing_times.clear() + self._total_indexing_operations = 0 self._total_searches = 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 360 - 399, get_metrics currently types as Dict and reports total_operations using len(self._indexing_times) which is capped by the deque maxlen; change the signature to return Dict[str, Any], add a persistent counter self._total_indexing_operations that is incremented wherever indexing entries are appended to self._indexing_times (the same place _total_searches is incremented for searches), use that counter for "total_operations" in get_metrics instead of len(self._indexing_times>, and update reset() to set self._total_indexing_operations back to 0 and include it in the metrics output; references: get_metrics, _indexing_times, _total_searches, reset, and new _total_indexing_operations.
🧹 Nitpick comments (1)
backend/services/observability.py (1)
430-473:init_sentryuses bareprint()while the module'sloggeris already available.The module-level
loggeris initialized at line 112, well beforeinit_sentryis defined, so there's no bootstrap concern. Usinglogger.info/logger.warninghere would keep observability output consistent and honourLOG_LEVEL.♻️ Suggested replacement
- print("[INFO] Sentry DSN not configured - error tracking disabled") + logger.info("Sentry DSN not configured - error tracking disabled") return False ... - print(f"[OK] Sentry initialized (environment: {environment})") + logger.info("Sentry initialized", environment=environment) return True except ImportError: - print("[WARN] sentry-sdk not installed - error tracking disabled") + logger.warning("sentry-sdk not installed - error tracking disabled") return False except Exception as e: - print(f"[WARN] Failed to initialize Sentry: {e}") + logger.warning("Failed to initialize Sentry", error=str(e)) return False🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 430 - 473, Replace the bare print() calls in init_sentry with the module-level logger methods to keep output consistent with LOG_LEVEL: use logger.info for informational messages (e.g., the "[INFO] Sentry DSN not configured" and "[OK] Sentry initialized" messages) and logger.warning for warning cases (e.g., when sentry-sdk is missing or initialization fails), and include the exception details in the failure warning (init_sentry references logger and uses _filter_events, FastApiIntegration, StarletteIntegration - update those print(...) occurrences to logger.info(...) / logger.warning(...), preserving the same message text and any interpolated variables).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/services/observability.py`:
- Line 318: The _timings attribute is annotated as Dict[str, list] but timing()
stores collections.deque objects; update the type to reflect deque entries
(e.g., Dict[str, Deque[float]]), import Deque from typing (and ensure deque is
imported from collections) and update any related annotations/usages in the
class (notably the _timings declaration and any references in timing()) so the
type matches the actual deque(maxlen=1000) values being stored.
- Around line 496-502: The function set_user_context currently always calls
sentry_sdk.set_user with email, which leaks PII; change it to only include the
email when the Sentry PII opt-in is enabled (e.g., read settings.SENTRY_SEND_PII
or os.getenv("SENTRY_SEND_PII") and check truthiness), otherwise call
sentry_sdk.set_user with only the user id (or clear the email field). Update
set_user_context to consult that flag before adding "email" to the dict and keep
the ImportError try/except; ensure callers (e.g., middleware/auth.py that calls
set_user_context) do not bypass this check.
- Around line 446-447: Add the two missing optional environment variables to the
startup checks by updating the OPTIONAL_VARS collection in startup_checks.py to
include "SENTRY_SEND_PII" and "SENTRY_INCLUDE_LOCAL_VARS" entries following the
same structure/format used by existing optional vars (name string and any
descriptive comment or default guidance), so the observability.py usage of
SENTRY_SEND_PII and SENTRY_INCLUDE_LOCAL_VARS is registered by the startup
checks.
---
Duplicate comments:
In `@backend/services/observability.py`:
- Around line 518-519: Replace the module-level instantiation "metrics =
Metrics()" with a lazy singleton accessor like get_metrics() used in
dependency_analyzer.py: remove the global Metrics() creation, add a
module-scoped private variable (e.g., _metrics = None) and implement
get_metrics() that constructs Metrics() on first call and returns the same
instance thereafter; update any callers that reference the module-level metrics
to call get_metrics() instead to preserve the established singleton-getter
pattern.
- Around line 401-416: The get_stats method currently uses a raw Dict return
type; update its signature from def get_stats(self) -> Dict: to def
get_stats(self) -> Dict[str, Any]: and add/ensure an import for Any from typing
at the top of the file; keep the implementation unchanged but confirm the
returned structure (using self._counters, self._gauges, and self._timings)
matches the Dict[str, Any] annotation so static checkers accept it.
- Around line 360-399: get_metrics currently types as Dict and reports
total_operations using len(self._indexing_times) which is capped by the deque
maxlen; change the signature to return Dict[str, Any], add a persistent counter
self._total_indexing_operations that is incremented wherever indexing entries
are appended to self._indexing_times (the same place _total_searches is
incremented for searches), use that counter for "total_operations" in
get_metrics instead of len(self._indexing_times>, and update reset() to set
self._total_indexing_operations back to 0 and include it in the metrics output;
references: get_metrics, _indexing_times, _total_searches, reset, and new
_total_indexing_operations.
---
Nitpick comments:
In `@backend/services/observability.py`:
- Around line 430-473: Replace the bare print() calls in init_sentry with the
module-level logger methods to keep output consistent with LOG_LEVEL: use
logger.info for informational messages (e.g., the "[INFO] Sentry DSN not
configured" and "[OK] Sentry initialized" messages) and logger.warning for
warning cases (e.g., when sentry-sdk is missing or initialization fails), and
include the exception details in the failure warning (init_sentry references
logger and uses _filter_events, FastApiIntegration, StarletteIntegration -
update those print(...) occurrences to logger.info(...) / logger.warning(...),
preserving the same message text and any interpolated variables).
1. _timings type: Dict[str, list] -> Dict[str, Deque[float]] to match actual deque storage. Added Deque to typing imports. 2. set_user_context: email now only sent to Sentry when SENTRY_SEND_PII is explicitly enabled. Was bypassing the PII flag we added. 3. startup_checks: added SENTRY_SEND_PII and SENTRY_INCLUDE_LOCAL_VARS to OPTIONAL_VARS so missing config gets logged at startup. 4. get_metrics/get_stats: return type -> Dict[str, Any] for precision. 5. _total_indexing_ops: persistent counter replaces len(deque) which was capped at 100. Dashboard now shows true total. Skipped: lazy singleton (4th time), print->logger in init_sentry (runs before logger is ready at bootstrap). 284 tests pass.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
backend/services/observability.py (3)
188-189:⚠️ Potential issue | 🟠 MajorMissing return type on
track_timeviolates the type-hints guideline.As a
@contextmanager, the correct annotation is-> Generator[None, None, None](requiresGeneratorin thetypingimport).**tagsshould also be**tags: Any.🛠️ Proposed fix
-from typing import Optional, Any, Dict, Deque +from typing import Optional, Any, Dict, Deque, Generator `@contextmanager` -def track_time(operation: str, **tags): +def track_time(operation: str, **tags: Any) -> Generator[None, None, None]:As per coding guidelines, "backend/**/*.py: Type hints on all function signatures in Python backend".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 188 - 189, Add missing type hints to the track_time context manager: annotate its signature as def track_time(operation: str, **tags: Any) -> Generator[None, None, None] and ensure Generator and Any are imported from typing; update the typing import list in the module accordingly.
234-297:⚠️ Potential issue | 🟠 Major
trace_operationmissing return type;asyncioimport belongs at module top.The function signature
def trace_operation(operation: str):has no return type annotation. It should be-> Callable[..., Any](requires addingCallableto thetypingimport). Theimport asyncioat line 292 is inside thedecoratorclosure — it runs on every decorated function — and should be at the module top level per PEP 8.🛠️ Proposed fix
+import asyncio import os import sys ... -from typing import Optional, Any, Dict, Deque, Generator +from typing import Optional, Any, Dict, Deque, Generator, Callable-def trace_operation(operation: str): +def trace_operation(operation: str) -> Callable[..., Any]: def decorator(func): ... - import asyncio if asyncio.iscoroutinefunction(func):As per coding guidelines, "backend/**/*.py: Type hints on all function signatures in Python backend".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 234 - 297, The trace_operation decorator is missing a return type and imports asyncio inside the decorator closure; update the signature of trace_operation to include a return type annotation (-> Callable[..., Any]) and add Callable and Any to the module typing imports, and move the import asyncio to the module top-level; ensure existing inner symbols (decorator, async_wrapper, sync_wrapper) remain unchanged and the final return uses asyncio.iscoroutinefunction(func) after the module-level import.
201-232:⚠️ Potential issue | 🟡 MinorThree related issues in the
track_timespan lifecycle.
Deprecated
description=parameter (line 205):descriptionwas deprecated in sentry-sdk 2.15.0; from 2.15.0 on, onlynameshould be used, anddescriptionwill be removed in 3.0.0.Manual
span.finish()doesn't mark the span as failed on exception (lines 222-224): The context manager form ofstart_spancorrectly handles exceptions, marking the span as failed if an exception occurs. The current manualspan.finish()infinallyalways finishes with a success status regardless of whether an exception was raised.Misleading
"completed"log on the exception path (line 220): Thefinallyblock unconditionally logs"{operation} completed"even when the caller's code raised an exception.🛠️ Proposed fix — capture exception state in finally
+ _exc: Optional[BaseException] = None try: yield + except BaseException as _e: + _exc = _e + raise finally: duration = time.perf_counter() - start duration_ms = round(duration * 1000, 2) - # Log completion - logger.debug(f"{operation} completed", duration_ms=duration_ms, **tags) + logger.debug( + f"{operation} {'completed' if _exc is None else 'failed'}", + duration_ms=duration_ms, + **tags, + ) - # Finish Sentry span if span: + if _exc is not None: + span.set_status("internal_error") span.finish()And rename the deprecated kwarg:
- span = sentry_sdk.start_span(op=operation, description=operation) + span = sentry_sdk.start_span(op=operation, name=operation)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 201 - 232, In track_time, replace the deprecated description kwarg on sentry_sdk.start_span with name=operation and switch to the context-manager form (e.g., "with sentry_sdk.start_span(name=operation) as span:") so Sentry records exceptions automatically instead of calling span.finish() manually; capture the exception state in the try/except around the yielded block and, in the finally, log either logger.debug(f"{operation} completed", ...) on success or logger.error(f"{operation} failed", ...) when an exception occurred, and add corresponding add_breadcrumb calls (e.g., "Completed: {operation}" vs "Failed: {operation}") so breadcrumbs reflect the correct outcome.
♻️ Duplicate comments (1)
backend/services/observability.py (1)
524-525: Singleton pattern still not implemented formetrics.
metrics = Metrics()is a direct module-level instantiation rather than a lazy getter, which does not follow the established singleton/getter pattern used independency_analyzer.pyas required by the service coding guidelines.As per coding guidelines, "backend/services/**/*.py: All new services should follow the singleton pattern used in
dependency_analyzer.py".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 524 - 525, Replace the module-level instantiation metrics = Metrics() with a lazy singleton getter that mirrors the pattern used in dependency_analyzer.py: remove the direct Metrics() call, add a get_metrics() function that creates and caches a single Metrics instance (e.g., a module-level _metrics variable set on first call) and return that instance on subsequent calls, and update any local usages to call get_metrics() instead of referencing metrics directly; ensure the symbol names Metrics and get_metrics are used consistently to match the established pattern.
🧹 Nitpick comments (1)
backend/services/observability.py (1)
326-328:increment's**tagsparameter is silently discarded — dead API surface.
**tagsis accepted in the signature but never read or stored in the method body. Callers passing tags (e.g., for filtering or dimensions) will get no effect and no error. Either propagate tags to a_tag_countersstore or remove the parameter to avoid a misleading API.🛠️ Proposed fix (remove unused parameter)
- def increment(self, name: str, value: int = 1, **tags) -> None: + def increment(self, name: str, value: int = 1) -> None: """Increment a counter""" self._counters[name] = self._counters.get(name, 0) + value🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/observability.py` around lines 326 - 328, The increment method accepts **tags but silently drops them; either persist tags or remove the param. Fix by updating observability.increment to handle tags: add a _tag_counters (or use existing store) and key by (name, frozenset(tags.items())) or similar, incrementing self._tag_counters[(name, tag_key)] alongside self._counters, or if you want to remove the dead API surface, remove the **tags parameter from increment and update all callers; reference the increment method and _counters (and create or reuse _tag_counters) when implementing the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@backend/services/observability.py`:
- Around line 188-189: Add missing type hints to the track_time context manager:
annotate its signature as def track_time(operation: str, **tags: Any) ->
Generator[None, None, None] and ensure Generator and Any are imported from
typing; update the typing import list in the module accordingly.
- Around line 234-297: The trace_operation decorator is missing a return type
and imports asyncio inside the decorator closure; update the signature of
trace_operation to include a return type annotation (-> Callable[..., Any]) and
add Callable and Any to the module typing imports, and move the import asyncio
to the module top-level; ensure existing inner symbols (decorator,
async_wrapper, sync_wrapper) remain unchanged and the final return uses
asyncio.iscoroutinefunction(func) after the module-level import.
- Around line 201-232: In track_time, replace the deprecated description kwarg
on sentry_sdk.start_span with name=operation and switch to the context-manager
form (e.g., "with sentry_sdk.start_span(name=operation) as span:") so Sentry
records exceptions automatically instead of calling span.finish() manually;
capture the exception state in the try/except around the yielded block and, in
the finally, log either logger.debug(f"{operation} completed", ...) on success
or logger.error(f"{operation} failed", ...) when an exception occurred, and add
corresponding add_breadcrumb calls (e.g., "Completed: {operation}" vs "Failed:
{operation}") so breadcrumbs reflect the correct outcome.
---
Duplicate comments:
In `@backend/services/observability.py`:
- Around line 524-525: Replace the module-level instantiation metrics =
Metrics() with a lazy singleton getter that mirrors the pattern used in
dependency_analyzer.py: remove the direct Metrics() call, add a get_metrics()
function that creates and caches a single Metrics instance (e.g., a module-level
_metrics variable set on first call) and return that instance on subsequent
calls, and update any local usages to call get_metrics() instead of referencing
metrics directly; ensure the symbol names Metrics and get_metrics are used
consistently to match the established pattern.
---
Nitpick comments:
In `@backend/services/observability.py`:
- Around line 326-328: The increment method accepts **tags but silently drops
them; either persist tags or remove the param. Fix by updating
observability.increment to handle tags: add a _tag_counters (or use existing
store) and key by (name, frozenset(tags.items())) or similar, incrementing
self._tag_counters[(name, tag_key)] alongside self._counters, or if you want to
remove the dead API surface, remove the **tags parameter from increment and
update all callers; reference the increment method and _counters (and create or
reuse _tag_counters) when implementing the change.
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
Problem
Three separate observability systems with overlapping functions and a naming collision:
services/sentry.py(175 lines) -- Sentry init + 4 deprecated functions duplicated in observability.pyservices/performance_metrics.py(79 lines) -- PerformanceMetrics classservices/observability.py(366 lines) -- StructuredLogger, Metrics, tracingTwo different objects both named
metricsin different files. 3 services still importing deprecatedcapture_exceptionfrom sentry.py instead of observability.Fix
Single module. Single import path. Zero duplication.
What moved where
Files deleted
services/sentry.py(175 lines)services/performance_metrics.py(79 lines)Import migrations (7 files)
Result
observability.py: 532 lines (above 200 guideline but cohesive -- single hub for all observability)
Net: -92 lines, 2 files deleted, naming collision eliminated. 284 tests pass.
Closes OPE-87
Summary by CodeRabbit
Refactoring
Infrastructure
Chores