⚡ Bolt: Optimize ThreadSafeCache with O(1) LRU eviction (#467)
RohanExploit wants to merge 4 commits into `main`
Conversation
Replaces the manual O(N) LFU-ish eviction logic in ThreadSafeCache with a standard OrderedDict implementation, providing O(1) get/set/evict operations. Updates backend/routers/detection.py to use this robust cache instead of an ad-hoc dictionary, fixing potential race conditions and improving performance for heavy ML endpoints. Adds comprehensive tests for LRU behavior, TTL, and thread safety.
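The OrderedDict-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `backend/cache.py`; the class name, method signatures, and defaults are assumptions based on the description:

```python
import threading
import time
from collections import OrderedDict


class ThreadSafeCache:
    """Minimal LRU + TTL cache sketch: OrderedDict tracks recency order."""

    def __init__(self, ttl: float = 3600, max_size: int = 500):
        self._ttl = ttl
        self._max_size = max_size
        self._data: OrderedDict = OrderedDict()  # key -> (value, timestamp)
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            item = self._data.get(key)
            if item is None:
                return None
            value, ts = item
            if time.monotonic() - ts < self._ttl:
                # Mark as most recently used: O(1)
                self._data.move_to_end(key)
                return value
            # Expired entry: remove it lazily
            del self._data[key]
            return None

    def set(self, key, value):
        with self._lock:
            self._data[key] = (value, time.monotonic())
            self._data.move_to_end(key)
            if len(self._data) > self._max_size:
                # Evict the least recently used entry: O(1)
                self._data.popitem(last=False)
```

Both `move_to_end` and `popitem(last=False)` are constant-time, which is what removes the O(N) scan from the hot path.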
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
✅ Deploy Preview for fixmybharat canceled.
🙏 Thank you for your contribution, @RohanExploit!

PR Details:
Quality Checklist:
Review Process:
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.
📝 Walkthrough

Replaces ad-hoc dict caches with a ThreadSafeCache using OrderedDict-backed storage for O(1) LRU eviction and TTL handling; integrates the new cache into the detection router, adds tests for eviction/TTL/thread safety, updates httpx version constraints, and adds an import-verification script.

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
🧹 Nitpick comments (6)
backend/cache.py (3)
34-44: Remove commented-out debug prints in `get`.

These leftover debug statements add noise. If debug logging is needed, use the existing `logger` instead.

Proposed cleanup
```diff
             if current_time - timestamp < self._ttl:
                 # Move to end (MRU)
                 self._data.move_to_end(key)
-                # print(f"DEBUG: get({key}) hit. Order: {list(self._data.keys())}")
                 return value
             else:
                 # Expired entry - remove it
                 del self._data[key]
-                # print(f"DEBUG: get({key}) expired. Order: {list(self._data.keys())}")
-        # print(f"DEBUG: get({key}) miss. Order: {list(self._data.keys())}")
         return None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cache.py` around lines 34 - 44, Remove the commented-out debug print statements inside the get method: delete the three lines with print(f"DEBUG: get({key}) ...") so the method no longer contains leftover debug prints; if runtime debugging is ever needed, replace those prints with logger.debug calls (use the existing logger and include key and current order via list(self._data.keys()) as needed). Ensure changes are made in the get function that checks current_time - timestamp < self._ttl and the branches that move_to_end, delete expired entries, and the miss path.
87-103: `get_stats` performs an O(N) scan for expired entries.

This is acceptable for a monitoring/stats endpoint that isn't on the hot path, but worth noting that under high cache sizes (e.g., `max_size=1000` for `user_upload_cache` on line 125), frequent calls could add up. Consider whether the expired-entry count is necessary, or if it could be tracked incrementally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cache.py` around lines 87 - 103, get_stats currently does an O(N) scan of self._data to count expired entries (using _data and _ttl), which can be costly at large cache sizes (e.g., user_upload_cache with _max_size=1000); change get_stats to avoid the linear scan by either (a) removing expired_entries from the returned dict, or (b) maintain an incremental counter (e.g., self._expired_count) updated inside Cache methods that mutate entries (set/evict/delete/cleanup) and return that counter in get_stats; update set, delete/evict, and any background cleanup logic to increment/decrement this counter whenever an entry transitions to/from expired so get_stats becomes O(1).
62-67: Remove unused `popped` variable (Ruff F841).

The result of `popitem` is captured but never used. Also, the commented-out debug prints throughout the class (lines 37, 42, 44, 65, 67) should be cleaned up.

Proposed fix for eviction block
```diff
         # Evict if over capacity
         if len(self._data) > self._max_size:
             # Remove first item (LRU)
-            popped = self._data.popitem(last=False)
-            # print(f"DEBUG: Evicted {popped[0]}. Order: {list(self._data.keys())}")
-
-        # print(f"DEBUG: set({key}). Order: {list(self._data.keys())}")
+            self._data.popitem(last=False)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/cache.py` around lines 62 - 67, The eviction code stores the result of self._data.popitem(last=False) in an unused variable `popped` which triggers Ruff F841; replace that assignment with a discarded result (e.g., call popitem(...) without assigning) or assign to `_` and remove the commented debug print lines scattered in the class (the debug prints around the set/get logic). Update the method that manipulates self._data (the set/eviction block) to call self._data.popitem(last=False) without keeping `popped`, and delete the commented-out debug print statements so no unused variables or dead prints remain.

backend/tests/test_cache_refactor.py (1)
69-71: Ruff E701: split multiple statements on one line.

Static analysis flags lines 70–71 for placing two statements on a single colon-delimited line.
Proposed fix
```diff
-    threads = [threading.Thread(target=worker) for _ in range(10)]
-    for t in threads: t.start()
-    for t in threads: t.join()
+    threads = [threading.Thread(target=worker) for _ in range(10)]
+    for t in threads:
+        t.start()
+    for t in threads:
+        t.join()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_cache_refactor.py` around lines 69 - 71, The test currently places multiple statements on one line (threads start/join loops), violating Ruff E701; change the compact for-loop lines into standard multi-line loops so each statement is on its own line: keep the threads = [threading.Thread(target=worker) for _ in range(10)] comprehension, then replace the single-line "for t in threads: t.start()" and "for t in threads: t.join()" with two separate block-style loops that call t.start() and t.join() on their own lines to satisfy Ruff and improve readability.

backend/routers/detection.py (2)
49-50: Move the import to the top of the file with other imports.

The `ThreadSafeCache` import is placed mid-file (after the module-level `router` declaration). It should be grouped with the other `backend.*` imports at the top for consistency.

Proposed fix
Move line 50 up to the import block (e.g., after line 41):

```python
from backend.cache import ThreadSafeCache
```
Verify each finding against the current code and only fix it if needed. In `@backend/routers/detection.py` around lines 49 - 50, The import for ThreadSafeCache is placed mid-file after the module-level router declaration; move the import statement for ThreadSafeCache into the top import block alongside the other backend.* imports so all imports are grouped, ensuring ThreadSafeCache is imported before router is created and referenced (look for the ThreadSafeCache import line and the router symbol to confirm placement).
56-69: Cache stampede: concurrent requests for the same key can trigger duplicate work.

Between the cache miss on line 58 and the `set` on line 68, `await func(...)` yields control, allowing other coroutines with the same key to also miss and execute the (potentially expensive) detection function concurrently. This isn't a correctness bug (the cache remains consistent), but under load it negates the caching benefit for hot keys.

A common mitigation is a per-key lock or "single-flight" pattern to coalesce concurrent calls.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routers/detection.py` around lines 56 - 69, The helper _get_cached_result currently allows a cache-stampede because multiple coroutines can miss the cache and concurrently call func; fix this by implementing a per-key single-flight using an asyncio-based lock map (e.g., a module-level dict _detection_locks mapping key -> asyncio.Lock or asyncio.Event) so only one coroutine executes func for a given key while others await the lock; the first waiter should check cache again after acquiring the lock, run func, set the result into _detection_cache via _detection_cache.set(key, result) (or correct arg order if needed), then release and remove the lock; ensure locks are created atomically (use dict.setdefault under a small global lock or use an asyncio.Lock to protect lock creation) and that exceptions from func still release/remove the lock so waiters don't hang.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/cache.py`:
- Around line 34-44: Remove the commented-out debug print statements inside the
get method: delete the three lines with print(f"DEBUG: get({key}) ...") so the
method no longer contains leftover debug prints; if runtime debugging is ever
needed, replace those prints with logger.debug calls (use the existing logger
and include key and current order via list(self._data.keys()) as needed). Ensure
changes are made in the get function that checks current_time - timestamp <
self._ttl and the branches that move_to_end, delete expired entries, and the
miss path.
- Around line 87-103: get_stats currently does an O(N) scan of self._data to
count expired entries (using _data and _ttl), which can be costly at large cache
sizes (e.g., user_upload_cache with _max_size=1000); change get_stats to avoid
the linear scan by either (a) removing expired_entries from the returned dict,
or (b) maintain an incremental counter (e.g., self._expired_count) updated
inside Cache methods that mutate entries (set/evict/delete/cleanup) and return
that counter in get_stats; update set, delete/evict, and any background cleanup
logic to increment/decrement this counter whenever an entry transitions to/from
expired so get_stats becomes O(1).
- Around line 62-67: The eviction code stores the result of
self._data.popitem(last=False) in an unused variable `popped` which triggers
Ruff F841; replace that assignment with a discarded result (e.g., call
popitem(...) without assigning) or assign to `_` and remove the commented debug
print lines scattered in the class (the debug prints around the set/get logic).
Update the method that manipulates self._data (the set/eviction block) to call
self._data.popitem(last=False) without keeping `popped`, and delete the
commented-out debug print statements so no unused variables or dead prints
remain.
In `@backend/routers/detection.py`:
- Around line 49-50: The import for ThreadSafeCache is placed mid-file after the
module-level router declaration; move the import statement for ThreadSafeCache
into the top import block alongside the other backend.* imports so all imports
are grouped, ensuring ThreadSafeCache is imported before router is created and
referenced (look for the ThreadSafeCache import line and the router symbol to
confirm placement).
- Around line 56-69: The helper _get_cached_result currently allows a
cache-stampede because multiple coroutines can miss the cache and concurrently
call func; fix this by implementing a per-key single-flight using an
asyncio-based lock map (e.g., a module-level dict _detection_locks mapping key
-> asyncio.Lock or asyncio.Event) so only one coroutine executes func for a
given key while others await the lock; the first waiter should check cache again
after acquiring the lock, run func, set the result into _detection_cache via
_detection_cache.set(key, result) (or correct arg order if needed), then release
and remove the lock; ensure locks are created atomically (use dict.setdefault
under a small global lock or use an asyncio.Lock to protect lock creation) and
that exceptions from func still release/remove the lock so waiters don't hang.
In `@backend/tests/test_cache_refactor.py`:
- Around line 69-71: The test currently places multiple statements on one line
(threads start/join loops), violating Ruff E701; change the compact for-loop
lines into standard multi-line loops so each statement is on its own line: keep
the threads = [threading.Thread(target=worker) for _ in range(10)]
comprehension, then replace the single-line "for t in threads: t.start()" and
"for t in threads: t.join()" with two separate block-style loops that call
t.start() and t.join() on their own lines to satisfy Ruff and improve
readability.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .jules/bolt.md
- backend/cache.py
- backend/routers/detection.py
- backend/tests/test_cache_refactor.py
Pull request overview
This PR optimizes the backend caching layer by refactoring ThreadSafeCache to use an OrderedDict-backed LRU implementation, then adopting it in the detection router to replace an ad-hoc global dict cache and adding tests to validate LRU/TTL behavior.
Changes:
- Refactor `ThreadSafeCache` to O(1) LRU eviction via `OrderedDict.move_to_end()` + `popitem(last=False)`.
- Replace `backend/routers/detection.py`'s global dict cache with `ThreadSafeCache`.
- Add pytest coverage for LRU eviction, TTL expiration, update behavior, and concurrent writes.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| backend/cache.py | Reworks the cache storage to OrderedDict and implements LRU eviction. |
| backend/routers/detection.py | Swaps a custom dict cache for the centralized ThreadSafeCache. |
| backend/tests/test_cache_refactor.py | Adds tests for LRU ordering, TTL behavior, and concurrency. |
| .jules/bolt.md | Documents the performance learning around O(N) eviction in hot paths. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
```python
# Robust Cache Implementation using ThreadSafeCache (OrderedDict + LRU)
from backend.cache import ThreadSafeCache

CACHE_TTL = 3600  # 1 hour
MAX_CACHE_SIZE = 500
_detection_cache = ThreadSafeCache(ttl=CACHE_TTL, max_size=MAX_CACHE_SIZE)
```
Importing `ThreadSafeCache` mid-module breaks the existing import grouping at the top of the file and can confuse tooling (isort/linters). Please move `from backend.cache import ThreadSafeCache` up with the other imports.
```python
def worker():
    for i in range(100):
        # set(data, key)
        cache.set(i, f"key-{i}")

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
```
`test_thread_safety_concurrent_writes` won't fail if worker threads raise exceptions, because exceptions in `threading.Thread` don't propagate to the main thread. To make this test meaningful, capture exceptions in a shared list/queue (or use `concurrent.futures.ThreadPoolExecutor`) and assert that no exceptions occurred.
Suggested change:

```diff
-def worker():
-    for i in range(100):
-        # set(data, key)
-        cache.set(i, f"key-{i}")
-
-threads = [threading.Thread(target=worker) for _ in range(10)]
-for t in threads: t.start()
-for t in threads: t.join()
+exceptions = []
+
+def worker():
+    try:
+        for i in range(100):
+            # set(data, key)
+            cache.set(i, f"key-{i}")
+    except Exception as e:
+        exceptions.append(e)
+
+threads = [threading.Thread(target=worker) for _ in range(10)]
+for t in threads:
+    t.start()
+for t in threads:
+    t.join()
+assert not exceptions, f"Exceptions in worker threads: {exceptions}"
```
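The `ThreadPoolExecutor` alternative mentioned in this comment could look roughly like this (an illustrative sketch; `cache` is assumed to expose a `set` method, and the function name is hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrent_writes(cache, n_threads=10, n_items=100):
    """Hammer the cache from many threads; futures re-raise worker exceptions."""
    def worker():
        for i in range(n_items):
            cache.set(i, f"key-{i}")

    with ThreadPoolExecutor(max_workers=n_threads) as ex:
        futures = [ex.submit(worker) for _ in range(n_threads)]
        for f in futures:
            f.result()  # re-raises here if the worker raised, failing the test
```

Unlike bare `threading.Thread`, `Future.result()` propagates worker exceptions to the calling thread, so no shared exception list is needed.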
```diff
@@ -0,0 +1,73 @@
+import time
```
Unused import: `time` is imported but not used in this test module. Please remove it to keep the test file clean (and to avoid failing strict lint configurations).
Suggested change:

```diff
-import time
```
```python
# Evict if over capacity
if len(self._data) > self._max_size:
    # Remove first item (LRU)
```
ThreadSafeCache.set() no longer cleans up expired entries before enforcing max_size. This can cause eviction of non-expired entries while expired (but recently accessed) entries remain in the cache, reducing hit ratio and making TTL semantics less predictable. Consider evicting expired entries opportunistically (e.g., pop from the LRU end while expired) before/while applying the size cap, keeping the operation O(k) for the number of expired items removed.
Suggested change:

```diff
-        # Evict if over capacity
-        if len(self._data) > self._max_size:
-            # Remove first item (LRU)
+        # Opportunistically remove expired entries from LRU side
+        while self._data:
+            oldest_key, (old_value, old_timestamp) = next(iter(self._data.items()))
+            if current_time - old_timestamp >= self._ttl:
+                # Remove expired LRU entry
+                self._data.popitem(last=False)
+            else:
+                break
+        # Evict non-expired entries if still over capacity (LRU)
+        while len(self._data) > self._max_size:
```
```diff
                 # Move to end (MRU)
                 self._data.move_to_end(key)
                 # print(f"DEBUG: get({key}) hit. Order: {list(self._data.keys())}")
                 return value
             else:
                 # Expired entry - remove it
-                self._remove_key(key)
+                del self._data[key]
                 # print(f"DEBUG: get({key}) expired. Order: {list(self._data.keys())}")

         # print(f"DEBUG: get({key}) miss. Order: {list(self._data.keys())}")
```
There are several commented-out print(...) debug statements in get()/set(). These add noise and can easily be re-enabled accidentally; prefer removing them or switching to logger.debug(...) behind the existing logger if you need trace-level introspection.
The recent release of httpx 0.28.0 removed BaseTransport which caused test failures and likely broke runtime dependencies (e.g. Starlette/FastAPI internals or GoogleTrans). This commit pins httpx<0.28.0 in both requirements.txt and requirements-render.txt to ensure stability. Also retains the ThreadSafeCache optimization for detection endpoints.
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="verify_imports.py">
<violation number="1" location="verify_imports.py:6">
P2: Using os.getcwd() makes the import path dependent on the invocation directory. If the script is run outside the repo root, backend imports will fail. Use the script’s location to add the project root instead.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Pins `httpx<0.28.0` to avoid breaking changes in `starlette`/`fastapi` (AttributeError: BaseTransport) and `googletrans`. Pins `firebase-admin<7.0.0` because `firebase-admin` 7.x strictly requires `httpx==0.28.1`, which conflicts with the `httpx<0.28.0` requirement needed for other packages. This ensures a compatible dependency set for deployment.
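A requirements fragment matching the pins described above might look like this (illustrative only; the surrounding entries in `requirements.txt` and `requirements-render.txt` are omitted):

```
httpx<0.28.0          # 0.28.0 removed BaseTransport, breaking starlette/fastapi and googletrans
firebase-admin<7.0.0  # 7.x strictly requires httpx==0.28.1, conflicting with the pin above
```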
Actionable comments posted: 1
🧹 Nitpick comments (1)
verify_imports.py (1)
8-28: Address Ruff BLE001 (broad `Exception`) explicitly.

Ruff flags catching `Exception`. If you intend to capture any import-time failure, add a justification comment to keep lint clean.

🧹 Suggested lint-friendly tweak
```diff
-except Exception as e:
+except Exception as e:  # noqa: BLE001 - capture any import-time failure for diagnostics
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@verify_imports.py` around lines 8 - 28, The try/except block in verify_imports.py is catching a broad Exception (except Exception as e) which triggers Ruff BLE001; either narrow the handler to specific import errors (e.g., catch ImportError and ModuleNotFoundError around the imports or in the except clause) or keep the broad catch but add an inline justification comment for Ruff (e.g., explain this script must catch any import-time failure for diagnostics and add a noqa/BLE001 comment) so the linter is satisfied; update the except block or its comment accordingly and keep the existing logging/traceback/sys.exit(1) behavior (references: the try block importing backend.routers.detection, googletrans, backend.routers.voice and backend.main and the except Exception as e handler).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@verify_imports.py`:
- Around line 5-6: Replace the non-deterministic sys.path.append(os.getcwd()) in
verify_imports.py with logic that resolves the repository/script directory
(e.g., using __file__ or Path(__file__).resolve().parent) and then insert that
resolved path at index 0 of sys.path (use sys.path.insert(0, ...)) so the
project root is deterministically located and local modules cannot be shadowed
by globally installed packages.
---
Nitpick comments:
In `@verify_imports.py`:
- Around line 8-28: The try/except block in verify_imports.py is catching a
broad Exception (except Exception as e) which triggers Ruff BLE001; either
narrow the handler to specific import errors (e.g., catch ImportError and
ModuleNotFoundError around the imports or in the except clause) or keep the
broad catch but add an inline justification comment for Ruff (e.g., explain this
script must catch any import-time failure for diagnostics and add a noqa/BLE001
comment) so the linter is satisfied; update the except block or its comment
accordingly and keep the existing logging/traceback/sys.exit(1) behavior
(references: the try block importing backend.routers.detection, googletrans,
backend.routers.voice and backend.main and the except Exception as e handler).
verify_imports.py
Outdated
```python
# Add project root to sys.path
sys.path.append(os.getcwd())
```
Make project-root resolution deterministic (avoid CWD & shadowing).
`os.getcwd()` assumes the script is run from the repo root, and `append` can allow globally installed modules to shadow local ones. Prefer resolving the script's directory and inserting it at index 0.
🔧 Proposed fix
```diff
-import os
+from pathlib import Path
@@
-# Add project root to sys.path
-sys.path.append(os.getcwd())
+# Add project root to sys.path
+project_root = Path(__file__).resolve().parent
+sys.path.insert(0, str(project_root))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@verify_imports.py` around lines 5 - 6, Replace the non-deterministic
sys.path.append(os.getcwd()) in verify_imports.py with logic that resolves the
repository/script directory (e.g., using __file__ or
Path(__file__).resolve().parent) and then insert that resolved path at index 0
of sys.path (use sys.path.insert(0, ...)) so the project root is
deterministically located and local modules cannot be shadowed by globally
installed packages.
🔍 Quality Reminder
The deployment failed because httpx 0.28.0 removed BaseTransport, which broke starlette and googletrans. However, pinning httpx<0.28.0 caused a conflict with firebase-admin 7.1.0, which mandates httpx==0.28.1. This commit downgrades firebase-admin to <7.0.0 to resolve the diamond dependency conflict while keeping the critical httpx<0.28.0 pin.
💡 What:
- `ThreadSafeCache` in `backend/cache.py` refactored to use `collections.OrderedDict` with `popitem(last=False)`.
- `_cache_store` in `backend/routers/detection.py` replaced with `ThreadSafeCache`.
- `backend/tests/test_cache_refactor.py` added to verify correctness.

🎯 Why:

- The previous `ThreadSafeCache` implementation used a dictionary and iterated over all keys to find the minimum access count (LFU-ish) or check expiration on every `set` operation. This is O(N) and degrades as cache size grows.
- The `detection.py` router used a global dictionary without thread locks for eviction, leading to potential race conditions (crash) during concurrent requests when the cache is full.

📊 Impact:
🔬 Measurement:
- `backend/tests/test_cache_refactor.py`, which tests LRU eviction order and TTL expiration.
- The existing behavior of `detection.py` is preserved but is now safer and faster.

PR created automatically by Jules for task 5806631139876488847 started by @RohanExploit
Summary by cubic
Optimized ThreadSafeCache with O(1) LRU eviction and adopted it in detection routes to speed up caching and eliminate race conditions. Pinned httpx<0.28.0 and firebase-admin<7.0.0 to resolve a dependency conflict and prevent crashes.
Refactors
Dependencies
Written for commit cc742f6. Summary will update on new commits.
Summary by CodeRabbit
Bug Fixes
Tests
Chores
Documentation
Tools