⚡ Bolt: Optimize issue endpoints with response caching#475
⚡ Bolt: Optimize issue endpoints with response caching#475RohanExploit wants to merge 1 commit intomainfrom
Conversation
- Implements JSON response caching for `get_recent_issues` and `get_nearby_issues` - Bypasses Pydantic validation and serialization on cache hits - Updates cache keys to `v2_` to avoid conflicts with legacy data - Consistent ISO 8601 date serialization for cache hits and misses - Verified with new mocked test `backend/tests/test_issues_cache_mocked.py`
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
✅ Deploy Preview for fixmybharat canceled.
|
🙏 Thank you for your contribution, @RohanExploit!PR Details:
Quality Checklist:
Review Process:
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken. |
📝 WalkthroughWalkthroughThis PR introduces v2 caching for recent and nearby issues endpoints by storing pre-serialized JSON strings in cache instead of Python objects, reducing serialization overhead on subsequent cache hits and ensuring consistent response typing. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR optimizes the get_recent_issues and get_nearby_issues endpoints by caching pre-serialized JSON strings instead of Python objects. This reduces CPU usage on cache hits by eliminating redundant FastAPI JSONResponse serialization, Pydantic validation, and jsonable_encoder overhead.
Changes:
- Modified
/api/issues/recentand/api/issues/nearbyendpoints to cache JSON strings and return raw Response objects - Added cache key versioning with
v2_prefix to avoid conflicts with legacy cached data - Added comprehensive test suite to verify cache miss/hit behavior with mocked database and cache
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| backend/routers/issues.py | Updated get_nearby_issues and get_recent_issues to cache serialized JSON strings and return Response objects directly |
| backend/tests/test_issues_cache_mocked.py | Added new test suite with mocked database and cache to verify caching behavior works correctly for both endpoints |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| recent_issues_cache.set(data, cache_key) | ||
| return data | ||
| # Optimize: Cache serialized JSON string to avoid serialization overhead on subsequent requests | ||
| json_content = json.dumps(data, default=str) |
There was a problem hiding this comment.
The default=str fallback in json.dumps is too broad and could mask serialization issues. Since created_at is already converted to ISO format string on line 700, the default=str won't be called for datetimes. However, for other potentially non-JSON-serializable types, this will silently convert them to strings using their str method, which may not produce valid JSON representations.
Consider being more explicit about which types need custom serialization, or verify that all data is already JSON-serializable before calling json.dumps. Pydantic's model_dump(mode='json') (used in get_nearby_issues) is a more robust approach as it handles all Pydantic types consistently.
| json_content = json.dumps(data, default=str) | |
| json_content = json.dumps(data) |
| import json | ||
| from datetime import datetime, timezone | ||
|
|
||
| from fastapi import Response |
There was a problem hiding this comment.
The Response import on line 15 can be consolidated with the other FastAPI imports on line 2 for better organization. This would make:
from fastapi import APIRouter, Depends, HTTPException, UploadFile, File, Form, Query, Request, BackgroundTasks, status, Response
| @router.get("/api/issues/nearby", response_model=List[NearbyIssueResponse]) | ||
| def get_nearby_issues( | ||
| latitude: float = Query(..., ge=-90, le=90, description="Latitude of the location"), | ||
| longitude: float = Query(..., ge=-180, le=180, description="Longitude of the location"), | ||
| radius: float = Query(50.0, ge=10, le=500, description="Search radius in meters"), | ||
| limit: int = Query(10, ge=1, le=50, description="Maximum number of results"), | ||
| db: Session = Depends(get_db) | ||
| ): | ||
| """ | ||
| Get issues near a specific location for deduplication purposes. | ||
| Returns issues within the specified radius, sorted by distance. | ||
| """ | ||
| try: | ||
| # Check cache first | ||
| cache_key = f"{latitude:.5f}_{longitude:.5f}_{radius}_{limit}" | ||
| cached_data = nearby_issues_cache.get(cache_key) | ||
| if cached_data: | ||
| return cached_data | ||
| # v2: cache JSON string to avoid conflicts with v1 list data and ensure safe typing | ||
| cache_key = f"v2_nearby_{latitude:.5f}_{longitude:.5f}_{radius}_{limit}" | ||
| cached_json = nearby_issues_cache.get(cache_key) | ||
| if cached_json: | ||
| return Response(content=cached_json, media_type="application/json") |
There was a problem hiding this comment.
When returning a raw Response object with pre-serialized JSON content, FastAPI bypasses the response_model validation and serialization. The response_model decorator on line 288 will still generate the correct OpenAPI schema, but the actual response won't be validated against it.
This is intentional for caching optimization, but it means that if the manually constructed data structures (nearby_responses list at lines 335-348) don't match the NearbyIssueResponse schema, there will be no validation error. Consider adding a comment noting that the response structure must match NearbyIssueResponse schema, or periodically validate the structure in tests.
| @router.get("/api/issues/recent", response_model=List[IssueSummaryResponse]) | ||
| def get_recent_issues( | ||
| limit: int = Query(10, ge=1, le=50, description="Number of issues to return"), | ||
| offset: int = Query(0, ge=0, description="Number of issues to skip"), | ||
| db: Session = Depends(get_db) | ||
| ): | ||
| cache_key = f"recent_issues_{limit}_{offset}" | ||
| cached_data = recent_issues_cache.get(cache_key) | ||
| if cached_data: | ||
| return JSONResponse(content=cached_data) | ||
| # v2: cache JSON string to avoid conflicts with v1 list data and ensure safe typing | ||
| cache_key = f"v2_recent_issues_{limit}_{offset}" | ||
| cached_json = recent_issues_cache.get(cache_key) | ||
| if cached_json: | ||
| return Response(content=cached_json, media_type="application/json") |
There was a problem hiding this comment.
When returning a raw Response object with pre-serialized JSON content, FastAPI bypasses the response_model validation and serialization. The response_model decorator on line 662 will still generate the correct OpenAPI schema, but the actual response won't be validated against it.
This is intentional for caching optimization, but it means that if the manually constructed data dictionaries (lines 696-707) don't match the IssueSummaryResponse schema, there will be no validation error. Consider adding a comment noting that the response structure must match IssueSummaryResponse schema, or periodically validate the structure in tests.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
backend/tests/test_issues_cache_mocked.py (1)
84-88: Strengthen cache assertions to lock in the v2 contract.These tests already capture cached payloads; also assert key prefix and serialized payload type to prevent regressions in cache format/keying.
🧪 Suggested assertions
assert mock_recent.set.called args, _ = mock_recent.set.call_args cached_content = args[0] + cache_key = args[1] + assert isinstance(cached_content, str) + assert cache_key.startswith("v2_recent_issues_")assert mock_nearby.set.called args, _ = mock_nearby.set.call_args cached_content = args[0] + cache_key = args[1] + assert isinstance(cached_content, str) + assert cache_key.startswith("v2_nearby_")Also applies to: 128-132
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_issues_cache_mocked.py` around lines 84 - 88, The test captures the mocked cache set but doesn't assert the cache key format or payload serialization; update the assertions around mock_recent.set.call_args (used in tests in backend/tests/test_issues_cache_mocked.py) to unpack both key and value (e.g., key, payload = mock_recent.set.call_args[0]), assert the key uses the expected prefix (e.g., key.startswith("recent:") or the project's configured prefix), and assert the payload is the serialized type expected by v2 (e.g., isinstance(payload, str) or bytes) and that it deserializes to the expected structure (e.g., json.loads(payload) yields a dict/list). Apply the same stronger assertions for the other occurrence around lines 128-132.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/routers/issues.py`:
- Around line 351-355: The nearby cache is being written as JSON
(nearby_issues_cache.set) but cache invalidation is only partial
(recent_issues_cache.clear is called in one path), leaving stale data after
mutations; update the mutating handlers (upvote_issue, verify_issue_endpoint,
update_issue_status and any other commit paths that change issues) to call the
shared _invalidate_issue_caches() immediately after the successful DB
commit/transaction so both nearby_issues_cache and recent_issues_cache are
cleared consistently; locate where nearby_issues_cache.set/json_content is used
and ensure the same _invalidate_issue_caches() is invoked in those write
handlers' success branches to keep reads consistent.
In `@backend/tests/test_issues_cache_mocked.py`:
- Line 28: The one-line class definition "class MockTensor: pass" violates the
Ruff E701 rule; replace it with a multi-line definition for MockTensor (e.g.,
define class MockTensor: followed by a newline with a pass statement indented)
so the class body is on its own line and conforms to the linter.
- Around line 9-31: The test file mutates process-global sys.modules entries
(e.g., assignments to sys.modules['magic'], sys.modules['telegram'],
mock_torch/simulated 'torch' module and MockTensor) at import time and never
restores originals; change this to use a test-scoped fixture or monkeypatch to
set and restore modules (for example, use pytest's monkeypatch.setitem for each
key or a fixture that saves originals and restores them in teardown) and ensure
the temporary mock_torch and MockTensor are injected via
monkeypatch.setitem('sys.modules', 'torch', mock_torch) (or equivalent) so no
global leakage occurs after tests complete.
---
Nitpick comments:
In `@backend/tests/test_issues_cache_mocked.py`:
- Around line 84-88: The test captures the mocked cache set but doesn't assert
the cache key format or payload serialization; update the assertions around
mock_recent.set.call_args (used in tests in
backend/tests/test_issues_cache_mocked.py) to unpack both key and value (e.g.,
key, payload = mock_recent.set.call_args[0]), assert the key uses the expected
prefix (e.g., key.startswith("recent:") or the project's configured prefix), and
assert the payload is the serialized type expected by v2 (e.g.,
isinstance(payload, str) or bytes) and that it deserializes to the expected
structure (e.g., json.loads(payload) yields a dict/list). Apply the same
stronger assertions for the other occurrence around lines 128-132.
💡 What:
Optimized
get_recent_issuesandget_nearby_issuesendpoints inbackend/routers/issues.pyto cache the serialized JSON response string instead of the raw Python objects (List of Dicts/Models).🎯 Why:
Previously, cache hits still incurred the cost of FastAPI's
JSONResponseserialization (iterating over lists, serializing Pydantic models/dicts to JSON). For high-traffic endpoints like the feed and nearby issues map, this redundant serialization consumed unnecessary CPU cycles.📊 Impact:
jsonable_encoderoverhead on cache hits.v2_to prevent data type conflicts with any existing cache entries.🔬 Measurement:
Verified with a new test suite
backend/tests/test_issues_cache_mocked.pywhich mocks the database and cache to ensure:PR created automatically by Jules for task 10758030150171408291 started by @RohanExploit
Summary by cubic
Optimized the recent and nearby issue endpoints to cache pre-serialized JSON and serve it directly, cutting CPU use on cache hits and keeping date formats consistent. Cache keys are versioned with v2_ to avoid conflicts with older entries.
Written for commit fb69615. Summary will update on new commits.
Summary by CodeRabbit
Refactor
Tests