4 changes: 4 additions & 0 deletions .jules/bolt.md
@@ -37,3 +37,7 @@
## 2026-02-08 - Return Type Consistency in Utilities
**Learning:** Inconsistent return types in shared utility functions (like `process_uploaded_image`) can cause runtime crashes across multiple modules, especially when some expect tuples and others expect single values. This can lead to deployment failures that are hard to debug without full integration logs.
**Action:** Always maintain strict return type consistency for core utilities. Use type hints and verify all call sites when changing a function's signature. Ensure that performance-oriented optimizations (like returning multiple processed formats) are applied uniformly.

## 2026-05-24 - Unbounded Spatial Queries

⚠️ Potential issue | 🟡 Minor

Fix the learning-entry date to keep chronology accurate.

Line 41 uses 2026-05-24, which is later than this PR date (2026-02-26). This makes the incident/learning timeline misleading.

🛠️ Suggested correction
-## 2026-05-24 - Unbounded Spatial Queries
+## 2026-02-26 - Unbounded Spatial Queries
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.jules/bolt.md at line 41, The header "## 2026-05-24 - Unbounded Spatial
Queries" in .jules/bolt.md has a future date that breaks chronology; update that
date on the "Unbounded Spatial Queries" entry to a date on or before the PR date
(for example 2026-02-26 or an earlier appropriate date) so the learning-entry
chronology is accurate.

**Learning:** Spatial bounding box queries without a `LIMIT` clause can cause severe performance degradation in dense areas or when using large search radii. The application attempts to load and process all matching records in Python, leading to O(N) memory usage and processing time.
**Action:** Always apply a safety `.limit()` to spatial candidate queries to prevent worst-case scenarios, even if the primary filter is a bounding box. A reasonable limit (e.g., 100) balances result completeness with system stability.
6 changes: 4 additions & 2 deletions backend/routers/issues.py
@@ -98,6 +98,7 @@ async def create_issue(
min_lat, max_lat, min_lon, max_lon = get_bounding_box(latitude, longitude, 50.0)

# Performance Boost: Use column projection to avoid loading full model instances
# Optimization: Limit to 100 to prevent loading too many issues in dense areas
open_issues = await run_in_threadpool(
Comment on lines +101 to 102

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

wc -l backend/routers/issues.py

Repository: RohanExploit/VishwaGuru

Length of output: 96


🏁 Script executed:

cat -n backend/routers/issues.py | sed -n '95,110p'

Repository: RohanExploit/VishwaGuru

Length of output: 935


🏁 Script executed:

cat -n backend/routers/issues.py | sed -n '110,130p'

Repository: RohanExploit/VishwaGuru

Length of output: 1093


🏁 Script executed:

cat -n backend/routers/issues.py | sed -n '305,335p'

Repository: RohanExploit/VishwaGuru

Length of output: 1410


🏁 Script executed:

python - <<'PY'
from pathlib import Path
import re

p = Path("backend/routers/issues.py")
text = p.read_text()

pattern = re.compile(r"db\.query\([\s\S]{0,1500}?\.limit\(100\)\.all\(\)")
matches = list(pattern.finditer(text))

if not matches:
    print("No .limit(100).all() blocks found.")
    raise SystemExit(0)

for i, m in enumerate(matches, 1):
    block = text[m.start():m.end()]
    print(f"\n--- block {i} ---")
    print("contains order_by:", ".order_by(" in block)
    lines = block.splitlines()
    for ln in lines[:5]:
        print(ln)
    print("...")
    for ln in lines[-5:]:
        print(ln)
PY

Repository: RohanExploit/VishwaGuru

Length of output: 836


🏁 Script executed:

# Check if there's any ORDER BY clause related to distance in the function
grep -n "order_by\|ORDER BY" backend/routers/issues.py

Repository: RohanExploit/VishwaGuru

Length of output: 398


🏁 Script executed:

# Check the find_nearby_issues function to understand what it does
sed -n '1,706p' backend/routers/issues.py | grep -n "def find_nearby_issues" -A 20

Repository: RohanExploit/VishwaGuru

Length of output: 49


🏁 Script executed:

# Find the find_nearby_issues function definition and implementation
rg "def find_nearby_issues" -A 30 backend/routers/issues.py

Repository: RohanExploit/VishwaGuru

Length of output: 49


🏁 Script executed:

# Check imports and overall structure
head -50 backend/routers/issues.py

Repository: RohanExploit/VishwaGuru

Length of output: 2133


🏁 Script executed:

# Check the find_nearby_issues function in spatial_utils
cat backend/spatial_utils.py

Repository: RohanExploit/VishwaGuru

Length of output: 7470


🏁 Script executed:

# Also check if there's any sorting/ordering happening after the query results are fetched
rg "find_nearby_issues" -B 2 -A 10 backend/routers/issues.py | head -60

Repository: RohanExploit/VishwaGuru

Length of output: 1622


Apply ORDER BY approximate distance before .limit(100) in spatial queries to avoid excluding true nearest issues.

The bounding-box candidate queries at lines 118 and 327 apply .limit(100) without deterministic distance ordering. In dense areas with >100 issues within the bounding box, this truncates arbitrary rows before find_nearby_issues() can filter and sort by true distance, potentially excluding the closest issues and breaking deduplication correctness.

The suggested fix (order candidates by squared Euclidean distance before limit) is valid and should be applied to both locations.

✅ Suggested fix (order candidates by approximate distance before LIMIT)
@@
 logger = logging.getLogger(__name__)
 
 router = APIRouter()
+SPATIAL_CANDIDATE_LIMIT = 100
@@
-            open_issues = await run_in_threadpool(
+            distance_order_expr = (
+                (Issue.latitude - latitude) * (Issue.latitude - latitude) +
+                (Issue.longitude - longitude) * (Issue.longitude - longitude)
+            )
+            open_issues = await run_in_threadpool(
                 lambda: db.query(
@@
                 ).filter(
                     Issue.status == "open",
                     Issue.latitude >= min_lat,
                     Issue.latitude <= max_lat,
                     Issue.longitude >= min_lon,
                     Issue.longitude <= max_lon
-                ).limit(100).all()
+                ).order_by(distance_order_expr).limit(SPATIAL_CANDIDATE_LIMIT).all()
             )
@@
-        open_issues = db.query(
+        distance_order_expr = (
+            (Issue.latitude - latitude) * (Issue.latitude - latitude) +
+            (Issue.longitude - longitude) * (Issue.longitude - longitude)
+        )
+        open_issues = db.query(
@@
         ).filter(
             Issue.status == "open",
             Issue.latitude >= min_lat,
             Issue.latitude <= max_lat,
             Issue.longitude >= min_lon,
             Issue.longitude <= max_lon
-        ).limit(100).all()
+        ).order_by(distance_order_expr).limit(SPATIAL_CANDIDATE_LIMIT).all()

Also applies to: 118-118, 311-312, 327-327

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routers/issues.py` around lines 101 - 102, The bounding-box candidate
queries that assign open_issues and the other candidate sets are applying
.limit(100) before ordering by distance, which can drop true nearest issues;
update the queries that build the bounding-box candidates (the ones feeding
find_nearby_issues and assigning open_issues) to compute an approximate squared
Euclidean distance (e.g., (latitude - center_lat)^2 + (longitude -
center_lon)^2) in the SELECT/WHERE expression and add an ORDER BY on that
approximate distance before calling .limit(100), so the top 100 are the closest
candidates; ensure the same change is applied to both query sites referenced
(the query that sets open_issues and the other bounding-box candidate query used
before find_nearby_issues).
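
The effect of ordering before the limit can be sketched with the standard-library `sqlite3` module. This is only an illustration under assumptions: the `issues` table, its columns, and the sample coordinates are hypothetical stand-ins, and the project itself goes through SQLAlchemy rather than raw SQL.

```python
import sqlite3

# Hypothetical in-memory table standing in for the project's Issue model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issues ("
    "id INTEGER PRIMARY KEY, latitude REAL, longitude REAL, status TEXT)"
)

center_lat, center_lon = 19.0, 72.0

# 500 open issues inside the box; the closest ones get the highest ids,
# so insertion order is the opposite of distance order.
rows = [(i, center_lat + (500 - i) * 1e-5, center_lon, "open") for i in range(1, 501)]
conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?)", rows)

BBOX_SQL = """
    SELECT id FROM issues
    WHERE status = 'open'
      AND latitude BETWEEN ? AND ?
      AND longitude BETWEEN ? AND ?
"""
bbox = (center_lat - 0.01, center_lat + 0.01, center_lon - 0.01, center_lon + 0.01)

# LIMIT without ORDER BY: the database is free to return any 100 matching rows.
unordered_ids = [row[0] for row in conn.execute(BBOX_SQL + " LIMIT 100", bbox)]

# ORDER BY the squared lat/lon delta, then LIMIT: the 100 closest candidates survive.
ordered_sql = BBOX_SQL + """
    ORDER BY (latitude - ?) * (latitude - ?) + (longitude - ?) * (longitude - ?)
    LIMIT 100
"""
order_params = (center_lat, center_lat, center_lon, center_lon)
ordered_ids = [row[0] for row in conn.execute(ordered_sql, bbox + order_params)]

print("unordered sample:", unordered_ids[:5])
print("ordered sample:", ordered_ids[:5])
```

In this sketch the ordered query always keeps ids 401 through 500, the rows nearest the center, while the unordered query gives no such guarantee.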

lambda: db.query(
Issue.id,
@@ -114,7 +115,7 @@ async def create_issue(
Issue.latitude <= max_lat,
Issue.longitude >= min_lon,
Issue.longitude <= max_lon
-).all()
+).limit(100).all()
Comment on lines 100 to +118
Copilot AI Feb 26, 2026

Adding a hard LIMIT without an ORDER BY makes the candidate set nondeterministic. In dense areas this can exclude the closest in-radius issues from find_nearby_issues, causing the deduplication flow to miss/link the wrong issue. Consider adding a deterministic order_by that approximates distance to (latitude, longitude) before applying the LIMIT (e.g., squared lat/lon delta), so the top N candidates are the most relevant.
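
One caveat on the squared lat/lon delta: a degree of longitude covers less ground than a degree of latitude by a factor of roughly cos(latitude), so the raw delta can misrank candidates away from the equator. The sketch below compares the raw expression with a cosine-scaled variant and a haversine distance. All function names and coordinates here are hypothetical, and whether the difference matters depends on the latitudes the application serves, which the PR does not state.

```python
import math


def squared_delta(lat, lon, center_lat, center_lon):
    """Raw squared lat/lon delta in degrees, as in the suggested ORDER BY."""
    return (lat - center_lat) ** 2 + (lon - center_lon) ** 2


def scaled_squared_delta(lat, lon, center_lat, center_lon):
    """Same idea, with longitude scaled by cos(center latitude)."""
    scale = math.cos(math.radians(center_lat))
    return (lat - center_lat) ** 2 + ((lon - center_lon) * scale) ** 2


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters on a sphere of radius 6371 km."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


# Hypothetical points at 60 degrees north, where cos(latitude) is 0.5.
center = (60.0, 10.0)
north = (60.0004, 10.0)  # 0.0004 degrees north of the center
east = (60.0, 10.0005)   # 0.0005 degrees east of the center

for name, point in (("north", north), ("east", east)):
    print(
        name,
        squared_delta(*point, *center),
        scaled_squared_delta(*point, *center),
        round(haversine_m(*center, *point), 1),
    )
```

Here the eastern point is physically closer, yet the raw delta ranks the northern point first; the scaled variant agrees with the haversine ordering. Near the equator the two expressions order candidates almost identically.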

@cubic-dev-ai cubic-dev-ai bot Feb 26, 2026

P2: The new limit is applied before any distance ordering, so in dense areas the closest issues may be excluded. Order by proximity before limiting to keep nearby results correct.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/routers/issues.py, line 118:

<comment>The new limit is applied before any distance ordering, so in dense areas the closest issues may be excluded. Order by proximity before limiting to keep nearby results correct.</comment>

<file context>
@@ -114,7 +115,7 @@ async def create_issue(
                     Issue.longitude >= min_lon,
                     Issue.longitude <= max_lon
-                ).all()
+                ).limit(100).all()
             )
 
</file context>

)

nearby_issues_with_distance = find_nearby_issues(
@@ -307,6 +308,7 @@ def get_nearby_issues(
min_lat, max_lat, min_lon, max_lon = get_bounding_box(latitude, longitude, radius)

# Performance Boost: Use column projection to avoid loading full model instances
# Optimization: Limit to 100 to prevent loading too many issues in dense areas
open_issues = db.query(
Issue.id,
Issue.description,
@@ -322,7 +324,7 @@ def get_nearby_issues(
Issue.latitude <= max_lat,
Issue.longitude >= min_lon,
Issue.longitude <= max_lon
-).all()
+).limit(100).all()
Comment on lines 310 to +327
Copilot AI Feb 26, 2026

limit(100) is applied before computing and sorting by actual distance, and the query has no ORDER BY. This can lead to /api/issues/nearby returning results that are not the true closest issues (or returning none) when there are >100 candidates in the bounding box. To preserve the endpoint contract (“sorted by distance”), order candidates by an approximate distance expression in SQL before limiting, or otherwise ensure the limited set still contains the nearest neighbors.
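
The "returning none" case can be reproduced in plain Python. In this sketch `nearby` is a hypothetical stand-in for `find_nearby_issues` (its real implementation is not shown in this PR), and the coordinates are chosen on the assumption that they fall inside a 50 m bounding box, with the first 100 in a corner of the box outside the 50 m circle.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters on a sphere of radius 6371 km."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def nearby(candidates, lat, lon, radius_m):
    """Hypothetical stand-in: keep candidates within radius_m, sorted by distance."""
    scored = ((haversine_m(lat, lon, c_lat, c_lon), cid) for cid, c_lat, c_lon in candidates)
    return sorted((dist, cid) for dist, cid in scored if dist <= radius_m)


center_lat, center_lon, radius_m = 19.0, 72.0, 50.0

# 100 candidates in a corner of the box (about 61 m away, outside the circle),
# followed by 50 candidates about 11 m from the center.
corner = [(i, center_lat + 0.0004, center_lon + 0.0004) for i in range(100)]
close = [(100 + i, center_lat + 0.0001, center_lon) for i in range(50)]
candidates = corner + close  # arbitrary order, as an unordered query might return

truncated = nearby(candidates[:100], center_lat, center_lon, radius_m)
complete = nearby(candidates, center_lat, center_lon, radius_m)

print(len(truncated), len(complete))
```

Truncating to the first 100 rows leaves only corner candidates, so the radius filter returns nothing even though 50 issues lie within range.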

nearby_issues_with_distance = find_nearby_issues(
open_issues, latitude, longitude, radius_meters=radius
2 changes: 1 addition & 1 deletion render.yaml
@@ -14,7 +14,7 @@ services:
name: vishwaguru-backend
property: port
- key: PYTHONPATH
-value: backend
+value: .
# Required API Keys (must be set in Render dashboard)
- key: GEMINI_API_KEY
sync: false