Enhance Civic Intelligence Engine with Delta Tracking and Spike Detection#482
RohanExploit wants to merge 4 commits into main from
Conversation
- Enhanced `CivicIntelligenceEngine` to calculate index delta from previous day.
- Refined `Top Emerging Concern` logic to prioritize category spikes (>50% increase) over raw volume.
- Updated `CIVIC_INTELLIGENCE.md` documentation.
- Added comprehensive unit tests in `backend/tests/test_civic_intelligence_delta.py`.
✅ Deploy Preview for fixmybharat canceled.
🙏 Thank you for your contribution, @RohanExploit!

PR Details:
Quality Checklist:
Review Process:
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.
📝 Walkthrough

Adds spike-detection to trend analysis, records the top spike category, and computes a daily civic index delta by comparing today's score to the previous snapshot; the daily snapshot schema and outputs now include score_delta and top_spike_category/top_emerging_concern.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Engine as CivicIntelligenceEngine
    participant DB as Database
    participant TrendAnalyzer as TrendAnalyzer
    participant Snapshot as SnapshotStorage
    Engine->>DB: Query issues, audits, grievances (24h)
    DB-->>Engine: Return current data
    Engine->>TrendAnalyzer: Analyze trends (distributions, spikes)
    TrendAnalyzer-->>Engine: Return category_distribution & spikes
    Engine->>Engine: Compute per-category increases<br/>(handle prev_count==0 as surge)
    Engine->>Engine: Select top_spike_category or fallback by volume
    Engine->>DB: Retrieve previous_snapshot (if exists)
    DB-->>Engine: Return previous_snapshot
    Engine->>Engine: Calculate civic index score
    Engine->>Engine: Compute score_delta = score - previous_score (or 0)
    Engine->>Snapshot: Persist snapshot with score, score_delta, top_spike_category/top_emerging_concern
    Snapshot-->>Engine: Snapshot persisted
```
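The delta step in the diagram above can be sketched as a minimal standalone snippet. All names here are illustrative, not the project's actual API; the real engine loads the previous snapshot from the database.

```python
# Minimal sketch of the score_delta computation described in the flow.
# `previous_snapshot` stands in for the database row; names are assumed.

def compute_score_delta(score, previous_snapshot):
    """Return today's score minus the previous snapshot's score (0.0 if none)."""
    if previous_snapshot is None:
        return 0.0
    return round(score - previous_snapshot["score"], 1)

print(compute_score_delta(72.5, {"score": 70.0}))  # 2.5
print(compute_score_delta(72.5, None))             # 0.0
```

The `round(..., 1)` matches the one-decimal delta the PR's diff applies.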
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
Poem
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
🧹 Nitpick comments (2)
backend/civic_intelligence.py (1)
Lines 82-88: `top_spike_category` may be set for non-spike categories.

The tracking at lines 83-85 updates `top_spike_category` for any category with a positive increase, even if it doesn't meet the spike criteria (>50% increase AND >5 volume). This means `top_spike_category` could reference a category that isn't actually in the `spikes` list. If the intent is to only select from actual spikes, the tracking should be conditional:
♻️ Proposed fix to only track actual spikes
```diff
-    # Track the highest spike for "Emerging Concern"
-    if increase > max_spike_increase:
-        max_spike_increase = increase
-        top_spike_category = category
+    # Track the highest spike for "Emerging Concern" (only if it qualifies as a spike)
+    if category in spikes or (prev_count == 0 and count > 5):
+        if increase > max_spike_increase:
+            max_spike_increase = increase
+            top_spike_category = category
```

Alternatively, if the current behavior is intentional (highest increase regardless of spike status), consider updating the documentation to clarify this.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/civic_intelligence.py` around lines 82 - 88: The code updates max_spike_increase and top_spike_category for any positive increase even when the category doesn't meet the spike criteria; change the update logic in the block that computes spikes (using symbols spikes, increase, category, max_spike_increase, top_spike_category) so you only assign max_spike_increase and top_spike_category if the current category passes the spike test (e.g., increase > 0.5 AND previous_volume > 5, or whatever exact spike condition is used to append to spikes), otherwise leave top_spike_category unchanged (or set to None if no actual spikes found).

backend/tests/test_civic_intelligence_delta.py (1)
Lines 163-164: Misleading test comment.

The comment states "no spike detection base, so fallback to max volume", but this isn't accurate. When there's no previous snapshot:

- `previous_dist` is `{}`
- `prev_count` for "Water" is 0
- Since `prev_count == 0` and `count > 5`, "Water" is added to `spikes` with `increase = float('inf')` (lines 78-80 in civic_intelligence.py)
- `top_spike_category` is set to "Water" via spike tracking, not the fallback logic

The assertion is correct, but the comment should reflect the actual code path.
📝 Suggested comment fix
```diff
- # Since no previous snapshot, no spike detection base, so fallback to max volume
+ # With no previous snapshot, "Water" is detected as a new surge (count > 5, prev_count == 0)
+ # and becomes top_spike_category with infinite increase
  assert index_data['top_emerging_concern'] == "Water"
```
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_civic_intelligence_delta.py` around lines 163 - 164, Update the misleading test comment above the assertion that "Water" is top_emerging_concern to reflect the actual spike-detection path: explain that with an empty previous_dist (prev_count == 0) the code in civic_intelligence.py adds "Water" to spikes with increase = float('inf') (the logic around prev_count handling and lines where increase is set to inf), so top_spike_category becomes "Water" via spike tracking rather than via a fallback-to-max-volume branch; keep the assertion unchanged but replace the comment to describe this spike-based determination.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- CIVIC_INTELLIGENCE.md
- backend/civic_intelligence.py
- backend/tests/test_civic_intelligence_delta.py
1 issue found across 3 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/civic_intelligence.py">
<violation number="1" location="backend/civic_intelligence.py:83">
P2: Top emerging concern is selected from any positive increase, even if it fails the spike criteria (count <= 5 or increase <= 0.5). This can incorrectly highlight low-volume/non-spike categories. Restrict the max-spike tracking to the same spike criteria.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```python
            spikes.append(category)  # New surge

        # Track the highest spike for "Emerging Concern"
        if increase > max_spike_increase:
```
P2: Top emerging concern is selected from any positive increase, even if it fails the spike criteria (count <= 5 or increase <= 0.5). This can incorrectly highlight low-volume/non-spike categories. Restrict the max-spike tracking to the same spike criteria.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/civic_intelligence.py, line 83:
<comment>Top emerging concern is selected from any positive increase, even if it fails the spike criteria (count <= 5 or increase <= 0.5). This can incorrectly highlight low-volume/non-spike categories. Restrict the max-spike tracking to the same spike criteria.</comment>
<file context>
@@ -63,17 +63,29 @@ def run_daily_cycle(self):

```diff
             spikes.append(category)  # New surge
+            # Track the highest spike for "Emerging Concern"
+            if increase > max_spike_increase:
+                max_spike_increase = increase
+                top_spike_category = category
```
</file context>
```diff
- if increase > max_spike_increase:
+ if count > 5 and increase > 0.5 and increase > max_spike_increase:
```
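The flagged behavior can be reproduced with a small standalone example. The distributions and helper below are made up for illustration; they re-implement the loop's logic, not the project's actual code.

```python
# Hypothetical distributions illustrating the flagged bug: without the
# spike criteria, a tiny category (1 -> 3 reports, +200%) outranks a
# real spike (10 -> 20 reports, +100%).
previous_dist = {"Noise": 1, "Roads": 10}
current_dist = {"Noise": 3, "Roads": 20}

def top_increase(require_spike):
    best, best_inc = None, 0.0
    for category, count in current_dist.items():
        prev = previous_dist.get(category, 0)
        increase = (count - prev) / prev if prev else float("inf")
        # The proposed fix: only track categories that pass the spike test
        if require_spike and not (count > 5 and increase > 0.5):
            continue
        if increase > best_inc:
            best, best_inc = category, increase
    return best

print(top_increase(require_spike=False))  # Noise  (only 3 reports)
print(top_increase(require_spike=True))   # Roads
```

With the criteria applied inside the tracking branch, low-volume jitter no longer surfaces as the emerging concern.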
Pull request overview
This PR enhances the Civic Intelligence Engine by adding day-over-day delta tracking and intelligent spike detection for emerging civic concerns. The system now calculates the daily change in the Civic Intelligence Index score and identifies categories with sudden volume increases rather than just selecting the highest-volume category.
Changes:
- Added delta calculation to track civic intelligence score changes between daily snapshots
- Implemented percentage-based spike detection to identify emerging concerns with >50% increases and volume >5
- Enhanced test coverage with new delta-specific test cases
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| backend/civic_intelligence.py | Added spike detection loop (lines 65-88) and delta calculation (lines 217-223); modified _calculate_index to accept previous_snapshot parameter |
| backend/tests/test_civic_intelligence_delta.py | New test file with two comprehensive tests covering delta calculation with and without previous snapshots |
| CIVIC_INTELLIGENCE.md | Updated documentation to describe spike prioritization, delta calculation, and new output fields |
```python
            increase = float('inf')  # Infinite increase
            spikes.append(category)  # New surge
```
There is inconsistent indentation. This line uses spaces that differ from the surrounding code's indentation pattern, which could cause syntax errors or maintenance issues. The line should be aligned with the code block at the same level (lines 76-77 and line 79).
```suggestion
        increase = float('inf')  # Infinite increase
        spikes.append(category)  # New surge
```
```python
        score_delta = 0.0
        if previous_score is not None:
            score_delta = round(score - previous_score, 1)
```
There is inconsistent indentation. This line uses spaces that differ from the surrounding code's indentation pattern, which could cause syntax errors or maintenance issues. The line should be aligned with line 222 at the same indentation level.
```suggestion
            score_delta = round(score - previous_score, 1)
```
```diff
         spikes = []
+        max_spike_increase = 0.0
+        top_spike_category = None

         for category, count in current_dist.items():
             prev_count = previous_dist.get(category, 0)
-            # Spike definition: > 50% increase AND significant volume (> 5)
-            if prev_count > 0 and count > 5:
+            increase = 0.0

+            if prev_count > 0:
                 increase = (count - prev_count) / prev_count
-                if increase > 0.5:
+                # Spike definition: > 50% increase AND significant volume (> 5)
+                if count > 5 and increase > 0.5:
                     spikes.append(category)
+            elif prev_count == 0 and count > 5:
+                increase = float('inf')  # Infinite increase
+                spikes.append(category)  # New surge

+            # Track the highest spike for "Emerging Concern"
+            if increase > max_spike_increase:
+                max_spike_increase = increase
+                top_spike_category = category

         trends['spikes'] = spikes
+        trends['top_spike_category'] = top_spike_category
```
When current_dist is empty (no categories), the spike detection loop never executes, leaving top_spike_category as None. This is handled correctly by the fallback logic in _calculate_index (lines 229-234), but consider adding a test case to verify this scenario works correctly when both previous and current distributions are empty.
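The suggested edge case can be checked with a standalone re-implementation of the spike loop. The function below mirrors the PR's logic but is not imported from it; all names are assumed.

```python
def detect_spikes(current_dist, previous_dist):
    """Standalone sketch of the spike loop, for the empty-distribution case."""
    spikes, top_spike_category, max_spike_increase = [], None, 0.0
    for category, count in current_dist.items():
        prev = previous_dist.get(category, 0)
        increase = (count - prev) / prev if prev else float("inf")
        if count > 5 and (increase > 0.5 or prev == 0):
            spikes.append(category)
            if increase > max_spike_increase:
                max_spike_increase = increase
                top_spike_category = category
    return spikes, top_spike_category

# With empty distributions the loop body never runs, so the engine's
# fallback-by-volume logic must handle top_spike_category is None:
spikes, top = detect_spikes({}, {})
print(spikes, top)  # [] None
```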
```python
            increase = float('inf')  # Infinite increase
            spikes.append(category)  # New surge
```
Setting increase to float('inf') for new categories (prev_count == 0 and count > 5) means they will always be selected as top_spike_category over any existing category increases, regardless of how dramatic those increases are. This may not reflect the intended prioritization. Consider using a large but finite value (e.g., count itself or a multiplier like count * 10) to allow comparison with dramatic increases in existing categories, or document this as intentional behavior if new categories should always take priority.
```diff
- increase = float('inf')  # Infinite increase
- spikes.append(category)  # New surge
+ # New category with significant volume; treat as a strong spike
+ increase = float(count)
+ spikes.append(category)  # New surge
```
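The dominance the comment describes is easy to verify: no finite increase can ever beat `float('inf')` in the max comparison, so under the current code a new category always wins. The numbers below are illustrative.

```python
# A 49x jump in an existing category still loses to float('inf'),
# so a brand-new category (prev_count == 0, count > 5) always wins.
existing_increase = (500 - 10) / 10          # 49.0
new_category_increase = float("inf")
print(new_category_increase > existing_increase)      # True
print(max(existing_increase, new_category_increase))  # inf
```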
```python
        # Check Emerging Concern
        # Fire increased from 2 to 10 (>50% and >5 items) -> Should be top spike
        assert index_data['top_emerging_concern'] == "Fire"
```
The test doesn't cover the scenario where a new category appears with significant volume (prev_count == 0 and count > 5), which triggers the infinite increase logic. Consider adding a test case where the previous snapshot has {"Fire": 2} and the current has {"Fire": 10, "NewCategory": 8} to verify that the new category with infinite increase is correctly prioritized (or not, depending on intended behavior).
```python
            # Track the highest spike for "Emerging Concern"
            if increase > max_spike_increase:
                max_spike_increase = increase
                top_spike_category = category
```
The logic for tracking the highest spike for "Emerging Concern" includes all categories, not just those that qualify as spikes. This means a category with a small count (e.g., count <= 5) could be selected as the top spike even though it doesn't meet the spike criteria. For example, if a category goes from 1 to 2 (100% increase), it would have a higher increase than a category going from 10 to 20 (100% vs a proper spike), but wouldn't qualify as a spike due to low volume. Consider only tracking max_spike_increase and top_spike_category when the category actually qualifies as a spike (either lines 77 or 80).
```python
        # Check Emerging Concern
        # Fire increased from 2 to 10 (>50% and >5 items) -> Should be top spike
        assert index_data['top_emerging_concern'] == "Fire"
```
The test doesn't cover the edge case where categories have percentage increases but don't meet the volume threshold (count <= 5). Consider adding a test case where a category has a high percentage increase but low volume (e.g., going from 1 to 3) alongside a category with a valid spike (e.g., from 5 to 10) to ensure top_emerging_concern correctly prioritizes qualified spikes over unqualified increases.
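The suggested test can be sketched against a self-contained version of the selection logic (with the spike criteria applied inside the tracking, as the reviews propose). Function and fixture names are illustrative, not the project's.

```python
def pick_emerging_concern(current, previous):
    """Select the top concern, only considering categories that qualify as spikes."""
    top, best = None, 0.0
    for category, count in current.items():
        prev = previous.get(category, 0)
        increase = (count - prev) / prev if prev else float("inf")
        if count > 5 and increase > 0.5 and increase > best:
            top, best = category, increase
    return top

# +200% but low volume vs +100% qualified spike:
previous = {"Streetlights": 1, "Water": 5}
current = {"Streetlights": 3, "Water": 10}
print(pick_emerging_concern(current, previous))  # Water
```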
```python
            increase = float('inf')  # Infinite increase
            spikes.append(category)  # New surge
```
There is inconsistent indentation. This line uses spaces that differ from the surrounding code's indentation pattern, which could cause syntax errors or maintenance issues. The line should be aligned with the code block at the same level (lines 76-77).
```suggestion
        increase = float('inf')  # Infinite increase
        spikes.append(category)  # New surge
```
- Added `scipy` and `joblib` to `backend/requirements-render.txt`.
- These are required dependencies for `scikit-learn`'s DBSCAN implementation used in `TrendAnalyzer`.
- Prevents `ImportError` or runtime failures in the production environment.
🧹 Nitpick comments (1)
backend/requirements-render.txt (1)
Lines 24-25: Consider adding version constraints for scipy and joblib.

While scipy and joblib are typically installed automatically as transitive dependencies of scikit-learn, the explicit listing indicates your deployment environment required them declared directly (likely due to dependency resolution constraints or a non-standard installation method). That's a valid pragmatic solution.
However, explicitly adding dependencies without version constraints introduces reproducibility and compatibility risks. Since scikit-learn also lacks a version pin, consider pinning compatible versions:
```text
scipy>=1.10.0,<2.0.0
joblib>=1.3.0,<2.0.0
```

This ensures consistent behavior across deployments while maintaining flexibility for scikit-learn updates.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/requirements-render.txt` around lines 24 - 25, Update the explicit package pins for scipy and joblib in requirements-render.txt to include compatible version constraints (e.g., scipy>=1.10.0,<2.0.0 and joblib>=1.3.0,<2.0.0) so installations are reproducible and avoid unexpected breaks from major releases; modify the existing scipy and joblib entries to the pinned ranges and verify they remain compatible with your scikit-learn version by running a quick install/test in the deployment environment.
- Updated `backend/requirements-render.txt`: Removed explicit `numpy`, `scipy`, `joblib` (relying on `scikit-learn` wheel resolution) to simplify dependency graph.
- Updated `render-build.sh`: Added `--no-cache-dir` to `pip install` to prevent OOM/disk errors on Render free tier.
- This addresses the deployment failure caused by heavy dependency installation.
- Updated `render.yaml` to use `./render-build.sh` as the build command.
- This ensures the optimized `pip install --no-cache-dir` command is actually executed, preventing memory issues during build.
- Made `render-build.sh` executable.
- Verified `requirements-render.txt` is optimized.
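The wiring described above looks roughly like the following config fragment. This is a hedged sketch: the service name and field values are assumed, not taken from the repository's actual `render.yaml`.

```yaml
# Hypothetical render.yaml fragment; service name and env values are assumed.
services:
  - type: web
    name: fixmybharat-backend
    env: python
    # Points the build at the script that runs
    # `pip install --no-cache-dir -r backend/requirements-render.txt`,
    # avoiding pip's wheel cache on Render's constrained free tier.
    buildCommand: ./render-build.sh
```

Without `buildCommand` pointing at the script, Render would fall back to its default build and the `--no-cache-dir` optimization would never run.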
Implemented the "Daily Civic Intelligence Refinement Engine" improvements as requested. This includes calculating the day-over-day change in the Civic Intelligence Index and smarter detection of emerging concerns based on sudden spikes in category volume rather than just total count.
Changes:
- `backend/civic_intelligence.py`: added spike detection and the `score_delta` calculation.
- Updated `CIVIC_INTELLIGENCE.md` to reflect the new algorithm and data points.
- Added `backend/tests/test_civic_intelligence_delta.py` to test the new logic with mocks.

The system now provides more actionable daily insights by highlighting sudden trends and tracking overall performance improvement/decline.
PR created automatically by Jules for task 4048902690144677027 started by @RohanExploit
Summary by cubic
Adds day-over-day Civic Intelligence score delta and spike-based emerging concern detection to surface sudden trends. Fixes Render deploys by using a no-cache build script and pointing render.yaml at it.
New Features
Dependencies
Written for commit 7f39099. Summary will update on new commits.
Summary by CodeRabbit
New Features
Tests
Documentation