Enhance Civic Intelligence Engine with Delta Tracking and Spike Detection #482
Changes from all commits: 78d1d80, 7cb7377, 69f82db, 7f39099
**backend/civic_intelligence.py**

```diff
@@ -63,17 +63,29 @@ def run_daily_cycle(self):
     current_dist = trends.get('category_distribution', {})

     spikes = []
+    max_spike_increase = 0.0
+    top_spike_category = None

     for category, count in current_dist.items():
         prev_count = previous_dist.get(category, 0)
-        # Spike definition: > 50% increase AND significant volume (> 5)
-        if prev_count > 0 and count > 5:
+        increase = 0.0
+
+        if prev_count > 0:
             increase = (count - prev_count) / prev_count
-            if increase > 0.5:
+            # Spike definition: > 50% increase AND significant volume (> 5)
+            if count > 5 and increase > 0.5:
                 spikes.append(category)
+        elif prev_count == 0 and count > 5:
+            increase = float('inf')  # Infinite increase
+            spikes.append(category)  # New surge
+
+        # Track the highest spike for "Emerging Concern"
+        if increase > max_spike_increase:
+            max_spike_increase = increase
+            top_spike_category = category
```
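Pulled out of the engine, the new detection logic can be sketched as a standalone function (the name `detect_spikes` is hypothetical; the thresholds are taken from the diff above):

```python
def detect_spikes(previous_dist, current_dist):
    """Spike = > 50% increase over the previous count AND current volume > 5.
    A category absent from the previous window but now above volume counts as a surge."""
    spikes = []
    max_spike_increase = 0.0
    top_spike_category = None

    for category, count in current_dist.items():
        prev_count = previous_dist.get(category, 0)
        increase = 0.0

        if prev_count > 0:
            increase = (count - prev_count) / prev_count
            # Spike definition: > 50% increase AND significant volume (> 5)
            if count > 5 and increase > 0.5:
                spikes.append(category)
        elif count > 5:
            increase = float('inf')  # new category surge
            spikes.append(category)

        # Track the highest spike for the "Emerging Concern" field
        if increase > max_spike_increase:
            max_spike_increase = increase
            top_spike_category = category

    return spikes, top_spike_category


# Fire jumps 2 -> 10 (+400%); Water moves 5 -> 6 (+20%, below the spike threshold)
spikes, top = detect_spikes({"Fire": 2, "Water": 5}, {"Fire": 10, "Water": 6})
print(spikes, top)  # ['Fire'] Fire
```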
**Comment on lines +79 to 80**

Suggested change (treat a brand-new category's volume as a finite spike magnitude instead of infinity):

```diff
-            increase = float('inf')  # Infinite increase
-            spikes.append(category)  # New surge
+            # New category with significant volume; treat as a strong spike
+            increase = float(count)
+            spikes.append(category)  # New surge
```

**Copilot AI (Feb 26, 2026)**

There is inconsistent indentation: these lines use spaces that differ from the surrounding code's indentation pattern, which could cause syntax errors or maintenance issues. They should be aligned with the code block at the same level (lines 76-77):

```suggestion
increase = float('inf')  # Infinite increase
spikes.append(category)  # New surge
```

(The suggestion keeps the same statements, re-indented to match the enclosing block.)
**P2:** Top emerging concern is selected from any positive increase, even if it fails the spike criteria (`count <= 5` or `increase <= 0.5`). This can incorrectly highlight low-volume, non-spike categories. Restrict the max-spike tracking to the same spike criteria.

File context (backend/civic_intelligence.py, line 83):

```python
            spikes.append(category)  # New surge
        # Track the highest spike for "Emerging Concern"
        if increase > max_spike_increase:
            max_spike_increase = increase
            top_spike_category = category
```

Suggested change:

```diff
-        if increase > max_spike_increase:
+        if count > 5 and increase > 0.5 and increase > max_spike_increase:
```
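To illustrate what the suggested guard changes, here is a small side-by-side sketch (the helper name `top_spike` and the numbers are illustrative, not from the PR):

```python
def top_spike(previous_dist, current_dist, restrict=True):
    """Pick the 'Emerging Concern' category. With restrict=True, the suggested
    guard (count > 5 and increase > 0.5) is applied before tracking the max."""
    max_increase, top = 0.0, None
    for category, count in current_dist.items():
        prev = previous_dist.get(category, 0)
        if prev > 0:
            increase = (count - prev) / prev
        else:
            increase = float('inf') if count > 5 else 0.0
        qualifies = count > 5 and increase > 0.5
        if (qualifies or not restrict) and increase > max_increase:
            max_increase, top = increase, category
    return top


prev = {"Noise": 1, "Roads": 10}
curr = {"Noise": 2, "Roads": 20}  # both +100%, but Noise has trivial volume
print(top_spike(prev, curr, restrict=False))  # Noise wins despite count <= 5
print(top_spike(prev, curr, restrict=True))   # Roads, the only real spike
```

With the guard in place, a 1-to-2 jump can no longer outrank a genuine 10-to-20 spike.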
**Copilot AI (Feb 26, 2026)**

The logic for tracking the highest spike for "Emerging Concern" includes all categories, not just those that qualify as spikes. A category with a small count could therefore be selected as the top spike even though it fails the volume threshold: a category going from 1 to 2 has the same 100% relative increase as one going from 10 to 20, yet does not qualify as a spike because count <= 5. Consider only updating max_spike_increase and top_spike_category when the category actually qualifies as a spike (either line 77 or line 80).
**Copilot AI (Feb 26, 2026)**

When current_dist is empty (no categories), the spike detection loop never executes, leaving top_spike_category as None. This is handled correctly by the fallback logic in _calculate_index (lines 229-234), but consider adding a test case to verify this scenario works when both the previous and current distributions are empty.
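The suggested test could be sketched against a minimal stand-in for the detection loop (a hypothetical helper; the real test would drive `run_daily_cycle` with DB mocks like the tests in this PR):

```python
def detect_spikes(previous_dist, current_dist):
    # Stand-in for the loop in run_daily_cycle, with the same thresholds.
    spikes, max_inc, top = [], 0.0, None
    for category, count in current_dist.items():
        prev = previous_dist.get(category, 0)
        if prev > 0:
            inc = (count - prev) / prev
        else:
            inc = float('inf') if count > 5 else 0.0
        if count > 5 and inc > 0.5:
            spikes.append(category)
        if inc > max_inc:
            max_inc, top = inc, category
    return spikes, top


def test_empty_distributions_leave_no_top_spike():
    spikes, top = detect_spikes({}, {})
    assert spikes == []
    assert top is None  # the fallback in _calculate_index must then pick the concern
```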
**Copilot AI (Feb 26, 2026)**

There is inconsistent indentation: this line uses spaces that differ from the surrounding code's indentation pattern, which could cause syntax errors or maintenance issues. It should be aligned with line 222 at the same indentation level:

```suggestion
score_delta = round(score - previous_score, 1)
```

(Same statement, re-indented to match the enclosing block.)
**New test file** (`@@ -0,0 +1,164 @@`)

```python
import pytest
import json
import os
from unittest.mock import MagicMock, patch, mock_open
from datetime import datetime, timedelta, timezone

from backend.models import Issue, EscalationAudit, Grievance
from backend.civic_intelligence import CivicIntelligenceEngine


@patch('backend.civic_intelligence.SessionLocal')
@patch('backend.civic_intelligence.trend_analyzer')
@patch('backend.civic_intelligence.adaptive_weights')
@patch('builtins.open', new_callable=mock_open)
@patch('json.dump')
@patch('os.listdir')
def test_civic_intelligence_index_delta(mock_listdir, mock_json_dump, mock_file_open,
                                        mock_weights, mock_trend_analyzer, mock_db_session):
    engine = CivicIntelligenceEngine()

    # Mock DB
    mock_session = MagicMock()
    mock_db_session.return_value = mock_session

    # 1. Simulate a previous snapshot with score 70.0
    previous_snapshot_content = json.dumps({
        "civic_index": {"score": 70.0},
        "trends": {"category_distribution": {"Fire": 2, "Water": 5}}
    })

    mock_listdir.return_value = ['2023-01-01.json']

    # Mock open to return the previous snapshot content when reading
    read_mock = mock_open(read_data=previous_snapshot_content)
    write_mock = mock_open()

    def open_side_effect(file, mode='r', *args, **kwargs):
        if 'r' in mode:
            return read_mock(file, mode, *args, **kwargs)
        return write_mock(file, mode, *args, **kwargs)

    mock_file_open.side_effect = open_side_effect

    # 2. Simulate current data that produces a higher score
    mock_query_issues = MagicMock()      # For Issue
    mock_query_audits = MagicMock()      # For EscalationAudit
    mock_query_grievances = MagicMock()  # For Grievance

    def query_side_effect(model):
        if model == Issue:
            return mock_query_issues
        elif model == EscalationAudit:
            return mock_query_audits
        elif model == Grievance:
            return mock_query_grievances
        return MagicMock()

    mock_session.query.side_effect = query_side_effect

    # The engine issues two chained queries against Issue:
    #   db.query(Issue).filter(Issue.created_at >= last_24h).all()    -> new issues
    #   db.query(Issue).filter(Issue.resolved_at >= last_24h).count() -> resolved count
    # Configure the chained calls so .all() returns a list and .count() a number.
    mock_query_issues.filter.return_value.all.return_value = [Issue(id=1), Issue(id=2)]
    mock_query_issues.filter.return_value.count.return_value = 5

    # Escalation audits: empty list to avoid iteration errors
    mock_query_audits.filter.return_value.all.return_value = []

    # Trend analyzer returns a spike: "Fire" jumped from 2 to 10
    mock_trend_analyzer.analyze.return_value = {
        "top_keywords": [],
        "category_distribution": {"Fire": 10},
        "clusters": []
    }

    # Mock the adaptive weights radius
    mock_weights.get_duplicate_search_radius.return_value = 50.0

    # Run
    engine.run_daily_cycle()

    # Verify the snapshot content passed to json.dump
    assert mock_json_dump.called
    args, _ = mock_json_dump.call_args
    snapshot = args[0]

    index_data = snapshot['civic_index']

    # Check score: base 70 + 10 - 1 = 79.0
    assert index_data['score'] == 79.0

    # Check delta: 79.0 - 70.0 = 9.0
    assert index_data['score_delta'] == 9.0

    # Check emerging concern: Fire increased from 2 to 10 (> 50% and > 5 items)
    assert index_data['top_emerging_concern'] == "Fire"


@patch('backend.civic_intelligence.SessionLocal')
@patch('backend.civic_intelligence.trend_analyzer')
@patch('backend.civic_intelligence.adaptive_weights')
@patch('builtins.open', new_callable=mock_open)
@patch('json.dump')
@patch('os.listdir')
def test_civic_intelligence_no_previous_snapshot(mock_listdir, mock_json_dump, mock_file_open,
                                                 mock_weights, mock_trend_analyzer, mock_db_session):
    engine = CivicIntelligenceEngine()
    mock_session = MagicMock()
    mock_db_session.return_value = mock_session

    # Simulate NO previous snapshot
    mock_listdir.return_value = []

    # Write mock only
    write_mock = mock_open()
    mock_file_open.side_effect = lambda f, m='r', *a, **k: write_mock(f, m, *a, **k)

    # Mock query results
    mock_query_issues = MagicMock()  # For Issue
    mock_query_audits = MagicMock()  # For EscalationAudit

    def query_side_effect(model):
        if model == Issue:
            return mock_query_issues
        elif model == EscalationAudit:
            return mock_query_audits
        return MagicMock()

    mock_session.query.side_effect = query_side_effect

    # Data: 10 resolved (+20), 0 new (0) => 90.0
    mock_query_issues.filter.return_value.all.return_value = []    # 0 new issues
    mock_query_issues.filter.return_value.count.return_value = 10  # 10 resolved

    mock_query_audits.filter.return_value.all.return_value = []

    mock_trend_analyzer.analyze.return_value = {
        "category_distribution": {"Water": 10}
    }

    # Mock the adaptive weights radius (int/float)
    mock_weights.get_duplicate_search_radius.return_value = 50.0

    engine.run_daily_cycle()

    assert mock_json_dump.called
    args, _ = mock_json_dump.call_args
    snapshot = args[0]
    index_data = snapshot['civic_index']

    assert index_data['score'] == 90.0
    assert index_data['score_delta'] == 0.0  # No previous snapshot, so delta is 0

    # With no previous snapshot there is no spike-detection base; fall back to max volume
    assert index_data['top_emerging_concern'] == "Water"
```

**Comment on lines +105 to +107**
**Copilot AI**

There is inconsistent indentation. This line uses spaces that differ from the surrounding code's indentation pattern, which could cause syntax errors or maintenance issues. The line should be aligned with the code block at the same level (lines 76-77 and line 79).
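As an aside, the two tests pin down the scoring arithmetic. The actual formula lives in `_calculate_index` and is not shown in this diff; the constants below are inferred from the test assertions (base 70, +2 per resolved issue, -0.5 per new issue) and are a reconstruction, not the implementation:

```python
def civic_index_score(resolved_24h, new_24h, base=70.0):
    # Inferred from the tests: +2.0 per resolved issue, -0.5 per new issue.
    # NOT the actual implementation in _calculate_index.
    return base + 2.0 * resolved_24h - 0.5 * new_24h


print(civic_index_score(5, 2))   # 79.0, matches test_civic_intelligence_index_delta
print(civic_index_score(10, 0))  # 90.0, matches test_civic_intelligence_no_previous_snapshot
```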