feat(settings): configurable anomaly detector thresholds via PATCH /settings/anomaly/detectors #206

Open
amitkojha05 wants to merge 1 commit into BetterDB-inc:master from amitkojha05:feat/configurable-anomaly-detector-thresholds

Conversation

@amitkojha05
Contributor

@amitkojha05 amitkojha05 commented May 17, 2026

Summary

Anomaly detector thresholds were hardcoded in proprietary/anomaly-detection/anomaly.service.ts and required a code change to tune — as noted in the anomaly detection tuning guide ("requires code change, or wait for configurable detectors"). This PR wires all per-metric thresholds through the settings API so operators can adjust sensitivity at runtime with no restart.

Changes

packages/shared

  • anomaly-detector-settings.types.ts — New. AnomalyDetectorConfigEntry and AnomalyDetectorConfigMap types shared across API and proprietary modules.
  • anomaly.ts — Adds reloadDetectorConfig(overrides) to IAnomalyService interface.
  • settings.types.ts — Adds anomalyDetectorConfig: AnomalyDetectorConfigMap to AppSettings.

apps/api/src/anomaly/anomaly.types.ts — New file.
MetricType enum (11 configurable metrics; replication_role and cpu_utilization excluded — state-diff and proprietary-only respectively). DetectorConfig interface, DETECTOR_DEFAULTS const mirroring previous hardcoded values, resolveDetectorConfig() merge helper, toSpikeDetectorConfig() adapter, DEFAULT_SPIKE_CONFIG for non-API metrics.
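The merge helper described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual file: only the four field names that appear in this description are modeled (the remaining two of the six validated fields are not named here), the numeric defaults are placeholders, and metrics are keyed by plain strings rather than the MetricType enum.

```typescript
// Sketch only: field names come from the PR text; numeric defaults are placeholders.
interface DetectorConfig {
  warningZScore: number;
  criticalZScore: number;
  consecutiveRequired: number;
  cooldownMs: number;
}

type DetectorOverrides = Record<string, Partial<DetectorConfig> | undefined>;

// Hypothetical default values, not the project's real ones.
const DETECTOR_DEFAULTS: Record<string, DetectorConfig> = {
  connections: { warningZScore: 2.0, criticalZScore: 3.0, consecutiveRequired: 3, cooldownMs: 60000 },
};

function resolveDetectorConfig(metric: string, overrides: DetectorOverrides): DetectorConfig {
  const defaults = DETECTOR_DEFAULTS[metric];
  if (!defaults) throw new Error(`Unknown metric: ${metric}`);
  // Later spreads win, so each override field replaces the matching default field.
  return { ...defaults, ...overrides[metric] };
}

// Only warningZScore is overridden; the other three fields fall through to defaults.
const resolved = resolveDetectorConfig("connections", { connections: { warningZScore: 2.5 } });
console.log(resolved);
```

Note that a plain spread like this performs no cross-field checks, so any invariant between fields has to be enforced by the caller.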

apps/api/src/settings/dto/update-anomaly-detectors.dto.ts — New file.
DetectorConfigDto with range validation on all 6 fields plus a custom CriticalGreaterThanWarningValidator cross-field guard. UpdateAnomalyDetectorsDto with one optional DetectorConfigDto per MetricType.
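As a rough, framework-free illustration of what this DTO checks, the sketch below validates a single metric's patch by hand. The PR's DTO uses its own validator classes, which are not reproduced here; the function name `validateDetectorPatch` and every numeric bound are assumptions made for the example.

```typescript
type DetectorPatch = {
  warningZScore?: number;
  criticalZScore?: number;
  consecutiveRequired?: number;
  cooldownMs?: number;
};

// Hypothetical bounds, chosen only for illustration.
const RANGES: Record<keyof DetectorPatch, [number, number]> = {
  warningZScore: [0.5, 10],
  criticalZScore: [0.5, 10],
  consecutiveRequired: [1, 100],
  cooldownMs: [0, 3600000],
};

function validateDetectorPatch(patch: DetectorPatch): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(RANGES) as (keyof DetectorPatch)[]) {
    const value = patch[key];
    if (value === undefined) continue; // optional field that was not sent
    const [min, max] = RANGES[key];
    if (!Number.isFinite(value) || value < min || value > max) {
      errors.push(`${key} must be between ${min} and ${max}`);
    }
  }
  // Cross-field guard: it can only fire when both fields are in the same payload.
  if (
    patch.warningZScore !== undefined &&
    patch.criticalZScore !== undefined &&
    patch.warningZScore >= patch.criticalZScore
  ) {
    errors.push("criticalZScore must be greater than warningZScore");
  }
  return errors;
}
```

A payload-level check of this kind cannot see stored or default values, which is why a second check on the resolved state is still needed at the service layer.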

apps/api/src/settings/settings.service.ts
Two new methods. getDetectorConfig() reads stored overrides from cached settings. updateDetectorConfig() performs a field-level merge (not a whole-metric replace), validates the cross-field invariant (warningZScore < criticalZScore) against the fully resolved state (defaults plus merged overrides), then persists and returns the merged map.
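A minimal sketch of that update path, assuming an in-memory override map instead of the real settings store. The default values and the helper names `mergeOverrides` and `assertResolvedInvariants` are hypothetical; persistence is left out.

```typescript
interface DetectorConfig {
  warningZScore: number;
  criticalZScore: number;
  consecutiveRequired: number;
  cooldownMs: number;
}
type OverrideMap = Record<string, Partial<DetectorConfig>>;

// Placeholder defaults for illustration only.
const DEFAULTS: Record<string, DetectorConfig> = {
  connections: { warningZScore: 2.0, criticalZScore: 3.0, consecutiveRequired: 3, cooldownMs: 60000 },
};

// Field-level merge: fields in the patch replace stored fields, everything else is kept.
function mergeOverrides(stored: OverrideMap, patch: OverrideMap): OverrideMap {
  const merged: OverrideMap = { ...stored };
  for (const [metric, fields] of Object.entries(patch)) {
    if (!Object.prototype.hasOwnProperty.call(DEFAULTS, metric)) {
      throw new Error(`Unknown metric: ${metric}`);
    }
    merged[metric] = { ...stored[metric], ...fields };
  }
  return merged;
}

// Validate what the detectors would actually see: defaults with merged overrides on top.
function assertResolvedInvariants(merged: OverrideMap): void {
  for (const [metric, defaults] of Object.entries(DEFAULTS)) {
    const resolved = { ...defaults, ...merged[metric] };
    if (resolved.warningZScore >= resolved.criticalZScore) {
      throw new Error(`${metric}: warningZScore must be less than criticalZScore`);
    }
  }
}

function updateDetectorConfig(stored: OverrideMap, patch: OverrideMap): OverrideMap {
  const merged = mergeOverrides(stored, patch);
  assertResolvedInvariants(merged); // reject before anything would be persisted
  return merged;
}
```

Validating after the merge is what catches a patch that looks fine on its own but inverts the thresholds once combined with a previously stored override.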

apps/api/src/settings/settings.controller.ts

  • GET /settings/anomaly/detectors — returns { defaults, overrides, resolved }. resolved is every metric fully merged, safe to read directly.
  • PATCH /settings/anomaly/detectors — validates, persists, hot-reloads via anomalyService.reloadDetectorConfig(). AnomalyService injected as @Optional() for testability; startup warning logged if absent.
  • POST /settings/anomaly/detectors/reset — clears all overrides and hot-reloads to defaults.
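The three endpoints can be pictured as three small handlers around one override map. The sketch below is framework-free and deliberately omits validation and persistence; the class name `DetectorSettingsHandlers`, the `DetectorReloader` interface, and the default values are assumptions for illustration, with the optional constructor argument standing in for the optionally injected anomaly service.

```typescript
type Config = { warningZScore: number; criticalZScore: number };
type Overrides = Record<string, Partial<Config>>;

// Hypothetical stand-in for the anomaly service's reload hook.
interface DetectorReloader {
  reloadDetectorConfig(overrides: Overrides): void;
}

// Placeholder defaults for illustration only.
const DEFAULTS: Record<string, Config> = {
  connections: { warningZScore: 2.0, criticalZScore: 3.0 },
};

class DetectorSettingsHandlers {
  private overrides: Overrides = {};
  private readonly reloader?: DetectorReloader;

  constructor(reloader?: DetectorReloader) {
    this.reloader = reloader;
    if (!reloader) console.warn("anomaly service absent: detector config will not be hot-reloaded");
  }

  // GET: defaults, raw overrides, and the merged view of every metric.
  get(): { defaults: Record<string, Config>; overrides: Overrides; resolved: Record<string, Config> } {
    const resolved: Record<string, Config> = {};
    for (const [metric, defaults] of Object.entries(DEFAULTS)) {
      resolved[metric] = { ...defaults, ...this.overrides[metric] };
    }
    return { defaults: DEFAULTS, overrides: this.overrides, resolved };
  }

  // PATCH: field-level merge, then notify the reloader if one is wired in.
  patch(body: Overrides) {
    for (const [metric, fields] of Object.entries(body)) {
      this.overrides[metric] = { ...this.overrides[metric], ...fields };
    }
    this.reloader?.reloadDetectorConfig(this.overrides);
    return this.get();
  }

  // POST reset: drop every override and reload so detectors return to defaults.
  reset() {
    this.overrides = {};
    this.reloader?.reloadDetectorConfig(this.overrides);
    return this.get();
  }
}
```

Keeping the reloader optional means the handlers still work in a unit test or a build without the proprietary module, at the cost of changes not reaching live detectors in that case.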

Storage adapters (postgres, sqlite, memory, base-sql)
anomaly_detector_config column added via additive migration (SQLite) and IF NOT EXISTS (Postgres). Memory adapter handles the new field. All upserts include the new column.

proprietary/anomaly-detection/anomaly.service.ts

  • onModuleInit made async; loads stored overrides before polling begins.
  • All hardcoded per-metric SpikeDetectorConfig objects replaced with resolveSpikeConfig(metric) which routes through resolveDetectorConfig() for API metrics and DEFAULT_SPIKE_CONFIG for non-API metrics (cpu_utilization).
  • reloadDetectorConfig(overrides) swaps the override map and calls applyDetectorConfigToAllConnections() — iterates all live detectors and calls detector.updateConfig(). Circular buffers are not reset.
  • Lazy SLOWLOG_LAST_ID detector creation updated to use resolveSpikeConfig.
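The reload path in the list above might look roughly like this. It is a sketch under assumptions: the nested connection-to-metric map, the `DetectorRegistry` class, and the placeholder thresholds are invented for the example, and only the names resolveSpikeConfig, reloadDetectorConfig, updateConfig, and DEFAULT_SPIKE_CONFIG come from the PR description.

```typescript
interface SpikeConfig {
  warningZScore: number;
  criticalZScore: number;
}

// Minimal view of a live detector: only the hot-reload hook matters here.
interface LiveDetector {
  updateConfig(config: SpikeConfig): void;
}

// Placeholder values, not the project's real defaults.
const API_DEFAULTS: Record<string, SpikeConfig> = {
  connections: { warningZScore: 2.0, criticalZScore: 3.0 },
};
const DEFAULT_SPIKE_CONFIG: SpikeConfig = { warningZScore: 2.0, criticalZScore: 3.0 };

class DetectorRegistry {
  // connectionId -> metric -> detector (hypothetical layout)
  private readonly detectors = new Map<string, Map<string, LiveDetector>>();
  private overrides: Record<string, Partial<SpikeConfig>> = {};

  register(connectionId: string, metric: string, detector: LiveDetector): void {
    const perConnection = this.detectors.get(connectionId) ?? new Map<string, LiveDetector>();
    perConnection.set(metric, detector);
    this.detectors.set(connectionId, perConnection);
  }

  // API-configurable metrics merge overrides onto defaults; others use the shared fallback.
  resolveSpikeConfig(metric: string): SpikeConfig {
    const defaults = API_DEFAULTS[metric];
    if (!defaults) return DEFAULT_SPIKE_CONFIG;
    return { ...defaults, ...this.overrides[metric] };
  }

  // Swap the override map, then push the resolved config into every live detector.
  reloadDetectorConfig(overrides: Record<string, Partial<SpikeConfig>>): void {
    this.overrides = overrides;
    this.detectors.forEach((perConnection) => {
      perConnection.forEach((detector, metric) => {
        detector.updateConfig(this.resolveSpikeConfig(metric));
      });
    });
  }
}
```

Because detectors are updated rather than recreated, whatever history each one holds is untouched by a reload.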

proprietary/anomaly-detection/spike-detector.ts
updateConfig(config) method added — replaces threshold fields in-place, preserves detectDrops.
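To make the "thresholds change, history stays" behavior concrete, here is a small self-contained detector sketch. It is not the PR's SpikeDetector: the buffer is a plain array with a capacity cap, the z-score uses the population standard deviation, and the order of the severity checks is an assumption.

```typescript
interface SpikeDetectorConfig {
  warningZScore: number;
  criticalZScore: number;
  detectDrops: boolean;
}
type ThresholdUpdate = Pick<SpikeDetectorConfig, "warningZScore" | "criticalZScore">;
type Severity = "none" | "warning" | "critical";

class SpikeDetector {
  private readonly samples: number[] = [];
  private readonly capacity: number;
  private readonly config: SpikeDetectorConfig;

  constructor(config: SpikeDetectorConfig, capacity: number = 60) {
    this.config = { ...config };
    this.capacity = capacity;
  }

  addSample(value: number): void {
    this.samples.push(value);
    if (this.samples.length > this.capacity) this.samples.shift(); // drop the oldest sample
  }

  sampleCount(): number {
    return this.samples.length;
  }

  // Mutates only the threshold fields; detectDrops and the sample buffer are left alone.
  updateConfig(next: ThresholdUpdate): void {
    this.config.warningZScore = next.warningZScore;
    this.config.criticalZScore = next.criticalZScore;
  }

  classify(value: number): Severity {
    const n = this.samples.length;
    if (n === 0) return "none";
    const mean = this.samples.reduce((a, b) => a + b, 0) / n;
    const variance = this.samples.reduce((a, b) => a + (b - mean) ** 2, 0) / n;
    const std = Math.sqrt(variance);
    if (std === 0) return "none"; // flat baseline: z-score is undefined
    const raw = (value - mean) / std;
    const z = this.config.detectDrops ? Math.abs(raw) : raw;
    if (z >= this.config.criticalZScore) return "critical";
    if (z >= this.config.warningZScore) return "warning";
    return "none";
  }
}
```

With samples 10, 12, 10, 12 the baseline has mean 11 and standard deviation 1, so a reading of 13.5 has a z-score of 2.5: a warning under thresholds 2 and 3, and no alert once the thresholds are raised to 3 and 4, all without losing the four stored samples.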

apps/api/src/prometheus/prometheus.service.ts
betterdb_detector_config_updates_total counter incremented on each successful PATCH.

Behavior

  • Partial updates: only the fields sent are changed; unspecified fields keep their current value (a stored override if one exists, otherwise the default).
  • Field-level merge: { connections: { warningZScore: 2.5 } } only changes that one field — consecutiveRequired, cooldownMs etc. are untouched.
  • Cross-field safety: service validates the fully resolved state (defaults + stored + new delta) before persisting, catching inversions introduced by partial PATCHes against stored config.
  • Persistence: overrides survive restarts on postgres and sqlite. Memory backend resets on restart — consistent with all other settings.
  • Hot-reload: anomaly poller picks up the new config on its next tick. Circular buffers are NOT reset — baselines are preserved.
  • Backward compatible: default values are identical to previous hardcoded values.

Testing

  • settings-anomaly-detectors.spec.ts: 8 unit tests covering default resolution, partial override, field-level merge, invalid range (400), unknown metric key (400), warningZScore ≥ criticalZScore in payload (400), partial PATCH inverting thresholds against stored config (400), and the persist + hot-reload call chain.
  • proprietary/anomaly-detection/__tests__/anomaly.service.spec.ts — tests for reloadDetectorConfig and applyDetectorConfigToAllConnections.
  • Storage adapter fixtures updated with anomalyDetectorConfig: {} where needed.

API Reference

GET   /settings/anomaly/detectors
PATCH /settings/anomaly/detectors
POST  /settings/anomaly/detectors/reset

# View defaults, stored overrides, and fully resolved config per metric
curl http://localhost:3001/settings/anomaly/detectors

# Partial update — only changes these two fields for connections
curl -X PATCH http://localhost:3001/settings/anomaly/detectors \
  -H "Content-Type: application/json" \
  -d '{"connections": {"warningZScore": 2.5, "consecutiveRequired": 5}}'

# Reset all metrics to defaults
curl -X POST http://localhost:3001/settings/anomaly/detectors/reset
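
For callers scripting these endpoints from TypeScript rather than curl, the PATCH call could be wrapped along these lines. This is a sketch: the helper names `buildPatchRequest` and `send` and the `FetchLike` type are hypothetical, and the response is left as `unknown` because its exact shape is not spelled out in this description.

```typescript
type DetectorPatchBody = Record<string, Record<string, number>>;

interface HttpRequest {
  url: string;
  method: "GET" | "PATCH" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Minimal fetch-like signature so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Pure function: builds the same request as the curl PATCH example.
function buildPatchRequest(baseUrl: string, patch: DetectorPatchBody): HttpRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/settings/anomaly/detectors`,
    method: "PATCH",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(patch),
  };
}

// Sends a prepared request through whichever fetch implementation the caller supplies.
async function send(request: HttpRequest, fetchImpl: FetchLike): Promise<unknown> {
  const { url, ...init } = request;
  const response = await fetchImpl(url, init);
  if (!response.ok) {
    throw new Error(`${request.method} ${url} failed with status ${response.status}`);
  }
  return response.json();
}

const request = buildPatchRequest("http://localhost:3001", {
  connections: { warningZScore: 2.5, consecutiveRequired: 5 },
});
console.log(request.method, request.url);
```

Separating request construction from sending keeps the first half testable without a running server.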

Note

Medium Risk
Introduces new persisted settings and hot-reload paths that directly affect anomaly detection behavior across running connections, plus DB schema migrations to add anomaly_detector_config.

Overview
Adds runtime-configurable per-metric anomaly detector thresholds via GET/PATCH /settings/anomaly/detectors and POST /settings/anomaly/detectors/reset, including DTO validation, default/override resolution, field-level merge semantics, and cross-field safety checks before persisting.

Persists overrides in AppSettings (anomaly_detector_config) across Postgres/SQLite/Memory adapters, updates the proprietary anomaly service to load overrides on startup and hot-reload detector configs in-place (new SpikeDetector.updateConfig), and adds a betterdb_detector_config_updates_total Prometheus counter. Also adds a SQLite guard to enforce uniqueness of pending cache_proposals where partial indexes may be unreliable.

Reviewed by Cursor Bugbot for commit 7e27cb0. Bugbot is set up for automated code reviews on this repo.

Comment thread apps/api/src/settings/dto/update-anomaly-detectors.dto.ts
Comment thread apps/api/src/anomaly/anomaly.types.ts
Comment thread apps/api/src/settings/dto/update-anomaly-detectors.dto.ts
@amitkojha05 amitkojha05 force-pushed the feat/configurable-anomaly-detector-thresholds branch from fe3a92b to 93684a1 on May 17, 2026 21:20
...DETECTOR_DEFAULTS[metric],
...overrides[metric],
};
}

Partial overrides can invert warning/critical Z-score ordering

Medium Severity

resolveDetectorConfig naively merges overrides onto defaults without validating cross-field invariants. Because the DTO validator (CriticalGreaterThanWarningValidator) only checks warningZScore < criticalZScore when both fields are in the same request payload, a partial PATCH like { connections: { warningZScore: 9.5 } } passes validation but produces a resolved config where warningZScore (9.5) exceeds the default criticalZScore (3.0). The spike detector then operates with an inverted threshold, likely suppressing warnings entirely or producing nonsensical severity classifications.
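
The gap described in this comment can be reproduced in a few lines. The sketch assumes a default criticalZScore of 3.0 (as stated in the comment), a placeholder default warningZScore of 2.0, and a severity function that checks the critical threshold first; the real detector's ordering may differ.

```typescript
type Thresholds = { warningZScore: number; criticalZScore: number };

// Payload-only check: passes whenever the two fields are not both present.
function payloadOnlyCheck(p: Partial<Thresholds>): boolean {
  if (p.warningZScore === undefined || p.criticalZScore === undefined) return true;
  return p.warningZScore < p.criticalZScore;
}

// Assumed ordering: critical is tested before warning.
function severity(z: number, cfg: Thresholds): "none" | "warning" | "critical" {
  if (z >= cfg.criticalZScore) return "critical";
  if (z >= cfg.warningZScore) return "warning";
  return "none";
}

const defaults: Thresholds = { warningZScore: 2.0, criticalZScore: 3.0 }; // 2.0 is a placeholder
const patch: Partial<Thresholds> = { warningZScore: 9.5 };

const passesPayloadCheck = payloadOnlyCheck(patch); // true: criticalZScore is absent
const resolved: Thresholds = { ...defaults, ...patch }; // warning 9.5, critical 3.0
const inverted = resolved.warningZScore >= resolved.criticalZScore; // true

console.log({ passesPayloadCheck, inverted });
console.log(severity(2.5, resolved)); // below both thresholds
console.log(severity(5, resolved)); // already critical, so "warning" is unreachable
```

Under these assumptions every z-score of 3.0 or more is classified as critical before the 9.5 warning threshold is ever consulted, so the warning level can never fire.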

Reviewed by Cursor Bugbot for commit 93684a1.

Contributor Author

Fixed in settings.service.ts: updateDetectorConfig now validates the fully resolved state (defaults + stored + new delta) before persisting.

@amitkojha05 amitkojha05 force-pushed the feat/configurable-anomaly-detector-thresholds branch from 93684a1 to 0f4986e on May 17, 2026 21:56
@amitkojha05 amitkojha05 changed the title from "Anomaly Detector Thresholds Configurable at Runtime via the /settings API" to "feat(settings): configurable anomaly detector thresholds via PATCH /settings/anomaly/detectors" on May 17, 2026
@amitkojha05 amitkojha05 force-pushed the feat/configurable-anomaly-detector-thresholds branch from 0f4986e to 0fa7891 on May 19, 2026 13:16
@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 0fa7891.

Comment thread apps/api/src/settings/settings.controller.ts Outdated
@amitkojha05 amitkojha05 force-pushed the feat/configurable-anomaly-detector-thresholds branch from 0fa7891 to 5a13892 on May 19, 2026 13:43
@amitkojha05
Contributor Author

@KIvanow @jamby77 Please review this PR

Co-authored-by: Cursor <cursoragent@cursor.com>
@amitkojha05 amitkojha05 force-pushed the feat/configurable-anomaly-detector-thresholds branch from 5a13892 to 7e27cb0 on May 19, 2026 21:25