[BACKEND] Add asynchronous scan jobs and per-scan status endpoint #112

@m-khan-97

Objective

Move scan execution out of the request/response path and add a per-scan status endpoint so large Azure subscriptions do not time out the API or frontend.

Why this matters

POST /api/scans/trigger currently runs ScanEngine(...).run_scan() synchronously inside the Flask request. That works for small demos, but it is fragile for real subscriptions because Azure SDK calls, Microsoft Graph calls, and CVE enrichment can take longer than the web request timeout.

The frontend dashboard also expects scan triggering to return quickly and then poll for completion.

Current behaviour

  • POST /api/scans/trigger blocks until the scan finishes.
  • GET /api/scans returns historical scans only.
  • There is no GET /api/scans/<scan_id> endpoint for checking the status of a single scan.
  • The scans table has no explicit status, error_message, or progress fields.

What to build

  1. Add scan status tracking to the database.
  2. Add GET /api/scans/<scan_id>.
  3. Change POST /api/scans/trigger to create a scan record and return quickly with HTTP 202.
  4. Run the scan in a background worker (for example a thread, a separate process, or a job queue) appropriate for the deployment model.
  5. Persist failures without losing the scan record.

Suggested schema additions

  • scans.status: pending | running | completed | failed
  • scans.error_message: nullable text
  • Optional: scans.rules_completed, scans.total_rules
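
As a rough illustration of these fields, the sketch below creates a scans table in an in-memory SQLite database. The issue does not state which database or ORM the project uses, so the sqlite3 usage, the id, started_at, and finished_at columns, and the helper names create_scans_table and set_status are all assumptions made for the example.

```python
import sqlite3

VALID_STATUSES = ("pending", "running", "completed", "failed")


def create_scans_table(conn):
    """Create a hypothetical scans table with the suggested fields."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS scans (
            id              INTEGER PRIMARY KEY AUTOINCREMENT,
            started_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            finished_at     TEXT,
            status          TEXT NOT NULL DEFAULT 'pending'
                CHECK (status IN ('pending', 'running', 'completed', 'failed')),
            error_message   TEXT,              -- nullable, set on failure
            rules_completed INTEGER NOT NULL DEFAULT 0,
            total_rules     INTEGER            -- optional progress total
        )
        """
    )
    conn.commit()


def set_status(conn, scan_id, status, error_message=None):
    """Update one scan's status, rejecting values outside the allowed set."""
    if status not in VALID_STATUSES:
        raise ValueError(f"invalid scan status: {status!r}")
    conn.execute(
        "UPDATE scans SET status = ?, error_message = ? WHERE id = ?",
        (status, error_message, scan_id),
    )
    conn.commit()
```

Defaulting status to pending means a newly inserted row is valid before any worker picks it up, and the CHECK constraint keeps unexpected values out even if application code skips validation.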

Files likely involved

  • api/routes/scans.py
  • api/models/finding.py
  • scanner/engine.py
  • startup.sh or deployment docs if a worker process is introduced
  • docs/api-reference.md
  • tests/smoke_test.py

Acceptance criteria

  • POST /api/scans/trigger returns 202 with scan_id and status without waiting for full scan completion
  • GET /api/scans/<scan_id> returns one scan with current status and timestamps
  • Failed scans persist status=failed and a safe error message
  • Existing GET /api/scans still works
  • Smoke tests cover trigger + status polling without requiring real Azure credentials
  • Documentation explains the async scan lifecycle
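
For the smoke-test criterion, one possible shape is a small polling helper that takes the status lookup as a callable, so a test can pass in a fake instead of calling Azure. This is a sketch under that assumption: poll_until_done and fetch_status are hypothetical names, and in a real test fetch_status would wrap a GET /api/scans/<scan_id> request made through the test client.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(fetch_status, scan_id, timeout=5.0, interval=0.05):
    """Call fetch_status(scan_id) until the scan reaches a terminal status.

    fetch_status must return a dict containing a "status" key.
    Raises TimeoutError if no terminal status is seen before the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        scan = fetch_status(scan_id)
        if scan["status"] in TERMINAL_STATUSES:
            return scan
        if time.monotonic() >= deadline:
            raise TimeoutError(f"scan {scan_id} still {scan['status']}")
        time.sleep(interval)


def make_fake_fetch(statuses):
    """Build a fake lookup that replays a fixed sequence of statuses."""
    remaining = iter(statuses)

    def fetch(scan_id):
        return {"scan_id": scan_id, "status": next(remaining)}

    return fetch
```

A test could then assert that poll_until_done(make_fake_fetch(["pending", "running", "completed"]), "abc") ends with status completed, with no Azure credentials involved.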

Metadata

Labels

  • core: Core team ownership not for students
  • enhancement: New feature or request

Projects

Status
📋 Backlog
