Architecture Decision Records

1. SSE (Server-Sent Events) for Triage Streaming

Decision: Use SSE instead of WebSockets for real-time triage updates.

Rationale:

Unidirectional communication (server → client) is sufficient
Built-in auto-reconnection with EventSource
Works seamlessly with HTTP/2 multiplexing
Simpler than WebSocket (plain HTTP response, no protocol upgrade handshake)
Better for read-heavy streaming scenarios

Trade-offs: No bidirectional communication, but not needed for our use case.
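
As an illustration of the wire format, the sketch below serializes a single SSE frame. The helper name `formatSseFrame`, the `triage` event name, and the payload shape are assumptions made for the example, not the project's actual code.

```typescript
// Hypothetical helper: serialize one Server-Sent Events frame.
interface SseMessage {
  id?: string;    // sent back by the browser as Last-Event-ID on reconnect
  event?: string; // event name the client listens for
  data: string;   // payload, usually JSON text
}

function formatSseFrame(msg: SseMessage): string {
  const lines: string[] = [];
  if (msg.id !== undefined) lines.push(`id: ${msg.id}`);
  if (msg.event !== undefined) lines.push(`event: ${msg.event}`);
  // Every line of the payload needs its own "data:" field.
  for (const line of msg.data.split("\n")) {
    lines.push(`data: ${line}`);
  }
  // A blank line terminates the frame.
  return lines.join("\n") + "\n\n";
}

// Example frame for a (hypothetical) triage progress update.
const exampleFrame = formatSseFrame({
  id: "7",
  event: "triage",
  data: JSON.stringify({ step: "risk", status: "done" }),
});
```

On the client, an `EventSource` would receive this via a listener registered for the `triage` event, and it reconnects automatically if the connection drops.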

2. Keyset Pagination over Offset-based

Decision: Implement cursor-based (keyset) pagination using a (customerId, timestamp) composite key.

Rationale:

Stable results even when data changes (no page drift)
Query cost does not grow with page depth (an index seek to the cursor instead of scanning and discarding skipped rows)
Better for large datasets (1M+ rows)
Prevents duplicate/missing items on concurrent updates

Trade-offs: Cannot jump to arbitrary page numbers, but forward/backward navigation works well.
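
The idea can be sketched in memory as follows. The `Row` shape and `fetchPage` helper are hypothetical; a real implementation would push the comparison into the database, for example with a row-value predicate such as `WHERE (customer_id, ts) > ($1, $2) ORDER BY customer_id, ts LIMIT $3` (column names assumed).

```typescript
interface Row {
  customerId: string;
  timestamp: number;
}
type Cursor = Pick<Row, "customerId" | "timestamp">;

// Tuple comparison on (customerId, timestamp).
function compareKey(a: Cursor, b: Cursor): number {
  if (a.customerId !== b.customerId) return a.customerId < b.customerId ? -1 : 1;
  return a.timestamp - b.timestamp;
}

// Return up to `limit` rows strictly after the cursor, plus the next cursor.
function fetchPage(
  rows: Row[],
  cursor: Cursor | null,
  limit: number,
): { items: Row[]; nextCursor: Cursor | null } {
  const sorted = [...rows].sort(compareKey);
  const start = cursor === null ? 0 : sorted.findIndex((r) => compareKey(r, cursor) > 0);
  const items = start === -1 ? [] : sorted.slice(start, start + limit);
  const last = items[items.length - 1];
  const nextCursor =
    items.length === limit && last
      ? { customerId: last.customerId, timestamp: last.timestamp }
      : null;
  return { items, nextCursor };
}
```

Because the next page is defined by the last key seen rather than by a row count, rows inserted before the cursor do not shift later pages. If (customerId, timestamp) is not guaranteed unique, a tiebreaker column (such as the primary key) would need to be appended to the key.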

3. Circuit Breaker Pattern for External Tools

Decision: Implement a per-agent circuit breaker that opens after 3 consecutive failures and stays open for 30s before allowing a retry.

Rationale:

Prevents cascading failures when risk/fraud APIs are down
Allows system to degrade gracefully with fallbacks
Auto-recovery after timeout period
Protects downstream services from overload

Trade-offs: Brief service degradation during recovery window.
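
A minimal sketch of the state machine, using the 3-failure threshold and 30s window from the decision above. The class and method names are hypothetical, and the injectable clock exists only to make the behavior testable.

```typescript
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly openMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 3, openMs = 30_000, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.openMs = openMs;
    this.now = now;
  }

  // Closed: always allow. Open: allow a trial call only after the window elapses.
  canRequest(): boolean {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.openMs;
  }

  // A successful call closes the circuit and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Reaching the threshold opens the circuit; a failed trial call re-opens it.
  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }
}
```

A caller would check `canRequest()` before invoking the external tool, use the fallback when it returns false, and report the outcome with `recordSuccess()` or `recordFailure()`.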

4. Deterministic Fallbacks for All Agents

Decision: Every agent has rule-based fallback logic (no LLM dependency required).

Rationale:

System works offline without external API calls
Predictable behavior for testing and evaluation
Faster response times (no network latency)
Compliance-friendly (no data leaves infrastructure)

Trade-offs: Less sophisticated insights compared to LLM-powered analysis.
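
The pattern might look like the sketch below. The transaction shape, the 5000 amount threshold, and the rules themselves are invented for illustration and are not the project's actual fallback logic; a real agent call would also be asynchronous.

```typescript
interface Txn {
  amount: number;
  country: string;
  homeCountry: string;
}
type Risk = "low" | "medium" | "high";

// Hypothetical deterministic rules: same input always yields the same output.
function ruleBasedRisk(txn: Txn): Risk {
  const large = txn.amount >= 5000; // illustrative threshold
  const foreign = txn.country !== txn.homeCountry;
  if (large && foreign) return "high";
  if (large || foreign) return "medium";
  return "low";
}

// Try the optional LLM-backed analyzer first; fall back to rules if it is
// missing or throws.
function assessRisk(
  txn: Txn,
  llmAnalyzer?: (t: Txn) => Risk,
): { risk: Risk; source: "llm" | "rules" } {
  if (llmAnalyzer) {
    try {
      return { risk: llmAnalyzer(txn), source: "llm" };
    } catch {
      // Fall through to the deterministic path.
    }
  }
  return { risk: ruleBasedRisk(txn), source: "rules" };
}
```

Recording the `source` alongside the result makes it visible in evaluation which path produced a given answer.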

5. Virtual Scrolling for Large Tables

Decision: Use TanStack Virtual for alert/transaction tables (2k+ rows).

Rationale:

Renders only visible rows (~20-30 DOM nodes vs 2000+)
Eliminates scroll jank and memory bloat
Maintains 60fps scrolling performance
Works with dynamic row heights

Trade-offs: Slight complexity in implementation, but huge performance gain.
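
The core of the technique is computing which row indices fall inside the viewport. The sketch below shows that calculation under the simplifying assumption of a fixed row height; it is not the TanStack Virtual API, which additionally handles measured, dynamic heights. The function name and the `overscan` default are assumptions.

```typescript
// Compute the inclusive range of row indices to render for a scroll position.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan = 5, // extra rows above and below to avoid flicker while scrolling
): { start: number; end: number } {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(rowCount - 1, last + overscan),
  };
}

// 2000 rows of 30px in a 600px viewport, scrolled 3000px down.
const range = visibleRange(3000, 600, 30, 2000);
```

Only the rows in `range` are mounted; the rest of the table is represented by a spacer element whose height is `rowCount * rowHeight`.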

6. Prisma ORM over Raw SQL

Decision: Use Prisma for type-safe database access with TypeScript.

Rationale:

Compile-time type safety (malformed queries and wrong field types are caught before runtime)
Auto-generated types from schema
Migration management built-in
Developer productivity (autocomplete, refactoring)

Trade-offs: Slight performance overhead vs raw SQL, but negligible for our scale.

7. Redis for Rate Limiting and Caching

Decision: Implement token bucket rate limiter in Redis (5 req/sec per client).

Rationale:

Distributed state across API instances
Atomic single commands (INCR, EXPIRE), with Lua scripts or MULTI/EXEC for multi-step updates, prevent race conditions
Sub-millisecond latency for checks
TTL-based cleanup (no manual garbage collection)

Trade-offs: Additional infrastructure dependency, but essential for multi-instance deployments.
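
The token bucket logic itself can be sketched in memory as below, using the 5 req/sec figure from the decision. This is not the Redis implementation: in Redis the refill-check-decrement sequence would have to run as one atomic unit (for instance in a Lua script), with one bucket stored per client key. The class name and injectable clock are assumptions.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSec: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = now();
  }

  // Refill based on elapsed time, then spend one token if available.
  tryConsume(): boolean {
    const t = this.now();
    const elapsedSec = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = t;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

With a capacity of 5 and a refill rate of 5 tokens per second, a client can burst up to 5 requests and then sustain 5 requests per second.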

8. Idempotency Keys for Mutations

Decision: Require Idempotency-Key header for all state-changing operations.

Rationale:

Prevents duplicate actions on network retries
Safe to retry failed requests
Audit trail links multiple attempts to same logical action
Industry best practice (Stripe, Twilio, etc.)

Trade-offs: Clients must generate unique keys, but prevents costly mistakes.
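
A minimal in-memory sketch of the replay behavior follows. The `IdempotencyStore` class is hypothetical; a production version would persist results with an expiry and guard against two concurrent requests carrying the same key.

```typescript
class IdempotencyStore<T> {
  private readonly results = new Map<string, T>();

  // Run `action` once per key; later calls with the same key replay the
  // stored result instead of repeating the side effect.
  execute(key: string, action: () => T): { result: T; replayed: boolean } {
    if (this.results.has(key)) {
      return { result: this.results.get(key) as T, replayed: true };
    }
    const result = action();
    this.results.set(key, result);
    return { result, replayed: false };
  }
}
```

The server would look up the `Idempotency-Key` header value in such a store before performing the mutation, so a retried request returns the original response.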

9. PII Redaction Pipeline

Decision: Redact PAN (13-19 digit sequences) and mask emails in all logs/traces/UI.

Rationale:

PCI-DSS compliance requirement
Defense-in-depth (multiple layers of redaction)
Prevents accidental exposure in logs/monitoring
Required for audit trail security

Trade-offs: Cannot reconstruct original data from logs (intentional).
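
One possible shape for the redaction step is sketched below. The regular expressions, the `[REDACTED_PAN]` placeholder, and the email mask format are assumptions for illustration; this version only catches contiguous digit runs, so PANs written with spaces or dashes would need an extra normalization pass.

```typescript
// 13 to 19 consecutive digits bounded by non-word characters.
const PAN_PATTERN = /\b\d{13,19}\b/g;
// Capture the first character of the local part and the full domain.
const EMAIL_PATTERN = /\b([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+\.[A-Za-z]{2,})\b/g;

function redact(text: string): string {
  return text
    .replace(PAN_PATTERN, "[REDACTED_PAN]")
    .replace(EMAIL_PATTERN, "$1***@$2");
}

const redactedLine = redact("card 4111111111111111 from jane.doe@example.com");
```

Applying the same function at each boundary (logger, tracer, UI serializer) gives the defense-in-depth layering described above.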

10. Prometheus Metrics over Custom Solution

Decision: Export metrics in Prometheus format via /metrics endpoint.

Rationale:

Industry standard for observability
Rich ecosystem (Grafana, AlertManager, etc.)
Pull-based model (services only expose an endpoint; scrape targets are configured on the Prometheus server)
Built-in aggregation and alerting

Trade-offs: Requires a Prometheus server to scrape and store metrics (plus a tool such as Grafana for dashboards), but the stack is widely adopted.
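
For reference, the text exposition format served by a /metrics endpoint can be produced as in the sketch below. The `renderCounter` helper and the `triage_requests_total` metric name are hypothetical; in practice a client library such as prom-client would likely generate this output.

```typescript
interface Sample {
  labels: Record<string, string>;
  value: number;
}

// Label values must escape backslashes, double quotes, and newlines.
function escapeLabelValue(v: string): string {
  return v.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Render one counter family in the Prometheus text exposition format.
function renderCounter(name: string, help: string, samples: Sample[]): string {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} counter`];
  for (const s of samples) {
    const labelText = Object.entries(s.labels)
      .map(([k, v]) => `${k}="${escapeLabelValue(v)}"`)
      .join(",");
    lines.push(labelText ? `${name}{${labelText}} ${s.value}` : `${name} ${s.value}`);
  }
  return lines.join("\n") + "\n";
}

const metricsBody = renderCounter("triage_requests_total", "Total triage requests.", [
  { labels: { status: "ok" }, value: 3 },
]);
```

The Prometheus server scrapes this plain-text body on its own schedule, which is why the application needs no push configuration.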

11. Docker Compose for Local Development

Decision: Single docker-compose.yml brings up all services (Postgres, Redis, API, Web).

Rationale:

One command to start entire stack
Consistent environment across developers
Easy cleanup and reset
Production-like local setup

Trade-offs: Higher resource usage than native processes, but worth it for the consistency.

12. Monorepo Structure

Decision: Keep client and server in same repository with shared types.

Rationale:

Atomic commits across frontend/backend
Shared TypeScript types (API contracts)
Simplified CI/CD (single build pipeline)
Easier code reviews (see both sides of changes)

Trade-offs: Larger repository, but better developer experience.
