From b213ffb0615fbef4fb9b32176c32a14f946ea155 Mon Sep 17 00:00:00 2001
From: Amber Agent <amber@ambient-code.ai>
Date: Mon, 8 Dec 2025 19:22:34 +0000
Subject: [PATCH] docs: Add comprehensive architecture diagrams
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add visual architecture documentation with Mermaid diagrams covering:
- Core 4-component system architecture with data flows
- Agentic session lifecycle state machine and reconciliation
- Multi-tenancy architecture with namespace isolation and RBAC
- Kubernetes Custom Resource structures and relationships

Created docs/architecture/ directory with 5 files (3,155 lines):
- index.md: Architecture navigation and quick start guide
- core-system-architecture.md: System overview and component interactions
- agentic-session-lifecycle.md: Session states and operator patterns
- multi-tenancy-architecture.md: Project isolation and security
- kubernetes-resources.md: CRD schemas and resource lifecycle

Updated existing documentation:
- docs/index.md: Added architecture section to main navigation
- CLAUDE.md: Added references to architecture diagrams
- mkdocs.yml: Integrated architecture pages into site navigation

All diagrams use Mermaid format for GitHub/MkDocs compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---
 CLAUDE.md                                     |    6 +
 .../architecture/agentic-session-lifecycle.md |  611 ++++++++++
 docs/architecture/core-system-architecture.md |  402 +++++++
 docs/architecture/index.md                    |  344 ++++++
 docs/architecture/kubernetes-resources.md     | 1042 +++++++++++++++++
 .../multi-tenancy-architecture.md             |  756 ++++++++++++
 docs/index.md                                 |    9 +
 mkdocs.yml                                    |    6 +
 8 files changed, 3176 insertions(+)
 create mode 100644 docs/architecture/agentic-session-lifecycle.md
 create mode 100644 docs/architecture/core-system-architecture.md
 create mode 100644 docs/architecture/index.md
 create mode 100644 docs/architecture/kubernetes-resources.md
 create mode 100644 docs/architecture/multi-tenancy-architecture.md
diff --git a/CLAUDE.md b/CLAUDE.md
index 5f8d5fb47..0adeea4d0 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -40,6 +40,12 @@ User Creates Session → Backend Creates CR → Operator Spawns Job →
 Pod Runs Claude CLI → Results Stored in CR → UI Displays Progress
 ```
 
+📐 **Architecture Diagrams:** See [docs/architecture/](docs/architecture/) for comprehensive visual guides including:
+- [Core System Architecture](docs/architecture/core-system-architecture.md) - 4-component system with data flows
+- [Agentic Session Lifecycle](docs/architecture/agentic-session-lifecycle.md) - State machine and reconciliation
+- [Multi-Tenancy Architecture](docs/architecture/multi-tenancy-architecture.md) - Project isolation and RBAC
+- [Kubernetes Resources](docs/architecture/kubernetes-resources.md) - CRD structures and relationships
+
 ## Memory System - Loadable Context
 
 This repository uses a structured **memory system** to provide targeted, loadable context instead of relying solely on this comprehensive CLAUDE.md file.
diff --git a/docs/architecture/agentic-session-lifecycle.md b/docs/architecture/agentic-session-lifecycle.md
new file mode 100644
index 000000000..974982175
--- /dev/null
+++ b/docs/architecture/agentic-session-lifecycle.md
@@ -0,0 +1,611 @@
+# Agentic Session Lifecycle
+
+## Overview
+
+An **AgenticSession** represents a single AI-powered automation task. This document describes the complete lifecycle from creation to completion, including state transitions, operator reconciliation, and error handling.
+
+## State Machine
+
+```mermaid
+stateDiagram-v2
+    [*] --> Pending: User creates session<br/>(Backend creates CR)
+
+    Pending --> Running: Operator creates Job<br/>Pod starts execution
+
+    Running --> Completed: Job succeeds<br/>Results captured
+    Running --> Failed: Job fails<br/>Error captured
+    Running --> Timeout: Timeout exceeded<br/>Job terminated
+
+    Completed --> [*]
+    Failed --> [*]
+    Timeout --> [*]
+
+    note right of Pending
+        Initial state
+        - CR exists
+        - No Job created yet
+        - Operator will reconcile
+    end note
+
+    note right of Running
+        Active execution
+        - Job created
+        - Pod running
+        - Results streaming
+        - Status updates frequent
+    end note
+
+    note right of Completed
+        Success terminal state
+        - Results in CR status
+        - Job succeeded
+        - Resources cleaned up
+    end note
+
+    note right of Failed
+        Error terminal state
+        - Error message in CR
+        - Job failed
+        - Resources cleaned up
+    end note
+
+    note right of Timeout
+        Timeout terminal state
+        - Job terminated
+        - Partial results captured
+        - Resources cleaned up
+    end note
+```
+
+## Phase Descriptions
+
+### Pending
+
+**Entry Condition:** Backend API creates AgenticSession CR
+
+**State Characteristics:**
+- CR exists with `spec` populated
+- `status` is either unset or has `status.phase = "Pending"`
+- No Job created yet
+- No Pod running
+
+**Next Transition:** Operator detects CR and creates Job → `Running`
+
+**Typical Duration:** 1-5 seconds
+
+---
+
+### Running
+
+**Entry Condition:** Operator creates Job successfully
+
+**State Characteristics:**
+- Job exists with OwnerReference to AgenticSession
+- Pod scheduled and executing
+- `status.phase = "Running"`
+- `status.startTime` set
+- `status.results` may contain partial results
+
+**Status Updates:**
+- Operator monitors Job status every 5 seconds
+- Runner updates CR with progress logs
+- WebSocket broadcasts updates to frontend
+
+**Next Transitions:**
+- Job succeeds → `Completed`
+- Job fails → `Failed`
+- Timeout exceeded → `Timeout`
+
+**Typical Duration:** 30 seconds to 2 hours (configurable)
+
+---
+
+### Completed
+
+**Entry Condition:** Job completes successfully (exit code 0)
+
+**State Characteristics:**
+- `status.phase = "Completed"`
+- `status.completionTime` set
+- `status.results` contains final output
+- Per-repo `pushed` or `abandoned` status
+- Job deleted by the operator; Pod removed via OwnerReference cascade
+
+**Terminal State:** No further transitions
+
+**Typical Retention:** CR persists for audit/history (manual deletion or TTL)
+
+---
+
+### Failed
+
+**Entry Condition:** Job fails (non-zero exit code)
+
+**State Characteristics:**
+- `status.phase = "Failed"`
+- `status.completionTime` set
+- `status.message` contains error details
+- `status.results` may contain partial output
+- Job and Pod cleaned up
+
+**Common Failure Reasons:**
+- Invalid Anthropic API key
+- Git authentication failure
+- Runner execution error
+- Resource limits exceeded
+
+**Terminal State:** No further transitions
+
+**Typical Retention:** CR persists for debugging (manual deletion)
+
+---
+
+### Timeout
+
+**Entry Condition:** Execution exceeds configured timeout
+
+**State Characteristics:**
+- `status.phase = "Timeout"`
+- `status.completionTime` set
+- `status.message` indicates timeout
+- `status.results` contains partial output
+- Job terminated by operator
+- Pod cleaned up
+
+**Timeout Configuration:**
+- Default: 1 hour
+- Configurable via `spec.timeout` (seconds)
+- ProjectSettings can set default per project
+
+**Terminal State:** No further transitions
+
+---
+
+## Operator Reconciliation Flow
+
+```mermaid
+flowchart TD
+    Start([Watch Event:<br/>AgenticSession Added/Modified])
+
+    Start --> GetCR[Get current CR from API]
+    GetCR --> Exists{CR exists?}
+
+    Exists -->|No - IsNotFound| LogDelete[Log: Resource deleted]
+    LogDelete --> End([Return - Not an error])
+
+    Exists -->|Yes| GetPhase[Extract status.phase]
+    GetPhase --> CheckPhase{phase?}
+
+    CheckPhase -->|Pending| CheckJob{Job exists?}
+    CheckPhase -->|Running| MonitorJob[Continue monitoring<br/>goroutine exists]
+    CheckPhase -->|Completed/Failed/Timeout| End
+
+    CheckJob -->|Yes| LogExists[Log: Job already exists]
+    LogExists --> End
+
+    CheckJob -->|No| CreateJob[Create Job with:<br/>- OwnerReference<br/>- Runner image<br/>- Env vars from ProjectSettings<br/>- PVC mount]
+
+    CreateJob --> JobCreated{Job created?}
+
+    JobCreated -->|No| UpdateError[Update CR status:<br/>phase=Failed<br/>message=error]
+    UpdateError --> End
+
+    JobCreated -->|Yes| UpdateRunning[Update CR status:<br/>phase=Running<br/>startTime=now]
+    UpdateRunning --> StartMonitor[Start goroutine:<br/>monitorJob]
+    StartMonitor --> End
+
+    MonitorJob --> End
+
+    style Start fill:#e1f5ff
+    style End fill:#e1ffe1
+    style CheckPhase fill:#fff4e1
+    style CreateJob fill:#f0e1ff
+    style UpdateError fill:#ffe1e1
+    style UpdateRunning fill:#e1ffe1
+```
+
+## Job Monitoring Loop
+
+```mermaid
+sequenceDiagram
+    participant Op as Operator<br/>(goroutine)
+    participant K8s as Kubernetes API
+    participant CR as AgenticSession CR
+    participant Job as Job
+    participant Pod as Pod
+
+    Note over Op: Started by<br/>reconciliation loop
+
+    loop Every 5 seconds
+        Op->>CR: Check if CR still exists
+
+        alt CR deleted
+            CR-->>Op: IsNotFound error
+            Note over Op: Exit goroutine<br/>(session deleted by user)
+        end
+
+        Op->>Job: Get Job status
+
+        alt Job deleted
+            Job-->>Op: IsNotFound error
+            Op->>CR: Update status:<br/>phase=Failed<br/>message="Job was deleted"
+            Note over Op: Exit goroutine
+        end
+
+        Job-->>Op: Job status
+
+        alt Job succeeded
+            Op->>CR: Update status:<br/>phase=Completed<br/>completionTime=now
+            Op->>Job: Delete Job<br/>(cleanup)
+            Note over Op: Exit goroutine<br/>(success)
+
+        else Job failed
+            Op->>Pod: Get Pod logs<br/>(last 100 lines)
+            Pod-->>Op: Error logs
+            Op->>CR: Update status:<br/>phase=Failed<br/>message=error<br/>results=logs
+            Op->>Job: Delete Job<br/>(cleanup)
+            Note over Op: Exit goroutine<br/>(failure)
+
+        else Job still running
+            Op->>CR: Update status:<br/>progress info
+            Note over Op: Continue monitoring
+        end
+
+        Note over Op: Check timeout
+        alt Timeout exceeded
+            Op->>Job: Delete Job<br/>(terminate)
+            Op->>CR: Update status:<br/>phase=Timeout<br/>message="Exceeded timeout"
+            Note over Op: Exit goroutine<br/>(timeout)
+        end
+    end
+```
+
+## Status Update Patterns
+
+### Operator Status Updates
+
+**Use Case:** Operator updates phase transitions
+
+**Pattern:** Update via `/status` subresource
+
+```go
+// components/operator/internal/handlers/sessions.go
+func updateAgenticSessionStatus(namespace, name string, updates map[string]interface{}) error {
+    gvr := types.GetAgenticSessionResource()
+
+    // Get current CR
+    obj, err := config.DynamicClient.Resource(gvr).
+        Namespace(namespace).
+        Get(ctx, name, v1.GetOptions{})
+    if errors.IsNotFound(err) {
+        log.Printf("CR deleted, skipping status update")
+        return nil  // Not an error
+    } else if err != nil {
+        return fmt.Errorf("failed to get AgenticSession %s: %w", name, err)
+    }
+    // Initialize status if needed
+    if obj.Object["status"] == nil {
+        obj.Object["status"] = make(map[string]interface{})
+    }
+
+    status := obj.Object["status"].(map[string]interface{})
+    for k, v := range updates {
+        status[k] = v
+    }
+
+    // Update via /status subresource
+    _, err = config.DynamicClient.Resource(gvr).
+        Namespace(namespace).
+        UpdateStatus(ctx, obj, v1.UpdateOptions{})
+
+    if errors.IsNotFound(err) {
+        return nil  // CR deleted during update
+    }
+
+    return err
+}
+```
+
+### Runner Status Updates
+
+**Use Case:** Runner pod updates results incrementally
+
+**Pattern:** Runner has minted token with limited permissions
+
+```python
+# components/runners/claude-code-runner/runner.py
+def update_session_status(results: Dict[str, Any]):
+    """Update CR status from runner pod."""
+    try:
+        # Use minted token from Secret
+        token = os.environ.get("RUNNER_TOKEN")
+
+        # Update via Kubernetes API
+        response = requests.patch(
+            f"{k8s_api}/apis/vteam.ambient-code/v1alpha1/namespaces/{namespace}/agenticsessions/{name}/status",
+            headers={"Authorization": f"Bearer {token}", "Content-Type": "application/merge-patch+json"},
+            json={"status": {"results": results}}
+        )
+
+        response.raise_for_status()
+    except Exception as e:
+        log.error(f"Failed to update status: {e}")
+        # Non-fatal: operator will update eventually
+```
+
+## Resource Lifecycle and Cleanup
+
+```mermaid
+graph TD
+    subgraph "Resource Creation"
+        CR[AgenticSession CR<br/>Created by Backend]
+        Job[Job<br/>Created by Operator]
+        Pod[Pod<br/>Created by Job Controller]
+        Secret[Secret<br/>Minted token]
+        PVC[PVC<br/>Workspace storage]
+    end
+
+    subgraph "OwnerReferences"
+        CR -->|controller=true| Job
+        Job -->|controller=true| Pod
+        CR -->|controller=true| Secret
+    end
+
+    subgraph "Cleanup Scenarios"
+        Delete1[User deletes CR]
+        Delete2[Job completes<br/>Operator deletes Job]
+        TTL[TTL expired<br/>K8s deletes CR]
+    end
+
+    Delete1 --> CascadeDelete1[Kubernetes cascades:<br/>Job → Pod, plus Secret]
+    Delete2 --> NormalCleanup[Operator deletes Job<br/>Pod cleaned by Job controller]
+    TTL --> CascadeDelete2[Same as user delete]
+
+    style CR fill:#ffe1e1
+    style Job fill:#fff4e1
+    style Pod fill:#e1ffe1
+    style Secret fill:#f0e1ff
+    style Delete1 fill:#ffe1e1
+    style Delete2 fill:#e1ffe1
+```
+
+**Key Cleanup Principles:**
+
+1. **OwnerReferences** ensure automatic cleanup when parent is deleted
+2. **Controller=true** on primary owner (only one per resource)
+3. **No BlockOwnerDeletion** (causes permission issues in multi-tenant)
+4. Operator explicitly deletes Jobs on completion (don't wait for cascade)
+5. PVCs persist for debugging (manual cleanup or TTL)
+
+**Reference:** [Backend/Operator Development Standards](../../CLAUDE.md#resource-management)
+
+---
+
+## Error Handling Patterns
+
+### Non-Fatal Errors (Operator)
+
+**Scenario:** Resource deleted during processing
+
+```go
+if errors.IsNotFound(err) {
+    log.Printf("AgenticSession %s no longer exists, skipping", name)
+    return nil  // Not treated as error - user deleted it
+}
+```
+
+### Retriable Errors (Operator)
+
+**Scenario:** Transient K8s API failure
+
+```go
+if err != nil {
+    log.Printf("Failed to create Job: %v", err)
+    updateAgenticSessionStatus(ns, name, map[string]interface{}{
+        "phase":   "Failed",
+        "message": fmt.Sprintf("Failed to create Job: %v", err),
+    })
+    // Operator watch loop will retry on next event
+    return fmt.Errorf("failed to create Job: %w", err)
+}
+```
+
+### Terminal Errors (Runner)
+
+**Scenario:** Invalid API key
+
+```python
+try:
+    client = anthropic.Anthropic(api_key=api_key)
+    response = client.messages.create(...)
+except anthropic.AuthenticationError as e:
+    # Update CR with terminal error
+    update_session_status({
+        "phase": "Failed",
+        "message": f"Invalid Anthropic API key: {e}",
+        "completionTime": datetime.now(timezone.utc).isoformat()
+    })
+    sys.exit(1)  # Exit pod with failure
+```
+
+---
+
+## Interactive vs Batch Execution
+
+### Batch Mode (Default)
+
+**Characteristics:**
+- Single prompt execution
+- Timeout enforced (default 1 hour)
+- Results written to CR on completion
+- Pod exits after execution
+
+**Use Cases:**
+- One-off automation tasks
+- Scripted workflows
+- RFE generation
+
+**Flow:**
+```
+User → Prompt → Runner executes → Results → Pod exits
+```
+
+---
+
+### Interactive Mode
+
+**Characteristics:**
+- Long-running session (no timeout)
+- User sends messages via inbox file
+- Runner responds via outbox file
+- Pod continues running until explicitly stopped
+
+**Use Cases:**
+- Iterative development
+- Multi-turn conversations
+- Complex debugging sessions
+
+**Flow:**
+```
+User → Initial prompt → Runner starts
+  ↓
+User writes to inbox → Runner reads → Executes → Writes to outbox
+  ↓
+User reads outbox → Continues conversation...
+  ↓
+User signals completion → Pod exits
+```
+
+**Configuration:**
+```yaml
+apiVersion: vteam.ambient-code/v1alpha1
+kind: AgenticSession
+metadata:
+  name: interactive-session
+spec:
+  interactive: true  # Enable interactive mode
+  prompt: "Initial prompt"
+  repos:
+    - input:
+        url: https://github.com/org/repo
+        branch: main
+```
+
+**File Locations:**
+- Inbox: `/workspace/inbox.txt` (user writes)
+- Outbox: `/workspace/outbox.txt` (runner writes)
+- Workspace: `/workspace/repos/` (cloned repositories)
+
+---
+
+## Multi-Repo Execution
+
+```mermaid
+flowchart LR
+    subgraph "AgenticSession Spec"
+        MainIdx[mainRepoIndex: 1]
+        Repos[repos array:<br/>0: repo-A<br/>1: repo-B<br/>2: repo-C]
+    end
+
+    subgraph "Runner Workspace"
+        WS["/workspace/repos/"]
+        RepoA["repo-A/<br/>cloned from repos[0]"]
+        RepoB["repo-B/<br/>cloned from repos[1]<br/>WORKING DIRECTORY"]
+        RepoC["repo-C/<br/>cloned from repos[2]"]
+    end
+
+    subgraph "Status Tracking"
+        StatusA["repos[0].status:<br/>pushed=true"]
+        StatusB["repos[1].status:<br/>pushed=true"]
+        StatusC["repos[2].status:<br/>abandoned=true"]
+    end
+
+    MainIdx -->|Specifies| RepoB
+    Repos --> WS
+    WS --> RepoA
+    WS --> RepoB
+    WS --> RepoC
+
+    RepoA -.-> StatusA
+    RepoB -.-> StatusB
+    RepoC -.-> StatusC
+
+    style RepoB fill:#e1ffe1
+    style MainIdx fill:#fff4e1
+```
+
+**Key Concepts:**
+
+1. **mainRepoIndex** (default: 0): Sets Claude Code working directory
+2. **Cloning Order**: Repos cloned in array order
+3. **Per-Repo Status**: Each repo tracked individually (pushed/abandoned)
+4. **Cross-Repo References**: Claude can access all repos in workspace
+
+**Reference:** [ADR-0003: Multi-Repository Support](../adr/0003-multi-repo-support.md)
+
+---
+
+## Timeout Handling
+
+### Timeout Configuration
+
+```yaml
+apiVersion: vteam.ambient-code/v1alpha1
+kind: AgenticSession
+spec:
+  timeout: 3600  # seconds (1 hour)
+```
+
+**Timeout Sources (priority order):**
+1. `spec.timeout` on AgenticSession CR
+2. `defaultTimeout` in ProjectSettings CR
+3. Global default (1 hour)
+
+### Timeout Enforcement
+
+**Operator monitors elapsed time:**
+
+```go
+func monitorJob(jobName, sessionName, namespace string) {
+    startTime := time.Now()
+    timeout := getTimeoutForSession(namespace, sessionName)
+
+    for {
+        time.Sleep(5 * time.Second)
+
+        elapsed := time.Since(startTime)
+        if elapsed > timeout {
+            log.Printf("Session %s exceeded timeout (%v)", sessionName, timeout)
+
+            // Terminate Job
+            deleteJob(namespace, jobName)
+
+            // Update CR status
+            updateAgenticSessionStatus(namespace, sessionName, map[string]interface{}{
+                "phase": "Timeout",
+                "message": fmt.Sprintf("Exceeded timeout of %v", timeout),
+                "completionTime": time.Now().Format(time.RFC3339),
+            })
+
+            return  // Exit monitoring
+        }
+
+        // ... check Job status ...
+    }
+}
+```
+
+**Graceful Shutdown:**
+- Runner receives SIGTERM from Kubernetes
+- Runner captures partial results
+- Runner updates CR status before exit
+
+---
+
+## Related Documentation
+
+- [Core System Architecture](./core-system-architecture.md) - Component overview
+- [Kubernetes Resources](./kubernetes-resources.md) - CR schemas
+- [Multi-Tenancy Architecture](./multi-tenancy-architecture.md) - Project isolation
+- [Operator Development Standards](../../CLAUDE.md#operator-patterns)
+- [ADR-0001: Kubernetes-Native Architecture](../adr/0001-kubernetes-native-architecture.md)
diff --git a/docs/architecture/core-system-architecture.md b/docs/architecture/core-system-architecture.md
new file mode 100644
index 000000000..5ae70e79f
--- /dev/null
+++ b/docs/architecture/core-system-architecture.md
@@ -0,0 +1,402 @@
+# Core System Architecture
+
+## Overview
+
+The Ambient Code Platform follows a Kubernetes-native architecture with four primary components that work together to orchestrate AI-powered automation tasks.
+
+## High-Level Architecture
+
+```mermaid
+graph TB
+    subgraph "User Interface"
+        UI[Frontend<br/>NextJS + Shadcn UI<br/>React Query]
+    end
+
+    subgraph "API Layer"
+        API[Backend API<br/>Go + Gin<br/>REST + WebSocket]
+    end
+
+    subgraph "Kubernetes Cluster"
+        subgraph "Control Plane"
+            OP[Agentic Operator<br/>Go Controller<br/>Watches CRs]
+        end
+
+        subgraph "Custom Resources"
+            AS[AgenticSession<br/>CR]
+            PS[ProjectSettings<br/>CR]
+            RFE[RFEWorkflow<br/>CR]
+        end
+
+        subgraph "Execution"
+            JOB[Kubernetes Job]
+            POD[Runner Pod<br/>Python + Claude SDK]
+            PVC[Persistent Volume<br/>Workspace Storage]
+        end
+    end
+
+    UI -->|HTTP/HTTPS<br/>REST API + WS| API
+    API -->|K8s Dynamic Client<br/>User Token| AS
+    API -->|K8s Dynamic Client<br/>User Token| PS
+    API -->|K8s Dynamic Client<br/>User Token| RFE
+
+    OP -->|Watches| AS
+    OP -->|Watches| PS
+    OP -->|Watches| RFE
+
+    OP -->|Creates & Monitors| JOB
+    JOB -->|Spawns| POD
+    POD -->|Mounts| PVC
+
+    POD -->|Updates Status| AS
+    OP -->|Updates Status| AS
+
+    AS -.->|OwnerReference| JOB
+    JOB -.->|OwnerReference| POD
+
+    style UI fill:#e1f5ff
+    style API fill:#fff4e1
+    style OP fill:#f0e1ff
+    style POD fill:#e1ffe1
+    style AS fill:#ffe1e1
+    style PS fill:#ffe1e1
+    style RFE fill:#ffe1e1
+```
+
+## Component Breakdown
+
+### 1. Frontend (NextJS + Shadcn UI)
+
+**Technology Stack:**
+- NextJS 14+ with App Router
+- Shadcn UI component library
+- React Query for data fetching
+- TypeScript for type safety
+
+**Responsibilities:**
+- User interface for session management
+- Real-time status updates via WebSocket
+- Project and settings management
+- RFE workflow visualization
+
+**Key Patterns:**
+- Server-side rendering for performance
+- Optimistic updates with React Query
+- Type-safe API client integration
+
+**Reference:** [Frontend Development Standards](../../CLAUDE.md#frontend-development-standards)
+
+---
+
+### 2. Backend API (Go + Gin)
+
+**Technology Stack:**
+- Go 1.21+
+- Gin web framework
+- Kubernetes Dynamic Client
+- OpenShift OAuth integration
+
+**Responsibilities:**
+- REST API for CRUD operations on Custom Resources
+- WebSocket server for real-time updates
+- Multi-tenant project isolation (namespace mapping)
+- User authentication and authorization (RBAC)
+- Git operations (clone, fork, PR creation)
+
+**Key Endpoints:**
+- `/api/projects/:project/agentic-sessions` - Session management
+- `/api/projects/:project/project-settings` - Configuration
+- `/api/projects/:project/rfe-workflows` - RFE orchestration
+- `/ws` - WebSocket for real-time updates
+
+**Key Patterns:**
+- User token authentication for all operations
+- Project-scoped endpoints with RBAC validation
+- Middleware chain: Recovery → Logging → CORS → Auth → Validation
+- Error handling with structured responses
+
+**Reference:** [Backend Development Standards](../../CLAUDE.md#backend-and-operator-development-standards)
+
+---
+
+### 3. Agentic Operator (Go Controller)
+
+**Technology Stack:**
+- Go 1.21+
+- Kubernetes controller-runtime patterns
+- Watch/reconciliation loop
+- Custom Resource Definitions (CRDs)
+
+**Responsibilities:**
+- Watch AgenticSession, ProjectSettings, RFEWorkflow CRs
+- Reconcile desired state with actual state
+- Create and manage Kubernetes Jobs for session execution
+- Monitor Job completion and update CR status
+- Handle timeouts and cleanup
+
+**Reconciliation Flow:**
+1. Watch for CR events (Added, Modified, Deleted)
+2. Check resource phase (Pending, Running, Completed, Failed, Timeout)
+3. Create Job if phase is Pending
+4. Monitor Job status and update CR
+5. Handle errors and retries with exponential backoff
+
+**Key Patterns:**
+- Reconnection logic for watch failures
+- Idempotent resource creation
+- OwnerReferences for automatic cleanup
+- Status updates via `/status` subresource
+- Goroutine monitoring for long-running jobs
+
+**Reference:** [Operator Development Standards](../../CLAUDE.md#operator-patterns)
+
+---
+
+### 4. Claude Code Runner (Python)
+
+**Technology Stack:**
+- Python 3.11+
+- Claude Code SDK (≥0.0.23)
+- Anthropic API (≥0.68.0)
+- Git integration
+
+**Responsibilities:**
+- Execute Claude Code CLI in containerized environment
+- Manage workspace synchronization via PVC
+- Handle interactive vs. batch execution modes
+- Capture results and update CR status
+- Multi-agent collaboration coordination
+
+**Execution Modes:**
+- **Batch Mode:** Single prompt execution with timeout
+- **Interactive Mode:** Long-running chat using inbox/outbox files
+
+**Key Patterns:**
+- Workspace isolation per session
+- Multi-repo support with mainRepoIndex
+- Result capture and structured output
+- Error propagation to operator
+
+**Reference:** [Runner Documentation](../../components/runners/claude-code-runner/README.md)
+
+---
+
+## Data Flow: Agentic Session Execution
+
+```mermaid
+sequenceDiagram
+    actor User
+    participant UI as Frontend
+    participant API as Backend API
+    participant K8s as Kubernetes API
+    participant Op as Operator
+    participant Job as Job/Pod
+    participant CR as AgenticSession CR
+
+    User->>UI: Create Session
+    UI->>API: POST /api/projects/{project}/agentic-sessions
+
+    Note over API: Extract user token<br/>Validate RBAC permissions
+
+    API->>K8s: Create AgenticSession CR<br/>(using user token)
+    K8s-->>API: CR Created (UID)
+    API-->>UI: 201 Created {name, uid}
+
+    Note over Op: Watch loop detects<br/>new CR event
+
+    Op->>K8s: Get AgenticSession CR
+    K8s-->>Op: CR with phase=Pending
+
+    Op->>K8s: Create Job with OwnerReference
+    Note over Op: Set controller=true<br/>for automatic cleanup
+
+    K8s-->>Op: Job Created
+    Op->>K8s: Update CR status<br/>phase=Running
+
+    K8s->>Job: Schedule Pod
+
+    Note over Job: Runner executes<br/>Claude Code CLI
+
+    loop Monitoring
+        Op->>K8s: Check Job status
+        K8s-->>Op: Job status (running/succeeded/failed)
+
+        Op->>K8s: Update CR status<br/>(progress, logs, errors)
+    end
+
+    Job->>K8s: Update CR status<br/>(results, completionTime)
+
+    Op->>K8s: Update CR status<br/>phase=Completed
+
+    K8s-->>API: Status change event
+    API-->>UI: WebSocket update
+    UI-->>User: Display results
+```
+
+## Multi-Tenancy Model
+
+```mermaid
+graph LR
+    subgraph "Project A"
+        PA[Project 'team-alpha']
+        NSA[Namespace: team-alpha]
+        ASA1[AgenticSession-1]
+        ASA2[AgenticSession-2]
+        PSA[ProjectSettings]
+    end
+
+    subgraph "Project B"
+        PB[Project 'team-beta']
+        NSB[Namespace: team-beta]
+        ASB1[AgenticSession-1]
+        PSB[ProjectSettings]
+    end
+
+    PA -->|Maps to| NSA
+    PB -->|Maps to| NSB
+
+    NSA -->|Contains| ASA1
+    NSA -->|Contains| ASA2
+    NSA -->|Contains| PSA
+
+    NSB -->|Contains| ASB1
+    NSB -->|Contains| PSB
+
+    style PA fill:#e1f5ff
+    style PB fill:#ffe1e1
+    style NSA fill:#e1f5ff
+    style NSB fill:#ffe1e1
+```
+
+**Isolation Guarantees:**
+- Each project maps to a dedicated Kubernetes namespace (1:1 mapping)
+- User tokens enforce RBAC at namespace boundaries
+- Resources cannot cross namespace boundaries
+- Backend validates project access before CR operations
+
+**Reference:** [Multi-Tenancy Architecture](./multi-tenancy-architecture.md)
+
+---
+
+## Key Architectural Decisions
+
+### 1. Kubernetes-Native Design
+
+**Why:** Leverage Kubernetes for orchestration, scheduling, resource management, and RBAC.
+
+**Benefits:**
+- Declarative resource model via Custom Resources
+- Built-in RBAC and multi-tenancy
+- Horizontal scalability
+- Self-healing and automatic cleanup via OwnerReferences
+
+**Reference:** [ADR-0001: Kubernetes-Native Architecture](../adr/0001-kubernetes-native-architecture.md)
+
+---
+
+### 2. User Token Authentication
+
+**Why:** Enforce per-user RBAC for all API operations instead of using elevated service account permissions.
+
+**Pattern:**
+- Frontend extracts user token from OAuth flow
+- Backend validates token and uses it for K8s API calls
+- Service account only for CR writes and token minting
+
+**Security Benefits:**
+- Audit trail per user
+- Least-privilege access
+- No privilege escalation risks
+
+**Reference:** [ADR-0002: User Token Authentication](../adr/0002-user-token-authentication.md)
+
+---
+
+### 3. Asynchronous Execution Model
+
+**Why:** Long-running AI tasks cannot block HTTP requests.
+
+**Pattern:**
+- **Synchronous:** User request → Backend creates CR → Return immediately
+- **Asynchronous:** Operator watches → Creates Job → Monitors → Updates status
+- **Feedback:** WebSocket or polling for status updates
+
+**Benefits:**
+- Responsive UI (no hanging requests)
+- Resilient to operator/pod restarts
+- Kubernetes handles scheduling and retries
+
+---
+
+### 4. Go Backend + Python Runner
+
+**Why:** Use the best tool for each layer.
+
+**Rationale:**
+- **Go for Backend/Operator:** Performance, K8s client libraries, concurrency
+- **Python for Runner:** Claude SDK, rich AI/ML ecosystem, rapid development
+
+**Reference:** [ADR-0004: Go Backend + Python Runner](../adr/0004-go-backend-python-runner.md)
+
+---
+
+## Component Communication Matrix
+
+| Source | Target | Protocol | Auth | Purpose |
+|--------|--------|----------|------|---------|
+| Frontend | Backend API | HTTPS (REST) | OAuth Token | CRUD operations |
+| Frontend | Backend API | WebSocket | OAuth Token | Real-time updates |
+| Backend API | Kubernetes API | K8s Dynamic Client | User Token | CR operations |
+| Operator | Kubernetes API | K8s Dynamic Client | Service Account | Watch CRs, manage Jobs |
+| Runner Pod | Kubernetes API | HTTPS (REST) | Pod SA + Minted Token | Update CR status |
+| Operator | Runner Job | - | OwnerReference | Lifecycle management |
+
+---
+
+## Scalability Considerations
+
+### Horizontal Scaling
+
+**Frontend:**
+- Stateless NextJS instances
+- Scale with Kubernetes Deployment replicas
+- Load balancing via Ingress/Route
+
+**Backend API:**
+- Stateless Go instances
+- Scale with Kubernetes Deployment replicas
+- WebSocket sessions require session affinity (sticky sessions)
+
+**Operator:**
+- Single-replica controller (leader election for HA)
+- Watch multiple namespaces concurrently
+- Goroutine per Job for monitoring
+
+**Runner Pods:**
+- One Pod per AgenticSession (isolation)
+- Kubernetes handles scheduling across nodes
+- Resource limits prevent resource exhaustion
+
+### Resource Limits
+
+```yaml
+# Example resource configuration
+resources:
+  requests:
+    memory: "512Mi"
+    cpu: "250m"
+  limits:
+    memory: "2Gi"
+    cpu: "1000m"
+```
+
+**Reference:** [Production Considerations](../../CLAUDE.md#production-considerations)
+
+---
+
+## Related Documentation
+
+- [Agentic Session Lifecycle](./agentic-session-lifecycle.md) - State machine and reconciliation flow
+- [Multi-Tenancy Architecture](./multi-tenancy-architecture.md) - Project isolation and RBAC
+- [Kubernetes Resources](./kubernetes-resources.md) - CRD structures and schemas
+- [Backend Development Standards](../../CLAUDE.md#backend-and-operator-development-standards)
+- [Frontend Development Standards](../../components/frontend/DESIGN_GUIDELINES.md)
diff --git a/docs/architecture/index.md b/docs/architecture/index.md
new file mode 100644
index 000000000..2a1da12b5
--- /dev/null
+++ b/docs/architecture/index.md
@@ -0,0 +1,344 @@
+# Architecture Overview
+
+This section documents the architecture of the **Ambient Code Platform**: its components, how they interact, and the design patterns behind them, using Mermaid diagrams alongside written explanations.
+
+## Purpose
+
+This architecture documentation helps you:
+
+- **Understand** the platform's component interactions and data flows
+- **Navigate** complex distributed systems with clear visual aids
+- **Make informed decisions** when extending or modifying the platform
+- **Onboard quickly** with structured visual learning
+
+## Navigation Guide
+
+### Core Architecture
+
+Start here to understand the foundational platform architecture:
+
+| Document | Description | Key Diagrams |
+|----------|-------------|--------------|
+| **[Core System Architecture](./core-system-architecture.md)** | 4-component system overview, data flows, and component responsibilities | System architecture, sequence diagrams, multi-tenancy model |
+| **[Agentic Session Lifecycle](./agentic-session-lifecycle.md)** | Session state machine, operator reconciliation, and execution patterns | State diagram, reconciliation flowchart, monitoring loop |
+| **[Multi-Tenancy Architecture](./multi-tenancy-architecture.md)** | Project isolation, RBAC enforcement, and security boundaries | Namespace mapping, authentication flow, permission matrix |
+| **[Kubernetes Resources](./kubernetes-resources.md)** | Custom Resource Definitions (CRDs), schemas, and resource relationships | CR hierarchy, class diagrams, cleanup strategies |
+
+---
+
+## Quick Start by Role
+
+### For Developers
+
+**Start here if you're:**
+- Adding new features to the backend or frontend
+- Debugging session execution issues
+- Understanding component interactions
+
+**Recommended Reading Order:**
+1. [Core System Architecture](./core-system-architecture.md) - Get the big picture
+2. [Agentic Session Lifecycle](./agentic-session-lifecycle.md) - Understand execution flow
+3. [Kubernetes Resources](./kubernetes-resources.md) - Learn CR structures
+
+---
+
+### For Platform Engineers
+
+**Start here if you're:**
+- Deploying the platform to production
+- Setting up multi-tenancy and RBAC
+- Troubleshooting operator issues
+
+**Recommended Reading Order:**
+1. [Core System Architecture](./core-system-architecture.md) - Component overview
+2. [Multi-Tenancy Architecture](./multi-tenancy-architecture.md) - Isolation and security
+3. [Agentic Session Lifecycle](./agentic-session-lifecycle.md) - Operator patterns
+
+---
+
+### For Architects
+
+**Start here if you're:**
+- Evaluating the platform for adoption
+- Planning integrations or extensions
+- Understanding architectural decisions
+
+**Recommended Reading Order:**
+1. [Core System Architecture](./core-system-architecture.md) - Full system design
+2. Review [Architecture Decision Records](../adr/) - Understand the "why" behind decisions
+3. [Multi-Tenancy Architecture](./multi-tenancy-architecture.md) - Security model
+4. [Kubernetes Resources](./kubernetes-resources.md) - Resource model
+
+---
+
+## Architectural Principles
+
+The Ambient Code Platform is built on these core principles:
+
+### 1. Kubernetes-Native Design
+
+**Why:** Leverage Kubernetes for orchestration, scheduling, and resource management.
+
+**How:**
+- Custom Resource Definitions (CRDs) for declarative state
+- Operator pattern for reconciliation
+- Built-in RBAC for multi-tenancy
+- OwnerReferences for automatic cleanup
+
+**Reference:** [ADR-0001: Kubernetes-Native Architecture](../adr/0001-kubernetes-native-architecture.md)
+
+---
+
+### 2. User Token Authentication
+
+**Why:** Enforce per-user RBAC instead of using elevated service account permissions.
+
+**How:**
+- Frontend extracts OAuth token
+- Backend validates and uses token for K8s API calls
+- Service account only for specific elevated operations (CR writes, token minting)
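+
+Because the backend calls the Kubernetes API with the user's token, standard RBAC objects determine what each user can see. The following is a minimal sketch under stated assumptions: the Role name, the user `alice`, and the `team-alpha` namespace are hypothetical, and the verb list is read-only because the source leaves open which identity performs CR writes.
+
+```yaml
+# Hypothetical Role: read access to AgenticSessions in one project namespace.
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: agenticsession-viewer
+  namespace: team-alpha
+rules:
+  - apiGroups: ["vteam.ambient-code"]
+    resources: ["agenticsessions"]
+    verbs: ["get", "list", "watch"]
+---
+# Hypothetical binding: grants the Role to a single user.
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: alice-agenticsession-viewer
+  namespace: team-alpha
+subjects:
+  - kind: User
+    name: alice
+    apiGroup: rbac.authorization.k8s.io
+roleRef:
+  kind: Role
+  name: agenticsession-viewer
+  apiGroup: rbac.authorization.k8s.io
+```
+
+With a binding like this, a request made with `alice`'s token can list sessions in `team-alpha` but is rejected by the API server in any other namespace.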
+
+**Reference:** [ADR-0002: User Token Authentication](../adr/0002-user-token-authentication.md)
+
+---
+
+### 3. Asynchronous Execution
+
+**Why:** Long-running AI tasks cannot block HTTP requests.
+
+**How:**
+- Synchronous: User request → Backend creates CR → Return immediately
+- Asynchronous: Operator watches → Creates Job → Monitors → Updates status
+- Feedback: WebSocket or polling for status updates
+
+**Benefits:**
+- Responsive UI
+- Resilient to restarts
+- Kubernetes handles scheduling
+
+---
+
+### 4. Multi-Repository Support
+
+**Why:** Real-world automation often requires changes across multiple codebases.
+
+**How:**
+- Sessions can reference multiple Git repositories
+- `mainRepoIndex` specifies working directory
+- Per-repo status tracking (pushed, abandoned, PR URL)
+
+**Reference:** [ADR-0003: Multi-Repository Support](../adr/0003-multi-repo-support.md)
+
+---
+
+### 5. Polyglot Architecture
+
+**Why:** Use the best language for each layer.
+
+**How:**
+- **Go** for backend/operator: Performance, K8s libraries, concurrency
+- **Python** for runner: Claude SDK, AI/ML ecosystem, rapid development
+- **TypeScript/NextJS** for frontend: Modern web development, type safety
+
+**Reference:** [ADR-0004: Go Backend + Python Runner](../adr/0004-go-backend-python-runner.md)
+
+---
+
+## System Components
+
+### Frontend (NextJS + Shadcn UI)
+
+**Purpose:** Web UI for session management and monitoring
+
+**Technology:**
+- NextJS 14+ with App Router
+- Shadcn UI component library
+- React Query for data fetching
+- TypeScript for type safety
+
+**Reference:** [Frontend Development Standards](../../components/frontend/DESIGN_GUIDELINES.md)
+
+---
+
+### Backend API (Go + Gin)
+
+**Purpose:** REST API for CRUD operations on Custom Resources
+
+**Technology:**
+- Go 1.21+
+- Gin web framework
+- Kubernetes Dynamic Client
+- OpenShift OAuth integration
+
+**Key Endpoints:**
+- `/api/projects/:project/agentic-sessions` - Session management
+- `/api/projects/:project/project-settings` - Configuration
+- `/api/projects/:project/rfe-workflows` - RFE orchestration
+- `/ws` - WebSocket for real-time updates
+
+**Reference:** [Backend Development Standards](../../CLAUDE.md#backend-and-operator-development-standards)
+
+---
+
+### Agentic Operator (Go Controller)
+
+**Purpose:** Watch Custom Resources and reconcile state
+
+**Technology:**
+- Go 1.21+
+- Kubernetes controller-runtime patterns
+- Watch/reconciliation loop
+
+**Responsibilities:**
+- Watch AgenticSession, ProjectSettings, RFEWorkflow CRs
+- Create and manage Kubernetes Jobs
+- Monitor Job completion and update CR status
+- Handle timeouts and cleanup
+
+**Reference:** [Operator Development Standards](../../CLAUDE.md#operator-patterns)
+
+---
+
+### Claude Code Runner (Python)
+
+**Purpose:** Execute Claude Code CLI in a containerized environment
+
+**Technology:**
+- Python 3.11+
+- Claude Code SDK (≥0.0.23)
+- Anthropic Python SDK (≥0.68.0)
+- Git integration
+
+**Responsibilities:**
+- Execute AI-powered automation tasks
+- Manage workspace synchronization
+- Capture results and update CR status
+- Handle interactive and batch modes
+
+**Reference:** [Runner Documentation](../../components/runners/claude-code-runner/README.md)
+
+---
+
+## Data Flow Summary
+
+```mermaid
+graph LR
+    User[User] -->|HTTPS| FE[Frontend]
+    FE -->|REST API| BE[Backend API]
+    BE -->|K8s Dynamic Client| K8s[Kubernetes API]
+
+    K8s -->|CR Created| OP[Operator]
+    OP -->|Creates Job| JOB[Job]
+    JOB -->|Spawns Pod| POD[Runner Pod]
+
+    POD -->|Updates Status| K8s
+    K8s -->|Status Change| BE
+    BE -->|WebSocket| FE
+    FE -->|Display| User
+
+    style User fill:#e1f5ff
+    style FE fill:#fff4e1
+    style BE fill:#ffe1e1
+    style K8s fill:#f0e1ff
+    style OP fill:#e1ffe1
+    style POD fill:#ffe1e1
+```
+
+**High-Level Flow:**
+
+1. **User** interacts with **Frontend** UI
+2. **Frontend** sends API request to **Backend**
+3. **Backend** creates Custom Resource via **Kubernetes API** (using user token)
+4. **Operator** detects CR and creates **Job**
+5. **Job** spawns **Runner Pod** to execute task
+6. **Runner** updates CR status with results
+7. **Backend** observes the status change and sends a WebSocket update to **Frontend**
+8. **Frontend** displays results to **User**
+
+**Reference:** [Core System Architecture - Data Flow](./core-system-architecture.md#data-flow-agentic-session-execution)
+
+---
+
+## Architecture Decision Records (ADRs)
+
+ADRs document **why** architectural decisions were made, not just **what** was implemented.
+
+| ADR | Title | Date | Status |
+|-----|-------|------|--------|
+| [0001](../adr/0001-kubernetes-native-architecture.md) | Kubernetes-Native Architecture | 2024-11 | Accepted |
+| [0002](../adr/0002-user-token-authentication.md) | User Token Authentication for API Operations | 2024-11 | Accepted |
+| [0003](../adr/0003-multi-repo-support.md) | Multi-Repository Support in AgenticSessions | 2024-11 | Accepted |
+| [0004](../adr/0004-go-backend-python-runner.md) | Go Backend + Python Runner Technology Stack | 2024-11 | Accepted |
+| [0005](../adr/0005-nextjs-shadcn-react-query.md) | NextJS + Shadcn + React Query Frontend Stack | 2024-11 | Accepted |
+
+**See also:** [Decision Log](../decisions.md) for chronological record of all major decisions.
+
+---
+
+## Design Documents
+
+Detailed design documents for specific features:
+
+| Document | Description |
+|----------|-------------|
+| [Declarative Session Reconciliation](../design/declarative-session-reconciliation.md) | Operator reconciliation patterns |
+| [Session Initialization Flows](../design/session-initialization-flows.md) | Session creation and startup |
+| [Session Status Redesign](../design/session-status-redesign.md) | Status tracking and reporting |
+| [Runner-Operator Contracts](../design/runner-operator-contracts.md) | Communication between runner and operator |
+
+---
+
+## Related Context Files
+
+Loadable context files for specific development tasks:
+
+| Context File | Use When |
+|--------------|----------|
+| [Backend Development](../../.claude/context/backend-development.md) | Working on Go backend or operator |
+| [Frontend Development](../../.claude/context/frontend-development.md) | Working on NextJS frontend |
+| [Security Standards](../../.claude/context/security-standards.md) | Reviewing security practices |
+
+**Reference:** [Repomix Usage Guide](../../.claude/repomix-guide.md) for using architecture views.
+
+---
+
+## Code Pattern Catalog
+
+Common patterns used throughout the codebase:
+
+| Pattern File | Description |
+|--------------|-------------|
+| [Error Handling](../../.claude/patterns/error-handling.md) | Consistent error patterns (backend, operator, runner) |
+| [K8s Client Usage](../../.claude/patterns/k8s-client-usage.md) | When to use user token vs. service account |
+| [React Query Usage](../../.claude/patterns/react-query-usage.md) | Data fetching patterns (queries, mutations, caching) |
+
+---
+
+## Contributing to Architecture Docs
+
+When adding or updating architecture documentation:
+
+1. **Use Mermaid diagrams** for visualizations (compatible with MkDocs and GitHub)
+2. **Follow established patterns** (see existing architecture docs for examples)
+3. **Link to related documentation** (ADRs, design docs, code patterns)
+4. **Update this index** when adding new architecture pages
+5. **Test diagrams** at [mermaid.live](https://mermaid.live) before committing
+
+**Diagram Format Examples:**
+- System architecture → `graph TB` or `graph LR`
+- State transitions → `stateDiagram-v2`
+- Workflows → `sequenceDiagram`
+- Class structures → `classDiagram`
+- Flows → `flowchart`
+
+---
+
+## Questions or Feedback?
+
+For questions about the architecture:
+
+- **Technical questions:** See [Developer Guide](../developer/index.md)
+- **Architecture proposals:** Create an issue with the `architecture` label
+- **Corrections:** Submit a PR with proposed changes
+
+**Repository:** [https://github.com/ambient-code/platform](https://github.com/ambient-code/platform)
diff --git a/docs/architecture/kubernetes-resources.md b/docs/architecture/kubernetes-resources.md
new file mode 100644
index 000000000..c9f1f924f
--- /dev/null
+++ b/docs/architecture/kubernetes-resources.md
@@ -0,0 +1,1042 @@
+# Kubernetes Custom Resources
+
+## Overview
+
+The Ambient Code Platform uses Kubernetes Custom Resource Definitions (CRDs) to represent AI automation tasks and configuration. This document details the structure, lifecycle, and relationships of the three primary CRDs.
+
+## Custom Resource Hierarchy
+
+```mermaid
+graph TB
+    subgraph "Namespace: team-alpha"
+        PS[ProjectSettings<br/>settings<br/>API keys, defaults]
+
+        AS1[AgenticSession<br/>session-1<br/>Batch mode]
+        AS2[AgenticSession<br/>session-2<br/>Interactive mode]
+
+        RFE1[RFEWorkflow<br/>rfe-auth-feature<br/>7-step council]
+
+        Job1[Job<br/>session-1-runner]
+        Job2[Job<br/>session-2-runner]
+
+        Pod1[Pod<br/>session-1-runner-xyz]
+        Pod2[Pod<br/>session-2-runner-abc]
+
+        Secret1[Secret<br/>runner-token-session-1]
+        Secret2[Secret<br/>runner-token-session-2]
+
+        PVC1[PVC<br/>workspace-session-1]
+        PVC2[PVC<br/>workspace-session-2]
+    end
+
+    PS -.->|Referenced by| AS1
+    PS -.->|Referenced by| AS2
+    PS -.->|Referenced by| RFE1
+
+    AS1 -->|OwnerReference<br/>controller=true| Job1
+    AS1 -->|OwnerReference<br/>controller=true| Secret1
+
+    AS2 -->|OwnerReference<br/>controller=true| Job2
+    AS2 -->|OwnerReference<br/>controller=true| Secret2
+
+    Job1 -->|OwnerReference<br/>controller=true| Pod1
+    Job2 -->|OwnerReference<br/>controller=true| Pod2
+
+    Pod1 -.->|Mounts| PVC1
+    Pod2 -.->|Mounts| PVC2
+
+    style PS fill:#ffe1e1
+    style AS1 fill:#e1f5ff
+    style AS2 fill:#e1f5ff
+    style RFE1 fill:#fff4e1
+    style Job1 fill:#f0e1ff
+    style Job2 fill:#f0e1ff
+```
+
+**Legend:**
+- Solid arrows (→): OwnerReference (parent → child)
+- Dashed arrows (-.->): Reference or mount (not ownership)
+
+---
+
+## AgenticSession Custom Resource
+
+### Purpose
+
+Represents a single AI-powered automation task executed via Claude Code.
+
+### API Definition
+
+**Group:** `vteam.ambient-code`
+**Version:** `v1alpha1`
+**Kind:** `AgenticSession`
+**Plural:** `agenticsessions`
+**Shortname:** `as`
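+
+The values above map onto a CustomResourceDefinition roughly as follows. This is a sketch only: the namespaced scope, the status subresource, and the permissive schema are assumptions, and the CRD shipped with the platform may define a full schema instead.
+
+```yaml
+# Illustrative CRD skeleton built from the API definition above.
+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  # CRD names must take the form <plural>.<group>.
+  name: agenticsessions.vteam.ambient-code
+spec:
+  group: vteam.ambient-code
+  scope: Namespaced            # assumption: sessions live in project namespaces
+  names:
+    kind: AgenticSession
+    plural: agenticsessions
+    singular: agenticsession
+    shortNames: ["as"]
+  versions:
+    - name: v1alpha1
+      served: true
+      storage: true
+      subresources:
+        status: {}             # assumption: status is updated separately from spec
+      schema:
+        openAPIV3Schema:
+          type: object
+          # Placeholder schema; a real CRD would list spec and status fields.
+          x-kubernetes-preserve-unknown-fields: true
+```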
+
+### Resource Structure
+
+```mermaid
+classDiagram
+    class AgenticSession {
+        +metadata ObjectMeta
+        +spec AgenticSessionSpec
+        +status AgenticSessionStatus
+    }
+
+    class AgenticSessionSpec {
+        +prompt string
+        +repos []RepoConfig
+        +mainRepoIndex int
+        +interactive bool
+        +timeout int
+        +model string
+        +anthropicApiKeySecret string
+    }
+
+    class RepoConfig {
+        +input RepoInput
+        +output RepoOutput
+    }
+
+    class RepoInput {
+        +url string
+        +branch string
+        +authSecret string
+    }
+
+    class RepoOutput {
+        +forkRepo string
+        +targetBranch string
+        +createPR bool
+    }
+
+    class AgenticSessionStatus {
+        +phase string
+        +startTime string
+        +completionTime string
+        +results string
+        +message string
+        +repos []RepoStatus
+    }
+
+    class RepoStatus {
+        +index int
+        +pushed bool
+        +prUrl string
+        +error string
+    }
+
+    AgenticSession --> AgenticSessionSpec
+    AgenticSession --> AgenticSessionStatus
+    AgenticSessionSpec --> RepoConfig
+    RepoConfig --> RepoInput
+    RepoConfig --> RepoOutput
+    AgenticSessionStatus --> RepoStatus
+```
+
+### Spec Fields
+
+#### `spec.prompt` (required)
+
+**Type:** `string`
+
+**Description:** The instruction or task for Claude Code to execute.
+
+**Examples:**
+```yaml
+prompt: "Add unit tests for the authentication module"
+```
+
+```yaml
+prompt: "Refactor the database connection logic to use connection pooling"
+```
+
+---
+
+#### `spec.repos` (required)
+
+**Type:** `[]RepoConfig`
+
+**Description:** Array of Git repositories to operate on. At least one repository is required.
+
+**Structure:**
+
+```yaml
+repos:
+  - input:
+      url: "https://github.com/org/backend"
+      branch: "main"
+      authSecret: "git-credentials"  # optional
+    output:
+      forkRepo: "https://github.com/user/backend"  # optional
+      targetBranch: "feature/auth-refactor"         # optional
+      createPR: true                                # optional
+```
+
+**Fields:**
+
+- **`input.url`** (required): Git repository URL (HTTPS or SSH)
+- **`input.branch`** (required): Branch to clone and work on
+- **`input.authSecret`** (optional): Secret name containing Git credentials
+- **`output.forkRepo`** (optional): Fork repository URL for pushing changes
+- **`output.targetBranch`** (optional): Target branch for PR creation
+- **`output.createPR`** (optional): Whether to create PR after pushing
+
+**Reference:** [ADR-0003: Multi-Repository Support](../adr/0003-multi-repo-support.md)
+
+---
+
+#### `spec.mainRepoIndex` (optional)
+
+**Type:** `int`
+
+**Description:** Index of the repository to use as Claude Code's working directory.
+
+**Default:** `0` (first repository)
+
+**Example:**
+
+```yaml
+repos:
+  - input:
+      url: "https://github.com/org/shared-lib"
+      branch: "main"
+  - input:
+      url: "https://github.com/org/api-service"
+      branch: "develop"
+mainRepoIndex: 1  # Work in api-service repo
+```
+
+---
+
+#### `spec.interactive` (optional)
+
+**Type:** `bool`
+
+**Description:** Enable interactive mode for multi-turn conversations.
+
+**Default:** `false` (batch mode)
+
+**Interactive Mode:**
+- Pod continues running after initial execution
+- User sends messages via inbox file (`/workspace/inbox.txt`)
+- Runner responds via outbox file (`/workspace/outbox.txt`)
+- No timeout enforced
+
+**Example:**
+
+```yaml
+interactive: true
+prompt: "Help me debug the authentication flow"
+```
+
+---
+
+#### `spec.timeout` (optional)
+
+**Type:** `int`
+
+**Description:** Timeout in seconds for batch mode execution.
+
+**Default:** `ProjectSettings.spec.defaultTimeout` if set, otherwise `3600` (1 hour)
+
+**Note:** Ignored in interactive mode, where no timeout is enforced.
+
+**Example:**
+
+```yaml
+timeout: 7200  # 2 hours
+```
+
+---
+
+#### `spec.model` (optional)
+
+**Type:** `string`
+
+**Description:** Claude model to use for execution.
+
+**Default:** `ProjectSettings.spec.defaultModel` if set, otherwise `claude-sonnet-4-5`
+
+**Valid Values:**
+- `claude-opus-4-5`
+- `claude-sonnet-4-5`
+- `claude-haiku-4`
+
+**Example:**
+
+```yaml
+model: "claude-opus-4-5"  # Use most capable model
+```
+
+---
+
+#### `spec.anthropicApiKeySecret` (optional)
+
+**Type:** `string`
+
+**Description:** Secret name containing Anthropic API key.
+
+**Default:** The Secret named in `ProjectSettings.spec.anthropicApiKeySecret`
+
+**Secret Format:**
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: anthropic-api-key
+type: Opaque
+stringData:
+  ANTHROPIC_API_KEY: sk-ant-...
+```
+
+---
+
+### Status Fields
+
+#### `status.phase` (set by operator)
+
+**Type:** `string`
+
+**Description:** Current phase of session execution.
+
+**Valid Values:**
+- `Pending` - CR created, waiting for operator to create Job
+- `Running` - Job created, pod executing
+- `Completed` - Execution succeeded
+- `Failed` - Execution failed
+- `Timeout` - Execution exceeded timeout
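+
+As an illustration of a terminal phase, a batch session that ran past a 3600-second timeout might end up with a status like the one below. The field names come from this document; the timestamps are made up, and the exact values the operator records may differ.
+
+```yaml
+# Hypothetical status after a batch session exceeds its timeout.
+status:
+  phase: "Timeout"
+  startTime: "2025-12-08T14:30:00Z"
+  completionTime: "2025-12-08T15:30:00Z"   # startTime + 3600 seconds
+  message: "Exceeded timeout of 3600 seconds"
+```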
+
+**Reference:** [Agentic Session Lifecycle](./agentic-session-lifecycle.md)
+
+---
+
+#### `status.startTime` (set by operator)
+
+**Type:** `string` (RFC3339 timestamp)
+
+**Description:** When execution started (Job created).
+
+**Example:** `"2025-12-08T14:30:00Z"`
+
+---
+
+#### `status.completionTime` (set by operator/runner)
+
+**Type:** `string` (RFC3339 timestamp)
+
+**Description:** When execution completed (success, failure, or timeout).
+
+**Example:** `"2025-12-08T15:45:00Z"`
+
+---
+
+#### `status.results` (set by runner)
+
+**Type:** `string`
+
+**Description:** Execution results, logs, or output from Claude Code.
+
+**May contain:**
+- Generated code snippets
+- File paths modified
+- Test results
+- Error messages
+- Partial results (if timeout/failure)
+
+---
+
+#### `status.message` (set by operator/runner)
+
+**Type:** `string`
+
+**Description:** Human-readable status message (especially for errors).
+
+**Examples:**
+- `"Execution completed successfully"`
+- `"Failed to authenticate with Anthropic API"`
+- `"Exceeded timeout of 3600 seconds"`
+- `"Git repository not found"`
+
+---
+
+#### `status.repos` (set by runner)
+
+**Type:** `[]RepoStatus`
+
+**Description:** Per-repository status tracking.
+
+**Structure:**
+
+```yaml
+status:
+  repos:
+    - index: 0
+      pushed: true
+      prUrl: "https://github.com/org/backend/pull/123"
+    - index: 1
+      pushed: false
+      error: "No changes to push"
+```
+
+**Fields:**
+
+- **`index`**: Corresponds to `spec.repos[index]`
+- **`pushed`**: Whether changes were pushed to remote
+- **`prUrl`**: Pull request URL (if created)
+- **`error`**: Error message (if push/PR creation failed)
+
+---
+
+### Complete Example
+
+```yaml
+apiVersion: vteam.ambient-code/v1alpha1
+kind: AgenticSession
+metadata:
+  name: add-auth-tests
+  namespace: team-alpha
+  labels:
+    project: backend-api
+    type: testing
+spec:
+  prompt: |
+    Add comprehensive unit tests for the authentication module.
+    Ensure coverage of:
+    - Login/logout flows
+    - Token validation
+    - Password reset
+    - Edge cases (expired tokens, invalid credentials)
+
+  repos:
+    - input:
+        url: "https://github.com/org/backend-api"
+        branch: "develop"
+        authSecret: "github-pat"
+      output:
+        forkRepo: "https://github.com/user/backend-api"
+        targetBranch: "feature/auth-tests"
+        createPR: true
+
+  mainRepoIndex: 0
+  interactive: false
+  timeout: 3600
+  model: "claude-sonnet-4-5"
+  anthropicApiKeySecret: "anthropic-api-key"
+
+status:
+  phase: "Completed"
+  startTime: "2025-12-08T14:30:00Z"
+  completionTime: "2025-12-08T14:52:30Z"
+  results: |
+    Successfully added unit tests:
+    - tests/auth/test_login.py (12 tests)
+    - tests/auth/test_token_validation.py (8 tests)
+    - tests/auth/test_password_reset.py (6 tests)
+
+    Coverage increased from 68% to 89% for auth module.
+
+  message: "Execution completed successfully"
+
+  repos:
+    - index: 0
+      pushed: true
+      prUrl: "https://github.com/org/backend-api/pull/456"
+```
+
+---
+
+## ProjectSettings Custom Resource
+
+### Purpose
+
+Stores project-wide configuration such as default models, API keys, and timeout settings.
+
+### API Definition
+
+**Group:** `vteam.ambient-code`
+**Version:** `v1alpha1`
+**Kind:** `ProjectSettings`
+**Plural:** `projectsettings`
+**Shortname:** `ps`
+
+### Resource Structure
+
+```mermaid
+classDiagram
+    class ProjectSettings {
+        +metadata ObjectMeta
+        +spec ProjectSettingsSpec
+    }
+
+    class ProjectSettingsSpec {
+        +defaultModel string
+        +defaultTimeout int
+        +anthropicApiKeySecret string
+        +gitCredentialsSecret string
+        +enableAutoCleanup bool
+        +retentionDays int
+    }
+
+    ProjectSettings --> ProjectSettingsSpec
+```
+
+### Spec Fields
+
+#### `spec.defaultModel` (optional)
+
+**Type:** `string`
+
+**Description:** Default Claude model for sessions that do not set an explicit `model` field.
+
+**Default:** `claude-sonnet-4-5`
+
+**Example:**
+
+```yaml
+defaultModel: "claude-opus-4-5"  # Use most capable model by default
+```
+
+---
+
+#### `spec.defaultTimeout` (optional)
+
+**Type:** `int`
+
+**Description:** Default timeout (seconds) for batch mode sessions.
+
+**Default:** `3600` (1 hour)
+
+**Example:**
+
+```yaml
+defaultTimeout: 7200  # 2 hours for complex tasks
+```
+
+---
+
+#### `spec.anthropicApiKeySecret` (optional)
+
+**Type:** `string`
+
+**Description:** Default Secret name for Anthropic API key.
+
+**Sessions that do not set `anthropicApiKeySecret` use this default.**
+
+**Example:**
+
+```yaml
+anthropicApiKeySecret: "anthropic-api-key"
+```
+
+---
+
+#### `spec.gitCredentialsSecret` (optional)
+
+**Type:** `string`
+
+**Description:** Default Secret name for Git authentication.
+
+**Repo configs that do not set `input.authSecret` use this default.**
+
+**Example:**
+
+```yaml
+gitCredentialsSecret: "github-pat"
+```
+
+---
+
+#### `spec.enableAutoCleanup` (optional)
+
+**Type:** `bool`
+
+**Description:** Enable automatic cleanup of completed sessions.
+
+**Default:** `false`
+
+**Example:**
+
+```yaml
+enableAutoCleanup: true
+retentionDays: 7  # Delete completed sessions after 7 days
+```
+
+---
+
+#### `spec.retentionDays` (optional)
+
+**Type:** `int`
+
+**Description:** Days to retain completed sessions before auto-cleanup.
+
+**Default:** `7`
+
+**Only applies if `enableAutoCleanup: true`**
+
+---
+
+### Complete Example
+
+```yaml
+apiVersion: vteam.ambient-code/v1alpha1
+kind: ProjectSettings
+metadata:
+  name: settings
+  namespace: team-alpha
+spec:
+  defaultModel: "claude-sonnet-4-5"
+  defaultTimeout: 5400  # 90 minutes
+  anthropicApiKeySecret: "anthropic-api-key"
+  gitCredentialsSecret: "github-pat"
+  enableAutoCleanup: true
+  retentionDays: 14
+```
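+
+To show how these defaults are meant to apply, here is a sketch of a session in the same namespace that omits the optional fields. The session name is hypothetical, and the inherited values in the comments follow from the field descriptions above rather than from observed behavior.
+
+```yaml
+# Hypothetical session relying on the ProjectSettings defaults above.
+apiVersion: vteam.ambient-code/v1alpha1
+kind: AgenticSession
+metadata:
+  name: inherit-defaults
+  namespace: team-alpha
+spec:
+  prompt: "Add unit tests for the authentication module"
+  repos:
+    - input:
+        url: "https://github.com/org/backend-api"
+        branch: "develop"
+        # authSecret omitted: expected to fall back to "github-pat"
+  # model omitted: expected to fall back to "claude-sonnet-4-5"
+  # timeout omitted: expected to fall back to 5400 seconds
+  # anthropicApiKeySecret omitted: expected to fall back to "anthropic-api-key"
+```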
+
+---
+
+## RFEWorkflow Custom Resource
+
+### Purpose
+
+Orchestrates a 7-step agent council process for Request For Enhancement (RFE) refinement.
+
+### API Definition
+
+**Group:** `vteam.ambient-code`
+**Version:** `v1alpha1`
+**Kind:** `RFEWorkflow`
+**Plural:** `rfeworkflows`
+**Shortname:** `rfe`
+
+### Resource Structure
+
+```mermaid
+classDiagram
+    class RFEWorkflow {
+        +metadata ObjectMeta
+        +spec RFEWorkflowSpec
+        +status RFEWorkflowStatus
+    }
+
+    class RFEWorkflowSpec {
+        +request string
+        +context string
+        +repos []RepoConfig
+        +stepTimeout int
+    }
+
+    class RFEWorkflowStatus {
+        +phase string
+        +currentStep int
+        +steps []StepStatus
+        +finalRFE string
+        +startTime string
+        +completionTime string
+    }
+
+    class StepStatus {
+        +stepNumber int
+        +agent string
+        +status string
+        +output string
+        +startTime string
+        +completionTime string
+    }
+
+    RFEWorkflow --> RFEWorkflowSpec
+    RFEWorkflow --> RFEWorkflowStatus
+    RFEWorkflowStatus --> StepStatus
+```
+
+### 7-Step Agent Council
+
+```mermaid
+flowchart LR
+    Request[User Request] --> Step1
+
+    Step1[Step 1:<br/>Product Manager<br/>Requirements clarification] --> Step2
+    Step2[Step 2:<br/>Solution Architect<br/>Technical design] --> Step3
+    Step3[Step 3:<br/>Staff Engineer<br/>Implementation plan] --> Step4
+    Step4[Step 4:<br/>Product Owner<br/>Acceptance criteria] --> Step5
+    Step5[Step 5:<br/>Team Lead<br/>Task breakdown] --> Step6
+    Step6[Step 6:<br/>Team Member<br/>Effort estimation] --> Step7
+    Step7[Step 7:<br/>Delivery Owner<br/>Risk assessment] --> Final
+
+    Final[Final RFE Document]
+
+    style Request fill:#e1f5ff
+    style Final fill:#e1ffe1
+    style Step1 fill:#ffe1e1
+    style Step2 fill:#fff4e1
+    style Step3 fill:#f0e1ff
+    style Step4 fill:#ffe1e1
+    style Step5 fill:#fff4e1
+    style Step6 fill:#f0e1ff
+    style Step7 fill:#ffe1e1
+```
+
+**Agent Roles:**
+
+1. **Product Manager:** Clarifies requirements, defines user stories
+2. **Solution Architect:** Designs technical architecture, identifies dependencies
+3. **Staff Engineer:** Creates implementation plan, reviews code patterns
+4. **Product Owner:** Defines acceptance criteria and success metrics
+5. **Team Lead:** Breaks down into tasks, assigns priorities
+6. **Team Member:** Estimates effort, identifies blockers
+7. **Delivery Owner:** Assesses risks, creates rollback plan
+
+---
+
+### Spec Fields
+
+#### `spec.request` (required)
+
+**Type:** `string`
+
+**Description:** Initial RFE request or feature description.
+
+**Example:**
+
+```yaml
+request: |
+  Add support for OAuth2 authentication in the API.
+  Users should be able to authenticate using Google, GitHub, and Microsoft accounts.
+```
+
+---
+
+#### `spec.context` (optional)
+
+**Type:** `string`
+
+**Description:** Additional context for the council (codebase state, constraints, preferences).
+
+**Example:**
+
+```yaml
+context: |
+  - Existing authentication uses JWT tokens
+  - Frontend is React-based
+  - Backend is Go + Gin framework
+  - Prefer minimal dependencies
+```
+
+---
+
+#### `spec.repos` (required)
+
+**Type:** `[]RepoConfig`
+
+**Description:** Repositories for council to analyze (same structure as AgenticSession).
+
+---
+
+#### `spec.stepTimeout` (optional)
+
+**Type:** `int`
+
+**Description:** Timeout (seconds) per step.
+
+**Default:** `1800` (30 minutes)
+
+---
+
+### Status Fields
+
+#### `status.phase` (set by operator)
+
+**Type:** `string`
+
+**Valid Values:**
+- `Pending` - Workflow created, not started
+- `Running` - Executing steps
+- `Completed` - All steps completed
+- `Failed` - One or more steps failed
+
+---
+
+#### `status.currentStep` (set by operator)
+
+**Type:** `int`
+
+**Description:** Currently executing step (1-7).
+
+---
+
+#### `status.steps` (set by operator/runner)
+
+**Type:** `[]StepStatus`
+
+**Description:** Status for each of the 7 steps.
+
+**Fields:**
+
+- **`stepNumber`**: 1-7
+- **`agent`**: Agent role (e.g., "Product Manager")
+- **`status`**: `Pending`, `Running`, `Completed`, `Failed`
+- **`output`**: Agent's output for this step
+- **`startTime`**: RFC3339 timestamp
+- **`completionTime`**: RFC3339 timestamp
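+
+A workflow that is partway through the council could look like the sketch below, where `currentStep` points at the entry whose `status` is `Running`. The values are illustrative, `output` is omitted for brevity, and whether later steps are pre-populated as `Pending` is an assumption left out here.
+
+```yaml
+# Hypothetical status of a workflow currently executing step 3.
+status:
+  phase: "Running"
+  currentStep: 3
+  steps:
+    - stepNumber: 1
+      agent: "Product Manager"
+      status: "Completed"
+      startTime: "2025-12-08T10:00:00Z"
+      completionTime: "2025-12-08T10:15:00Z"
+    - stepNumber: 2
+      agent: "Solution Architect"
+      status: "Completed"
+      startTime: "2025-12-08T10:15:00Z"
+      completionTime: "2025-12-08T10:35:00Z"
+    - stepNumber: 3
+      agent: "Staff Engineer"
+      status: "Running"
+      startTime: "2025-12-08T10:35:00Z"
+```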
+
+---
+
+#### `status.finalRFE` (set by runner)
+
+**Type:** `string`
+
+**Description:** Final synthesized RFE document combining all agent outputs.
+
+---
+
+### Complete Example
+
+```yaml
+apiVersion: vteam.ambient-code/v1alpha1
+kind: RFEWorkflow
+metadata:
+  name: oauth-authentication
+  namespace: team-alpha
+spec:
+  request: |
+    Add OAuth2 authentication to the API supporting Google, GitHub, and Microsoft.
+
+  context: |
+    - Current auth uses JWT tokens
+    - Backend: Go + Gin
+    - Frontend: React + NextJS
+
+  repos:
+    - input:
+        url: "https://github.com/org/backend-api"
+        branch: "develop"
+    - input:
+        url: "https://github.com/org/frontend"
+        branch: "develop"
+
+  stepTimeout: 1800
+
+status:
+  phase: "Completed"
+  currentStep: 7
+
+  steps:
+    - stepNumber: 1
+      agent: "Product Manager"
+      status: "Completed"
+      output: |
+        Requirements clarified:
+        - Support 3 OAuth providers
+        - Fallback to JWT for API clients
+        - User profile sync on first login
+      startTime: "2025-12-08T10:00:00Z"
+      completionTime: "2025-12-08T10:15:00Z"
+
+    - stepNumber: 2
+      agent: "Solution Architect"
+      status: "Completed"
+      output: |
+        Technical design:
+        - Use golang.org/x/oauth2 library
+        - Add OAuthProvider table (Postgres)
+        - Extend User model with provider_id field
+        - Create /auth/oauth/{provider} endpoints
+      startTime: "2025-12-08T10:15:00Z"
+      completionTime: "2025-12-08T10:35:00Z"
+
+    # ... (steps 3-7)
+
+  finalRFE: |
+    # RFE: OAuth2 Authentication
+
+    ## Overview
+    Add OAuth2 authentication supporting Google, GitHub, and Microsoft.
+
+    ## Requirements
+    - Support 3 OAuth providers
+    - Fallback to JWT for API clients
+    - User profile sync on first login
+
+    ## Technical Design
+    - Use golang.org/x/oauth2 library
+    - Add OAuthProvider table
+    - Extend User model
+    - Create /auth/oauth/{provider} endpoints
+
+    ## Implementation Plan
+    (Detailed steps from Staff Engineer)
+
+    ## Acceptance Criteria
+    (Criteria from Product Owner)
+
+    ## Task Breakdown
+    (Tasks from Team Lead)
+
+    ## Effort Estimation
+    (Estimates from Team Member)
+
+    ## Risk Assessment
+    (Risks and mitigation from Delivery Owner)
+
+  startTime: "2025-12-08T10:00:00Z"
+  completionTime: "2025-12-08T13:45:00Z"
+```
+
+---
+
+## OwnerReferences and Cleanup
+
+### OwnerReference Pattern
+
+**Purpose:** Automatic resource cleanup when parent is deleted.
+
+**Structure:**
+
+```yaml
+apiVersion: vteam.ambient-code/v1alpha1
+kind: AgenticSession
+metadata:
+  name: session-1
+  namespace: team-alpha
+---
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: session-1-runner
+  namespace: team-alpha
+  ownerReferences:
+    - apiVersion: vteam.ambient-code/v1alpha1
+      kind: AgenticSession
+      name: session-1
+      uid: a1b2c3d4-e5f6-7890-abcd-ef1234567890
+      controller: true
+      # blockOwnerDeletion: false (default, do not set to true)
+```
+
+**Key Fields:**
+
+- **`controller: true`**: Only ONE owner can be controller (primary parent)
+- **`blockOwnerDeletion`**: **Omit this field** (causes permission issues in multi-tenant)
+
+**Cleanup Behavior:**
+
+1. User deletes AgenticSession CR
+2. Kubernetes cascades delete to owned resources:
+   - Job (which cascades to Pod)
+   - Secret (runner token)
+   - PVC (workspace, if configured)
+
+**Reference:** [Backend/Operator Standards - OwnerReferences](../../CLAUDE.md#ownerreferences-pattern)
+
+---
+
+### Cleanup Strategies
+
+#### Automatic (OwnerReferences)
+
+**When:** Parent CR deleted
+
+**How:** Kubernetes garbage collector cascades delete
+
+**Pros:**
+- No manual cleanup required
+- Consistent behavior
+- Works even if operator is down
+
+**Cons:**
+- Deletion order not controllable
+- All child resources deleted (no selective retention)
+
+---
+
+#### Manual (Operator Cleanup)
+
+**When:** Session completes successfully
+
+**How:** Operator explicitly deletes the Job; with background propagation, the Kubernetes garbage collector then removes the Job's Pods
+
+**Pattern:**
+
+```go
+func cleanupCompletedSession(namespace, jobName string) {
+    policy := v1.DeletePropagationBackground
+
+    err := K8sClient.BatchV1().Jobs(namespace).Delete(
+        context.Background(), jobName, v1.DeleteOptions{
+            PropagationPolicy: &policy,
+        })
+
+    if err != nil && !errors.IsNotFound(err) {
+        log.Printf("Failed to delete job: %v", err)
+    }
+}
+```
+
+**Pros:**
+- Immediate cleanup on completion
+- Selective retention (e.g., keep PVC, delete Job)
+
+**Cons:**
+- Requires operator to be running
+- More complex logic
+
+---
+
+#### Time-Based (TTL)
+
+**When:** ProjectSettings enables `enableAutoCleanup`
+
+**How:** Operator periodically deletes old completed CRs
+
+**Pattern:**
+
+```go
+func cleanupOldSessions(namespace string, retentionDays int) {
+    cutoff := time.Now().AddDate(0, 0, -retentionDays)
+
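+    list, err := DynamicClient.Resource(gvr).Namespace(namespace).List(
+        context.Background(), v1.ListOptions{})
+    if err != nil {
+        // Without this check, a failed List leaves list nil and the loop below panics
+        log.Printf("Failed to list sessions in %s: %v", namespace, err)
+        return
+    }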
+
+    for _, item := range list.Items {
+        phase, _, _ := unstructured.NestedString(item.Object, "status", "phase")
+        if phase != "Completed" && phase != "Failed" {
+            continue  // Only cleanup terminal states
+        }
+
+        completionTime, _, _ := unstructured.NestedString(item.Object, "status", "completionTime")
+        if completionTime == "" {
+            continue
+        }
+
+        t, err := time.Parse(time.RFC3339, completionTime)
+        if err != nil || t.After(cutoff) {
+            continue  // Too recent or invalid timestamp
+        }
+
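+        // Delete old completed session; only log success if the delete succeeded
+        if err := DynamicClient.Resource(gvr).Namespace(namespace).Delete(
+            context.Background(), item.GetName(), v1.DeleteOptions{}); err != nil {
+            log.Printf("Failed to delete session %s: %v", item.GetName(), err)
+            continue
+        }
+
+        log.Printf("Deleted old session %s (completed %s)", item.GetName(), completionTime)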
+    }
+}
+```
+
+**Pros:**
+- Automatic space management
+- Configurable retention period
+
+**Cons:**
+- Loses audit trail (consider archiving first)
+- Requires periodic operator execution
+
+---
+
+## Related Documentation
+
+- [Core System Architecture](./core-system-architecture.md) - Component overview
+- [Agentic Session Lifecycle](./agentic-session-lifecycle.md) - Session state machine
+- [Multi-Tenancy Architecture](./multi-tenancy-architecture.md) - Namespace isolation
+- [ADR-0001: Kubernetes-Native Architecture](../adr/0001-kubernetes-native-architecture.md)
+- [ADR-0003: Multi-Repository Support](../adr/0003-multi-repo-support.md)
diff --git a/docs/architecture/multi-tenancy-architecture.md b/docs/architecture/multi-tenancy-architecture.md
new file mode 100644
index 000000000..ebdf86415
--- /dev/null
+++ b/docs/architecture/multi-tenancy-architecture.md
@@ -0,0 +1,756 @@
+# Multi-Tenancy Architecture
+
+## Overview
+
+The Ambient Code Platform implements **namespace-based multi-tenancy** in which each project maps to a dedicated Kubernetes namespace. This isolates each tenant's namespaced resources and relies on Kubernetes RBAC for fine-grained access control; cluster-scoped resources and platform services remain shared.
+
+## Project-to-Namespace Mapping
+
+```mermaid
+graph TB
+    subgraph "Frontend Layer"
+        UI[User Interface<br/>Project Selection]
+    end
+
+    subgraph "Backend API Layer"
+        API[Backend API<br/>Project Context Validation]
+        MW[Middleware:<br/>ValidateProjectContext]
+    end
+
+    subgraph "Kubernetes Cluster"
+        subgraph "Project: team-alpha"
+            NSA[Namespace: team-alpha]
+            RBA[RoleBinding: team-alpha-users]
+            ASA1[AgenticSession: session-1]
+            ASA2[AgenticSession: session-2]
+            PSA[ProjectSettings: settings]
+            PVC_A[PVC: workspace-session-1]
+        end
+
+        subgraph "Project: team-beta"
+            NSB[Namespace: team-beta]
+            RBB[RoleBinding: team-beta-users]
+            ASB1[AgenticSession: session-1]
+            PSB[ProjectSettings: settings]
+            PVC_B[PVC: workspace-session-1]
+        end
+
+        subgraph "Project: team-gamma"
+            NSC[Namespace: team-gamma]
+            RBC[RoleBinding: team-gamma-users]
+            ASC1[AgenticSession: session-1]
+            PSC[ProjectSettings: settings]
+        end
+    end
+
+    UI -->|GET /api/projects| API
+    API -->|List namespaces<br/>user has access to| NSA
+    API -->|List namespaces<br/>user has access to| NSB
+    API -->|List namespaces<br/>user has access to| NSC
+
+    UI -->|POST /api/projects/team-alpha/agentic-sessions| MW
+    MW -->|Validate RBAC| RBA
+    MW -->|Create CR| ASA1
+
+    style NSA fill:#e1f5ff
+    style NSB fill:#ffe1e1
+    style NSC fill:#e1ffe1
+    style MW fill:#fff4e1
+    style RBA fill:#f0e1ff
+    style RBB fill:#f0e1ff
+    style RBC fill:#f0e1ff
+```
+
+**Key Principles:**
+
+1. **1:1 Mapping:** Each project corresponds to exactly one Kubernetes namespace
+2. **Namespace = Isolation Boundary:** Namespaced resources belong to exactly one project and are not visible from other namespaces
+3. **Project Name = Namespace Name:** Simplifies mapping and debugging
+4. **RBAC Enforced:** User must have permissions on namespace to access project
+
+---
+
+## User Authentication Flow
+
+```mermaid
+sequenceDiagram
+    actor User
+    participant Browser
+    participant OAuth as OAuth Proxy<br/>(OpenShift OAuth)
+    participant FE as Frontend
+    participant BE as Backend API
+    participant K8s as Kubernetes API
+
+    User->>Browser: Access platform URL
+    Browser->>OAuth: Request (no token)
+
+    Note over OAuth: User not authenticated
+
+    OAuth->>User: Redirect to OpenShift login
+    User->>OAuth: Provide credentials
+    OAuth->>OAuth: Validate credentials<br/>Generate OAuth token
+
+    OAuth->>Browser: Set token in cookie/header
+    Browser->>FE: Load frontend app<br/>(with token)
+
+    Note over FE: Token stored in memory/cookie
+
+    FE->>BE: API request<br/>Authorization: Bearer {token}
+
+    Note over BE: Extract token from header<br/>X-Forwarded-User from OAuth proxy
+
+    BE->>BE: Validate token format
+
+    BE->>K8s: Create K8s client<br/>with user token
+
+    K8s-->>BE: Client configured
+
+    BE->>K8s: Perform operation<br/>(e.g., List AgenticSessions)
+
+    Note over K8s: Kubernetes validates token<br/>Checks RBAC permissions
+
+    alt User has permissions
+        K8s-->>BE: Resources returned
+        BE-->>FE: 200 OK + data
+    else User lacks permissions
+        K8s-->>BE: 403 Forbidden
+        BE-->>FE: 403 Forbidden
+    end
+
+    FE-->>Browser: Display result
+    Browser-->>User: Show UI
+```
+
+**Authentication Components:**
+
+1. **OAuth Proxy:** Intercepts requests, enforces authentication, injects X-Forwarded-User header
+2. **Frontend:** Receives token, includes in all API requests
+3. **Backend:** Extracts token, creates K8s client with user credentials
+4. **Kubernetes API:** Validates token against ServiceAccount/User, enforces RBAC
+
+**Reference:** [ADR-0002: User Token Authentication](../adr/0002-user-token-authentication.md)
+
+---
+
+## RBAC Model
+
+### Role Hierarchy
+
+```mermaid
+graph TB
+    subgraph "Cluster Roles (Platform Admin)"
+        CA[ClusterRole:<br/>cluster-admin]
+        CVR[ClusterRole:<br/>vteam-view-all]
+    end
+
+    subgraph "Namespace Roles (Project Team)"
+        NA["Role:<br/>vteam-admin<br/>(CRUD all resources)"]
+        NE["Role:<br/>vteam-editor<br/>(CRUD sessions)"]
+        NV["Role:<br/>vteam-viewer<br/>(Read-only)"]
+    end
+
+    subgraph "Service Accounts"
+        SAB["ServiceAccount:<br/>backend<br/>(CR writes, token minting)"]
+        SAO["ServiceAccount:<br/>operator<br/>(Watch CRs, manage Jobs)"]
+        SAR["ServiceAccount:<br/>runner<br/>(Update CR status)"]
+    end
+
+    CA -->|Has all permissions| NA
+    CA -->|Has all permissions| NE
+    CA -->|Has all permissions| NV
+
+    CVR -->|Can read| NV
+
+    NA -->|Includes| NE
+    NE -->|Includes| NV
+
+    SAB -->|Bound to| NA
+    SAO -->|Bound to| NA
+    SAR -->|Bound to| NV
+
+    style CA fill:#ffe1e1
+    style CVR fill:#ffe1e1
+    style NA fill:#e1f5ff
+    style NE fill:#fff4e1
+    style NV fill:#e1ffe1
+    style SAB fill:#f0e1ff
+    style SAO fill:#f0e1ff
+    style SAR fill:#f0e1ff
+```
+
+### Permission Matrix
+
+| Resource | vteam-viewer | vteam-editor | vteam-admin | backend SA | operator SA |
+|----------|--------------|--------------|-------------|------------|-------------|
+| **AgenticSession** |
+| list | ✓ | ✓ | ✓ | ✓ | ✓ |
+| get | ✓ | ✓ | ✓ | ✓ | ✓ |
+| watch | - | - | - | - | ✓ |
+| create | - | ✓ | ✓ | ✓ | - |
+| update | - | ✓ | ✓ | ✓ | - |
+| update/status | - | - | - | ✓ | ✓ |
+| delete | - | ✓ | ✓ | ✓ | - |
+| **ProjectSettings** |
+| list | ✓ | ✓ | ✓ | ✓ | ✓ |
+| get | ✓ | ✓ | ✓ | ✓ | ✓ |
+| create | - | - | ✓ | ✓ | - |
+| update | - | - | ✓ | ✓ | - |
+| delete | - | - | ✓ | ✓ | - |
+| **RFEWorkflow** |
+| list | ✓ | ✓ | ✓ | ✓ | ✓ |
+| get | ✓ | ✓ | ✓ | ✓ | ✓ |
+| create | - | ✓ | ✓ | ✓ | - |
+| update | - | ✓ | ✓ | ✓ | - |
+| delete | - | ✓ | ✓ | ✓ | - |
+| **Jobs** |
+| list | ✓ | ✓ | ✓ | - | ✓ |
+| get | ✓ | ✓ | ✓ | - | ✓ |
+| create | - | - | - | - | ✓ |
+| delete | - | - | ✓ | - | ✓ |
+| **Secrets** |
+| list | - | - | ✓ | ✓ | ✓ |
+| get | - | - | ✓ | ✓ | ✓ |
+| create | - | - | - | ✓ | ✓ |
+| delete | - | - | ✓ | ✓ | ✓ |
+
+**Legend:**
+- ✓ = Permission granted
+- \- = Permission not granted (Kubernetes RBAC is additive; there are no explicit deny rules)
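+
+As a rough illustration of how one column of the matrix could translate into RBAC rules, the sketch below expresses the `vteam-editor` column as a namespaced Role. The resource plurals `projectsettings` and `rfeworkflows` and the `team-alpha` namespace are assumptions for illustration; the manifests actually shipped with the platform may differ.
+
+```yaml
+# Hypothetical Role mirroring the vteam-editor column above
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: vteam-editor
+  namespace: team-alpha
+rules:
+  - apiGroups: ["vteam.ambient-code"]
+    resources: ["agenticsessions", "rfeworkflows"]  # plural "rfeworkflows" is assumed
+    verbs: ["get", "list", "create", "update", "delete"]
+  - apiGroups: ["vteam.ambient-code"]
+    resources: ["projectsettings"]  # plural is assumed
+    verbs: ["get", "list"]
+  - apiGroups: ["batch"]
+    resources: ["jobs"]
+    verbs: ["get", "list"]
+```
+
+Because no rule names `secrets` or the `agenticsessions/status` subresource, both stay inaccessible to editors, which matches the `-` entries in the matrix.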
+
+---
+
+## Backend API Authorization Pattern
+
+### Middleware Chain
+
+```mermaid
+flowchart LR
+    Req[HTTP Request] --> Recovery[gin.Recovery]
+    Recovery --> Logger[gin.Logger<br/>Token redaction]
+    Logger --> CORS[CORS<br/>middleware]
+    CORS --> Identity[forwardedIdentityMiddleware<br/>Extract X-Forwarded-User]
+    Identity --> Validate[ValidateProjectContext<br/>RBAC check]
+    Validate --> Handler[Route Handler<br/>Business logic]
+
+    style Req fill:#e1f5ff
+    style Validate fill:#fff4e1
+    style Handler fill:#e1ffe1
+```
+
+### User Token Extraction
+
+**Backend Pattern** (`components/backend/handlers/helpers.go`):
+
+```go
+// GetK8sClientsForRequest creates K8s clients using user token from request
+func GetK8sClientsForRequest(c *gin.Context) (*kubernetes.Clientset, dynamic.Interface) {
+    // 1. Extract Authorization header
+    rawAuth := c.GetHeader("Authorization")
+    if rawAuth == "" {
+        log.Printf("Missing Authorization header")
+        return nil, nil
+    }
+
+    // 2. Parse Bearer token
+    parts := strings.SplitN(rawAuth, " ", 2)
+    if len(parts) != 2 || !strings.EqualFold(parts[0], "Bearer") {
+        log.Printf("Invalid Authorization header format")
+        return nil, nil
+    }
+
+    token := strings.TrimSpace(parts[1])
+    if token == "" {
+        log.Printf("Empty token")
+        return nil, nil
+    }
+
+    log.Printf("Creating K8s client with user token (len=%d)", len(token))
+
+    // 3. Create K8s client with user token
+    config := &rest.Config{
+        Host:        os.Getenv("KUBERNETES_SERVICE_HOST"),
+        BearerToken: token,
+        TLSClientConfig: rest.TLSClientConfig{
+            Insecure: false,
+            CAFile:   "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt",
+        },
+    }
+
+    k8sClient, err := kubernetes.NewForConfig(config)
+    if err != nil {
+        log.Printf("Failed to create K8s client: %v", err)
+        return nil, nil
+    }
+
+    dynClient, err := dynamic.NewForConfig(config)
+    if err != nil {
+        log.Printf("Failed to create dynamic client: %v", err)
+        return nil, nil
+    }
+
+    return k8sClient, dynClient
+}
+```
+
+### RBAC Validation Middleware
+
+**Pattern** (`components/backend/handlers/middleware.go`):
+
+```go
+func ValidateProjectContext() gin.HandlerFunc {
+    return func(c *gin.Context) {
+        projectName := c.Param("projectName")
+        if projectName == "" {
+            c.JSON(http.StatusBadRequest, gin.H{"error": "Missing project name"})
+            c.Abort()
+            return
+        }
+
+        // Get user-scoped K8s client
+        reqK8s, _ := GetK8sClientsForRequest(c)
+        if reqK8s == nil {
+            c.JSON(http.StatusUnauthorized, gin.H{"error": "Invalid or missing token"})
+            c.Abort()
+            return
+        }
+
+        // Check if user has access to namespace
+        ssar := &authv1.SelfSubjectAccessReview{
+            Spec: authv1.SelfSubjectAccessReviewSpec{
+                ResourceAttributes: &authv1.ResourceAttributes{
+                    Group:     "vteam.ambient-code",
+                    Resource:  "agenticsessions",
+                    Verb:      "list",
+                    Namespace: projectName,
+                },
+            },
+        }
+
+        res, err := reqK8s.AuthorizationV1().SelfSubjectAccessReviews().Create(
+            context.Background(), ssar, v1.CreateOptions{})
+
+        if err != nil || !res.Status.Allowed {
+            c.JSON(http.StatusForbidden, gin.H{
+                "error": fmt.Sprintf("No access to project %s", projectName),
+            })
+            c.Abort()
+            return
+        }
+
+        // Store project in context for handler
+        c.Set("project", projectName)
+        c.Next()
+    }
+}
+```
+
+### Handler Usage
+
+**Example** (`components/backend/handlers/sessions.go`):
+
+```go
+func ListSessions(c *gin.Context) {
+    project := c.GetString("project")  // From middleware
+
+    // Get user-scoped K8s clients
+    _, reqDyn := GetK8sClientsForRequest(c)
+    if reqDyn == nil {
+        c.JSON(http.StatusUnauthorized, gin.H{"error": "Invalid token"})
+        return
+    }
+
+    gvr := schema.GroupVersionResource{
+        Group:    "vteam.ambient-code",
+        Version:  "v1alpha1",
+        Resource: "agenticsessions",
+    }
+
+    // List sessions using user token (RBAC enforced by K8s)
+    list, err := reqDyn.Resource(gvr).Namespace(project).List(
+        context.Background(), v1.ListOptions{})
+
+    if err != nil {
+        log.Printf("Failed to list sessions in project %s: %v", project, err)
+        c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to list sessions"})
+        return
+    }
+
+    c.JSON(http.StatusOK, gin.H{"items": list.Items})
+}
+```
+
+**Key Security Patterns:**
+
+1. **Always use user token** for user-initiated operations
+2. **Never fall back** to service account if user token is invalid
+3. **Validate RBAC** before resource access
+4. **Log securely** - never log token values (use `len(token)`)
+5. **Return 401** for authentication failures, **403** for authorization failures
+
+**Reference:** [Backend Development Standards](../../CLAUDE.md#user-scoped-clients-for-api-operations)
+
+---
+
+## Service Account Usage
+
+### Backend Service Account
+
+**Purpose:** Limited elevated operations
+
+**Permissions:**
+- Create/update Custom Resources (after user validation)
+- Create Secrets for runner token minting
+- Read ProjectSettings for configuration
+
+**Usage Pattern:**
+
+```go
+// ONLY use backend service account for:
+// 1. Writing CRs after user token validation
+// 2. Minting runner tokens
+
+func CreateSession(c *gin.Context) {
+    project := c.GetString("project")
+
+    // Step 1: Validate user has permission using USER TOKEN
+    reqK8s, _ := GetK8sClientsForRequest(c)  // dynamic client unused here; Go rejects unused variables
+    if reqK8s == nil {
+        c.JSON(http.StatusUnauthorized, gin.H{"error": "Invalid token"})
+        return
+    }
+
+    // Validate user can create sessions
+    if !userCanCreateSessions(reqK8s, project) {
+        c.JSON(http.StatusForbidden, gin.H{"error": "No permission to create sessions"})
+        return
+    }
+
+    // Step 2: Create CR using BACKEND SERVICE ACCOUNT
+    // (user token may not have write permissions on status subresource)
+    obj := buildSessionObject(...)
+
+    created, err := DynamicClient.Resource(gvr).Namespace(project).Create(
+        context.Background(), obj, v1.CreateOptions{})
+
+    if err != nil {
+        log.Printf("Failed to create session: %v", err)
+        c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to create session"})
+        return
+    }
+
+    // Step 3: Mint token for runner using BACKEND SERVICE ACCOUNT
+    // The token value itself is not used in this excerpt, so it is discarded
+    if _, err := mintRunnerToken(project, created.GetName()); err != nil {
+        log.Printf("Failed to mint runner token: %v", err)
+        // Continue - operator can handle missing token
+    }
+
+    c.JSON(http.StatusCreated, gin.H{
+        "name": created.GetName(),
+        "uid":  created.GetUID(),
+    })
+}
+```
+
+**Never Use Backend Service Account For:**
+- ❌ List/Get operations on behalf of users
+- ❌ Delete operations initiated by users
+- ❌ Skipping RBAC validation
+- ❌ Accessing resources user doesn't have permission for
+
+---
+
+### Operator Service Account
+
+**Purpose:** Watch and reconcile Custom Resources
+
+**Permissions:**
+- Watch all Custom Resources (cluster-wide or namespace-scoped)
+- Create/delete Jobs
+- Create/delete Secrets
+- Update CR status subresource
+
+**Usage Pattern:**
+
+```go
+// Operator uses its service account for ALL operations
+func WatchAgenticSessions() {
+    gvr := types.GetAgenticSessionResource()
+
+    // Watch using operator's service account
+    watcher, err := config.DynamicClient.Resource(gvr).Watch(
+        context.Background(), v1.ListOptions{})
+
+    if err != nil {
+        log.Printf("Failed to create watcher: %v", err)
+        return
+    }
+
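+    for event := range watcher.ResultChan() {
+        // Checked assertion: watch.Error events carry a *metav1.Status, not a CR
+        obj, ok := event.Object.(*unstructured.Unstructured)
+        if !ok {
+            continue
+        }
+        handleAgenticSession(obj)
+    }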
+}
+```
+
+**Note:** Operator has **cluster-wide permissions** to watch and reconcile resources across all namespaces. This is acceptable because:
+1. Operator is trusted infrastructure component
+2. Operator only automates declarative state (no user input)
+3. Operator does not expose user-facing API
+
+---
+
+### Runner Service Account
+
+**Purpose:** Update CR status from pod
+
+**Permissions:**
+- Update `/status` subresource for parent AgenticSession
+- Read ConfigMaps/Secrets in namespace
+- Limited read access to other CRs (for RFE workflows)
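+
+One way to scope the status permission to a single parent session is a Role that uses `resourceNames`. This is a sketch only: the Role name and the `session-1` session are hypothetical, and the platform's real runner Role may be shaped differently.
+
+```yaml
+# Hypothetical per-session Role for the runner ServiceAccount
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: runner-session-1
+  namespace: team-alpha
+rules:
+  - apiGroups: ["vteam.ambient-code"]
+    resources: ["agenticsessions/status"]
+    resourceNames: ["session-1"]  # restricts the rule to the parent session
+    verbs: ["get", "update", "patch"]
+  - apiGroups: [""]
+    resources: ["configmaps", "secrets"]
+    verbs: ["get"]
+```
+
+A RoleBinding from this Role to the session's runner ServiceAccount would complete the setup.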
+
+**Token Minting:**
+
+Backend mints a time-limited token for runner:
+
+```go
+func mintRunnerToken(namespace, sessionName string) (string, error) {
+    // Create ServiceAccount for runner
+    sa := &corev1.ServiceAccount{
+        ObjectMeta: v1.ObjectMeta{
+            Name:      fmt.Sprintf("runner-%s", sessionName),
+            Namespace: namespace,
+        },
+    }
+
+    _, err := K8sClient.CoreV1().ServiceAccounts(namespace).Create(
+        context.Background(), sa, v1.CreateOptions{})
+
+    if err != nil && !errors.IsAlreadyExists(err) {
+        return "", err
+    }
+
+    // Create token for ServiceAccount
+    // (authnv1 = k8s.io/api/authentication/v1; TokenRequest is not in authorization/v1)
+    treq := &authnv1.TokenRequest{
+        Spec: authnv1.TokenRequestSpec{
+            ExpirationSeconds: int64Ptr(3600),  // 1 hour
+        },
+    }
+
+    token, err := K8sClient.CoreV1().ServiceAccounts(namespace).CreateToken(
+        context.Background(), sa.Name, treq, v1.CreateOptions{})
+
+    if err != nil {
+        return "", err
+    }
+
+    return token.Status.Token, nil
+}
+```
+
+**Usage in Runner:**
+
+```python
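+import os
+
+import requests
+
+# Runner reads minted token from environment
+token = os.environ.get("RUNNER_TOKEN")
+
+# Use token to update CR status
+# (k8s_api, namespace, name, and results are defined elsewhere in the runner)
+requests.patch(
+    f"{k8s_api}/apis/vteam.ambient-code/v1alpha1/namespaces/{namespace}/agenticsessions/{name}/status",
+    headers={
+        "Authorization": f"Bearer {token}",
+        # Kubernetes rejects PATCH bodies sent as plain application/json
+        "Content-Type": "application/merge-patch+json",
+    },
+    json={"status": {"results": results}},
+)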
+```
+
+---
+
+## Isolation Guarantees
+
+### Namespace Isolation
+
+**What's Isolated:**
+- ✓ Custom Resources (AgenticSession, ProjectSettings, RFEWorkflow)
+- ✓ Jobs and Pods
+- ✓ Secrets and ConfigMaps
+- ✓ PersistentVolumeClaims
+- ✓ NetworkPolicies (if configured)
+
+**What's Shared:**
+- Kubernetes cluster infrastructure (nodes, storage classes)
+- CRDs (cluster-scoped)
+- ClusterRoles and ClusterRoleBindings
+- Platform services (backend, operator)
+
+### RBAC Isolation
+
+**User A (team-alpha):**
+- ✓ Can list/create/delete sessions in `team-alpha` namespace
+- ❌ Cannot list sessions in `team-beta` namespace
+- ❌ Cannot modify ProjectSettings in `team-gamma` namespace
+
+**User B (team-beta):**
+- ✓ Can list sessions in `team-beta` namespace
+- ❌ Cannot access `team-alpha` resources
+- ❌ Cannot create sessions in `team-gamma` namespace
+
+**Enforcement:**
+- Backend validates user token + RBAC before operations
+- Kubernetes API enforces RBAC on every request
+- Operator uses namespace-scoped clients where possible
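+
+The per-user access described above comes down to which RoleBindings exist in each namespace. A minimal sketch, assuming a user named `user-a` (hypothetical) and the `vteam-editor` Role from the hierarchy:
+
+```yaml
+# Hypothetical binding: grants user-a the editor role in team-alpha only
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: team-alpha-users
+  namespace: team-alpha
+subjects:
+  - kind: User
+    name: user-a  # assumed user name
+    apiGroup: rbac.authorization.k8s.io
+roleRef:
+  kind: Role
+  name: vteam-editor
+  apiGroup: rbac.authorization.k8s.io
+```
+
+With no equivalent binding in `team-beta` or `team-gamma`, requests made with this user's token against those namespaces are rejected by the Kubernetes API.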
+
+### Resource Quotas (Optional)
+
+**Per-Namespace Limits:**
+
+```yaml
+apiVersion: v1
+kind: ResourceQuota
+metadata:
+  name: project-quota
+  namespace: team-alpha
+spec:
+  hard:
+    requests.cpu: "10"
+    requests.memory: "20Gi"
+    limits.cpu: "20"
+    limits.memory: "40Gi"
+    pods: "50"
+    persistentvolumeclaims: "10"
+```
+
+**Prevents:**
+- Resource exhaustion by single tenant
+- Noisy neighbor problems
+- Runaway session costs
+
+---
+
+## Security Boundaries
+
+```mermaid
+graph TB
+    subgraph "External"
+        User[User Browser]
+        Git[Git Repositories]
+    end
+
+    subgraph "Platform Boundary"
+        OAuth[OAuth Proxy<br/>Authentication]
+    end
+
+    subgraph "API Boundary"
+        BE[Backend API<br/>RBAC Validation]
+    end
+
+    subgraph "Kubernetes RBAC Boundary"
+        K8s[Kubernetes API<br/>Token + RBAC enforcement]
+    end
+
+    subgraph "Namespace: team-alpha"
+        NSA[Resources for team-alpha]
+        PodA[Runner Pod A]
+    end
+
+    subgraph "Namespace: team-beta"
+        NSB[Resources for team-beta]
+        PodB[Runner Pod B]
+    end
+
+    User -->|HTTPS| OAuth
+    OAuth -->|Token| BE
+    BE -->|User Token| K8s
+
+    K8s -->|RBAC allows| NSA
+    K8s -.->|RBAC denies| NSB
+
+    NSA -->|Contains| PodA
+    NSB -->|Contains| PodB
+
+    PodA -.->|Cannot access| NSB
+    PodB -.->|Cannot access| NSA
+
+    PodA -->|Can clone| Git
+    PodB -->|Can clone| Git
+
+    style OAuth fill:#ffe1e1
+    style BE fill:#fff4e1
+    style K8s fill:#f0e1ff
+    style NSA fill:#e1f5ff
+    style NSB fill:#ffe1e1
+```
+
+**Security Layers:**
+
+1. **OAuth Proxy:** Ensures user is authenticated
+2. **Backend API:** Validates user token + RBAC permissions
+3. **Kubernetes API:** Enforces RBAC on every resource access
+4. **Namespace Isolation:** Resources cannot cross boundaries
+5. **NetworkPolicies (optional):** Restrict pod-to-pod communication
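+
+For the optional fifth layer, a NetworkPolicy along these lines could limit ingress to traffic from pods in the same namespace. It is a sketch rather than a policy taken from the platform: the name is hypothetical, and it only takes effect if the cluster's network plugin enforces NetworkPolicies.
+
+```yaml
+# Hypothetical policy: pods in team-alpha accept ingress only from team-alpha pods
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: allow-same-namespace
+  namespace: team-alpha
+spec:
+  podSelector: {}          # applies to every pod in the namespace
+  policyTypes:
+    - Ingress
+  ingress:
+    - from:
+        - podSelector: {}  # any pod in this same namespace
+```
+
+Egress is left unrestricted here so that runner pods can still clone Git repositories.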
+
+---
+
+## Project Lifecycle
+
+### Project Creation
+
+```mermaid
+sequenceDiagram
+    actor Admin
+    participant UI as Frontend
+    participant API as Backend API
+    participant K8s as Kubernetes
+
+    Admin->>UI: Create new project "team-delta"
+    UI->>API: POST /api/projects<br/>{"name": "team-delta"}
+
+    API->>K8s: Create Namespace<br/>name: team-delta
+
+    K8s-->>API: Namespace created
+
+    API->>K8s: Create RoleBinding<br/>vteam-admin → admin user
+
+    API->>K8s: Create ProjectSettings CR<br/>(default configuration)
+
+    K8s-->>API: Resources created
+
+    API-->>UI: 201 Created
+    UI-->>Admin: Project ready
+```
+
+### Project Deletion
+
+```mermaid
+sequenceDiagram
+    actor Admin
+    participant UI as Frontend
+    participant API as Backend API
+    participant K8s as Kubernetes
+
+    Admin->>UI: Delete project "team-delta"
+    UI->>API: DELETE /api/projects/team-delta
+
+    API->>K8s: Delete Namespace<br/>team-delta
+
+    Note over K8s: Cascade delete ALL resources:<br/>- AgenticSessions<br/>- Jobs/Pods<br/>- Secrets<br/>- PVCs<br/>- ProjectSettings
+
+    K8s-->>API: Namespace deleted
+
+    API-->>UI: 204 No Content
+    UI-->>Admin: Project deleted
+```
+
+**Cleanup:**
+- Kubernetes automatically deletes all resources in namespace
+- No manual cleanup required
+- PVCs deleted (data loss - consider backups)
+
+---
+
+## Related Documentation
+
+- [Core System Architecture](./core-system-architecture.md) - Component overview
+- [Agentic Session Lifecycle](./agentic-session-lifecycle.md) - Session execution flow
+- [Backend Development Standards](../../CLAUDE.md#backend-and-operator-development-standards)
+- [ADR-0001: Kubernetes-Native Architecture](../adr/0001-kubernetes-native-architecture.md)
+- [ADR-0002: User Token Authentication](../adr/0002-user-token-authentication.md)
+- [Security Standards Context](../../.claude/context/security-standards.md)
diff --git a/docs/index.md b/docs/index.md
index 6a6a8db58..de9813c97 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -17,6 +17,8 @@ The platform follows a cloud-native microservices architecture:
 - Custom Resource Definitions (AgenticSession, ProjectSettings, RFEWorkflow)
 - Operator-based reconciliation for declarative session management
 
+📐 **[Architecture Diagrams](architecture/index.md)** - Visual guides to system design, component interactions, and data flows
+
 ## Quick Start
 
 ### Local Development
@@ -64,6 +66,13 @@ For production OpenShift clusters:
 
 ## Documentation Structure
 
+### [📐 Architecture](architecture/index.md)
+Visual guides and detailed explanations of the platform's design:
+- [Core System Architecture](architecture/core-system-architecture.md) - 4-component system overview
+- [Agentic Session Lifecycle](architecture/agentic-session-lifecycle.md) - State machine and reconciliation
+- [Multi-Tenancy Architecture](architecture/multi-tenancy-architecture.md) - Project isolation and RBAC
+- [Kubernetes Resources](architecture/kubernetes-resources.md) - CRD structures and schemas
+
 ### [📘 User Guide](user-guide/index.md)
 Learn how to use the Ambient Code Platform for AI-powered automation:
 - [Getting Started](user-guide/getting-started.md) - Installation and first session
diff --git a/mkdocs.yml b/mkdocs.yml
index 0d80bbeae..c64a3116f 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -41,6 +41,12 @@ theme:
 
 nav:
   - Home: index.md
+  - Architecture:
+    - Overview: architecture/index.md
+    - Core System Architecture: architecture/core-system-architecture.md
+    - Agentic Session Lifecycle: architecture/agentic-session-lifecycle.md
+    - Multi-Tenancy Architecture: architecture/multi-tenancy-architecture.md
+    - Kubernetes Resources: architecture/kubernetes-resources.md
   - User Guide:
     - Overview: user-guide/index.md
     - Getting Started: user-guide/getting-started.md