Added realtime-sync.md and auth.md docs

ALTM005 · ALTM005 · commit 2412e7a17c50 · 2026-01-17T00:33:09.000-08:00
diff --git a/docs/auth.md b/docs/auth.md
@@ -0,0 +1,70 @@
+# Authentication & Security
+
+This document outlines how CodenCollab secures user data and prevents unauthorized access. We rely on **Supabase** for identity management and standard **JWT (JSON Web Token)** verification for access control.
+
+## The Authentication Flow
+
+We treat the Frontend and Backend as two separate entities that do not trust each other. The **JWT** is the only proof of identity accepted by the backend.
+
+### 1. Client-Side Login (Frontend)
+* The user logs in via the React frontend using the Supabase Client SDK.
+* Supabase returns a session object containing an `access_token` (JWT) and a `refresh_token`.
+* The Supabase client persists this session in the browser automatically (in `localStorage` by default).
+
+### 2. Protecting API Routes (REST)
+When a user attempts to run code or fetch sensitive data, the frontend sends the access token in the `Authorization` header:
+
+```http
+POST /rooms/123/run HTTP/1.1
+Authorization: Bearer <SUPABASE_ACCESS_TOKEN>
+
+```
+
+* **Backend Check:** The FastAPI backend extracts this token and verifies its signature against our **Supabase JWT Secret**.
+* **Expiration:** If the token is expired, the request is immediately rejected with a `401 Unauthorized` error.
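+
+The backend check above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: it assumes the token is signed with HS256 using the shared JWT secret, `verify_supabase_jwt` and `TokenError` are hypothetical names, and a real service would more likely rely on a maintained JWT library.
+
+```python
+import base64
+import hashlib
+import hmac
+import json
+import time
+from typing import Optional
+
+
+class TokenError(Exception):
+    """Raised for malformed, forged, or expired tokens (maps to a 401)."""
+
+
+def _b64url_decode(segment: str) -> bytes:
+    # JWT segments omit base64 padding, so restore it before decoding.
+    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))
+
+
+def verify_supabase_jwt(token: str, secret: str, now: Optional[float] = None) -> dict:
+    """Return the claims of a valid HS256 token, or raise TokenError."""
+    try:
+        header_b64, payload_b64, signature_b64 = token.split(".")
+        header = json.loads(_b64url_decode(header_b64))
+        claims = json.loads(_b64url_decode(payload_b64))
+        signature = _b64url_decode(signature_b64)
+    except ValueError as exc:
+        raise TokenError("malformed token") from exc
+    if not isinstance(header, dict) or not isinstance(claims, dict):
+        raise TokenError("malformed token")
+
+    # Pin the algorithm so a token cannot choose its own verification method.
+    if header.get("alg") != "HS256":
+        raise TokenError("unexpected algorithm")
+
+    signing_input = f"{header_b64}.{payload_b64}".encode()
+    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
+    if not hmac.compare_digest(expected, signature):
+        raise TokenError("bad signature")
+
+    # Reject expired tokens; `now` is injectable to keep the check testable.
+    current = time.time() if now is None else now
+    exp = claims.get("exp")
+    if not isinstance(exp, (int, float)) or exp <= current:
+        raise TokenError("token expired")
+    return claims
+```
+
+A FastAPI dependency could call this function and translate `TokenError` into a `401 Unauthorized` response.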
+
+### 3. Protecting the WebSocket (Real-Time)
+
+Securing WebSockets is trickier because they are persistent connections.
+
+* **Handshake:** During the initial Socket.IO handshake, the client should pass the token in the `auth` object.
+* **Connection Guard:** The backend validates this token *before* allowing the socket to connect. If the token is invalid, the connection is rejected immediately, so the client never receives any real-time updates.
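+
+A connection guard along these lines might look like the sketch below. It assumes the backend uses python-socketio, whose connect handlers receive `(sid, environ, auth)` and reject the connection when they return `False`. The `token` key, `make_connect_guard`, and the `users_by_sid` store are hypothetical names chosen for illustration.
+
+```python
+from typing import Callable, Dict, Optional
+
+
+def make_connect_guard(verify: Callable[[str], dict], users_by_sid: Dict[str, str]):
+    """Build a connect handler; `verify` returns claims or raises on a bad token."""
+
+    def on_connect(sid: str, environ: dict, auth: Optional[dict]) -> bool:
+        token = auth.get("token") if isinstance(auth, dict) else None
+        if not isinstance(token, str):
+            return False  # No token in the handshake: refuse the connection.
+        try:
+            user_id = verify(token)["sub"]
+        except Exception:
+            return False  # Invalid, expired, or incomplete token: refuse.
+        users_by_sid[sid] = user_id  # Remember who owns this socket.
+        return True
+
+    return on_connect
+```
+
+Keeping the verifier as a parameter lets the REST routes and the socket handshake share one validation function.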
+
+---
+
+## Server-Side Access Control
+
+Authentication identifies *who* the user is. Authorization determines *what* they can do.
+
+### Room-Level Security
+
+Just because a user is logged in doesn't mean they can enter any room.
+
+* **Join Event:** When a socket emits `join { room_id }`, the backend checks whether that room is public or private. If it is private, the backend verifies that the user's ID is on the allowlist for that room.
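+
+The join check reduces to a small predicate. The sketch below is illustrative only: the `Room` fields and the `can_join` helper are hypothetical, and the real room records presumably live in the database rather than in memory.
+
+```python
+from dataclasses import dataclass, field
+from typing import Set
+
+
+@dataclass
+class Room:
+    room_id: str
+    is_private: bool = False
+    # User IDs allowed into a private room; ignored for public rooms.
+    allowed_user_ids: Set[str] = field(default_factory=set)
+
+
+def can_join(room: Room, user_id: str) -> bool:
+    """Public rooms admit any authenticated user; private rooms check the allowlist."""
+    if not room.is_private:
+        return True
+    return user_id in room.allowed_user_ids
+```
+
+The `user_id` passed in should come from the verified JWT claims, never from the `join` payload itself.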
+
+### Execution Security
+
+Running code is the most dangerous action in the app.
+
+* **Validation:** The `/run` endpoint is strictly protected. We verify the JWT on every single execution request.
+* **Sanitization:** Input is stripped of obvious malicious shell characters before being sent to the sandbox.
+* **Isolation:** Execution happens in **Piston**, a separate code-execution engine. Even if a user bypasses the backend checks, their code is confined to a Docker container with no network access to our database.
+
+---
+
+## Security Best Practices
+
+### 1. Secret Management
+
+* **Service Role Keys:** We strictly keep the `SUPABASE_SERVICE_ROLE_KEY` on the backend. This key has admin rights and is **never** exposed to the frontend/browser.
+* **Environment Variables:** All secrets are loaded via `.env` files and are never committed to Git.
+
+### 2. Network Security
+
+* **HTTPS/WSS:** In production, all traffic (REST and WebSockets) must be encrypted via TLS.
+* **CORS (Cross-Origin Resource Sharing):** We explicitly allow only our frontend origin (e.g., `https://codencollab-app.vercel.app`) to prevent malicious websites from triggering actions on behalf of a user.
+
+### 3. Token Lifecycle
+
+* Supabase access tokens are short-lived (1 hour by default).
+* The frontend handles token refreshing automatically. If the token expires during a coding session, the client silently refreshes it and reconnects the socket.
diff --git a/docs/realtime-sync.md b/docs/realtime-sync.md
@@ -0,0 +1,47 @@
+# Real-Time Synchronization Strategy
+
+This document explains the practical implementation details of CodenCollab's synchronization engine. It covers how we keep clients in sync, handle network "chattiness," and prevent infinite update loops.
+
+## Core Architecture: Room Scoping
+We use **Socket.IO Rooms** to isolate collaboration sessions.
+- **Strict Isolation:** All events (`code-update`, `cursor`, `execution-result`) are emitted to a specific `room_id`.
+- **Privacy:** This ensures that actions in "Room A" are never broadcast to "Room B," maintaining data privacy and reducing unnecessary server load.
+
+## The "Echo" Problem (Feedback Loops)
+The hardest part of real-time syncing is preventing an infinite loop where:
+1. User A types a character.
+2. Server sends it to User B.
+3. User B's editor applies the change programmatically.
+4. User B's editor detects a "change" and sends it *back* to User A.
+5. Repeat forever until the browser crashes.
+
+### Our Solution: The `isApplyingRemoteChange` Flag
+We use a React `useRef` boolean to distinguish between *human* edits and *socket* edits.
+
+1. **Incoming Event:** When a `code-update` event arrives, we set `isApplyingRemoteChange.current = true`.
+2. **Apply Edit:** We apply the delta to the Monaco Model.
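+3. **Reset:** Immediately after the edit is applied, we set the flag back to `false`. This is safe because Monaco fires `onDidChangeModelContent` synchronously while the edit is applied, so the listener has already run by the time the flag is reset.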
+4. **Event Listener:** Our local `onDidChangeModelContent` listener checks this flag first. If it is `true`, it knows the change came from the server, so it **ignores it** and does not emit a socket event.
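+
+The four steps above can be modelled outside the browser. The sketch below is a language-agnostic illustration written in Python, not the React/Monaco code itself: `Editor` is a hypothetical stand-in for a model that notifies listeners synchronously, and `SyncClient` plays the role of the component holding the flag.
+
+```python
+from typing import Callable, List
+
+
+class Editor:
+    """Tiny stand-in for an editor model that notifies listeners synchronously."""
+
+    def __init__(self) -> None:
+        self.text = ""
+        self._listeners: List[Callable[[str], None]] = []
+
+    def on_change(self, listener: Callable[[str], None]) -> None:
+        self._listeners.append(listener)
+
+    def set_text(self, text: str) -> None:
+        self.text = text
+        for listener in self._listeners:
+            listener(text)
+
+
+class SyncClient:
+    def __init__(self, editor: Editor, emit: Callable[[str, str], None]) -> None:
+        self.editor = editor
+        self.emit = emit
+        self.is_applying_remote_change = False
+        editor.on_change(self._handle_change)
+
+    def _handle_change(self, text: str) -> None:
+        if self.is_applying_remote_change:
+            return  # The change came from the server: do not echo it back.
+        self.emit("code-update", text)
+
+    def apply_remote(self, text: str) -> None:
+        self.is_applying_remote_change = True
+        try:
+            self.editor.set_text(text)
+        finally:
+            # Always reset, even if applying the edit raises.
+            self.is_applying_remote_change = False
+```
+
+Resetting the flag in a `finally` block is a defensive choice: if applying the edit throws, a flag stuck at `true` would silently swallow every later local edit.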
+
+## Cursor Optimization (Debouncing)
+Cursor movements are extremely high-frequency events. Sending a packet for every single movement would flood the network and degrade performance.
+
+### Strategy
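+- **Rate limiting:** We limit cursor events on the client to one per **100ms** window.
+- **Logic:** When the cursor moves, we wait 100ms and remember only the latest position. Further movement within that window resets the timer, but we still emit once the window has elapsed, so a position goes out as soon as the user stops moving or the interval passes, whichever comes first. Strictly speaking, this is a debounce capped by a maximum wait; with both set to 100ms it behaves like a trailing-edge throttle.
+- **Result:** This reduces cursor traffic by approximately 90% while still looking "live" to other users.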
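+
+The timing logic above can be expressed as a small state machine. The sketch below is a Python model under stated assumptions rather than the client code: `CursorRateLimiter` is a hypothetical name, and it takes explicit millisecond timestamps instead of real timers so the behavior is easy to reason about.
+
+```python
+from typing import Any, Optional
+
+
+class CursorRateLimiter:
+    """Debounce with a maximum wait, driven by explicit millisecond timestamps."""
+
+    def __init__(self, wait_ms: int = 100, max_wait_ms: int = 100) -> None:
+        self.wait_ms = wait_ms
+        self.max_wait_ms = max_wait_ms
+        self._pending: Optional[Any] = None
+        self._first_move_ms = 0
+        self._last_move_ms = 0
+
+    def move(self, now_ms: int, position: Any) -> None:
+        """Record a cursor movement; only the latest position is kept."""
+        if self._pending is None:
+            self._first_move_ms = now_ms  # Start of a new burst.
+        self._pending = position
+        self._last_move_ms = now_ms  # Each move restarts the debounce timer.
+
+    def poll(self, now_ms: int) -> Optional[Any]:
+        """Return the position to emit now, or None if nothing is due."""
+        if self._pending is None:
+            return None
+        idle = now_ms - self._last_move_ms >= self.wait_ms
+        overdue = now_ms - self._first_move_ms >= self.max_wait_ms
+        if not (idle or overdue):
+            return None
+        position, self._pending = self._pending, None
+        return position
+```
+
+With both parameters at 100ms the `overdue` condition always fires no later than `idle`, which is why the result resembles a throttle. Raising `max_wait_ms` above `wait_ms` would give a more conventional debounce.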
+
+## Handling Decorations (Visuals)
+To render remote cursors, we do not insert text into the document. Instead, we use Monaco's **Decorations API**.
+
+- **Implementation:** We maintain a `decorationsCollectionRef`.
+- **Update Cycle:** When a `cursor` event arrives, we calculate the new positions and overwrite the collection using `set()`.
+- **Memory Management:** Monaco handles the cleanup of old decorations automatically when we overwrite them, preventing memory leaks in long sessions.
+
+## Conflict Resolution & Limitations
+Currently, CodenCollab uses a **Server-Relay** (Last Write Wins) approach.
+- **Pros:** Extremely low latency and simple codebase. Perfect for pair programming (2-3 people).
+- **Cons:** Not mathematically conflict-free. If two users edit the same region before either has received the other's update, their documents can diverge and consistency is not guaranteed.
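+
+A short simulation shows how a plain relay can diverge. It assumes deltas are offset-based inserts that the server forwards without transforming them, which is a simplification of whatever delta format the editor actually sends; `apply_insert` and `simulate_concurrent_inserts` are hypothetical helpers.
+
+```python
+from typing import Tuple
+
+Insert = Tuple[int, str]  # (offset, text)
+
+
+def apply_insert(doc: str, offset: int, text: str) -> str:
+    """Insert `text` at `offset`, the way a relayed delta would be applied."""
+    return doc[:offset] + text + doc[offset:]
+
+
+def simulate_concurrent_inserts(base: str, edit_a: Insert, edit_b: Insert) -> Tuple[str, str]:
+    """Each client applies its own edit first, then the other's relayed edit."""
+    doc_a = apply_insert(apply_insert(base, *edit_a), *edit_b)
+    doc_b = apply_insert(apply_insert(base, *edit_b), *edit_a)
+    return doc_a, doc_b
+
+
+# User A inserts "X" at the start while User B inserts "Y" at the end of "abc".
+doc_a, doc_b = simulate_concurrent_inserts("abc", (0, "X"), (3, "Y"))
+```
+
+Here `doc_a` ends up as `"XabYc"` while `doc_b` is `"XabcY"`: B's offset was computed against the old document and is stale once A's insert has shifted the text. Operational transformation and CRDTs exist to correct exactly this.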
+
+### Future Migration Path
+If the project scales to support large classrooms or offline-first editing, we plan to migrate the state management to a **CRDT** (Conflict-free Replicated Data Type) library like **Yjs**. This would allow for decentralized, mathematical conflict resolution at the cost of higher complexity.