From 4d908a0f53229cc10195ea3cf690dcb245eb6c4b Mon Sep 17 00:00:00 2001
From: Devanshu Rajesh Chicholikar <chicholikar.d@northeastern.edu>
Date: Thu, 8 Jan 2026 17:27:31 -0500
Subject: [PATCH 1/4] chore: remove legacy code and internal docs

- Remove legacy/ folder (old unused code)
- Remove SETUP_COMPLETE.md (internal doc)
- Remove docs/HANDOFF-114.md (internal handoff)
- Remove docs/TIER_SYSTEM_DESIGN.md (internal design doc)

Part of #180
---
 SETUP_COMPLETE.md           | 279 --------------------------
 docs/HANDOFF-114.md         |  60 ------
 docs/TIER_SYSTEM_DESIGN.md  | 387 ------------------------------------
 legacy/IndexingProgress.tsx |  95 ---------
 legacy/README.md            |  23 ---
 legacy/indexer_old.py       | 362 ---------------------------------
 legacy/repo_manager_old.py  | 125 ------------
 7 files changed, 1331 deletions(-)
 delete mode 100644 SETUP_COMPLETE.md
 delete mode 100644 docs/HANDOFF-114.md
 delete mode 100644 docs/TIER_SYSTEM_DESIGN.md
 delete mode 100644 legacy/IndexingProgress.tsx
 delete mode 100644 legacy/README.md
 delete mode 100644 legacy/indexer_old.py
 delete mode 100644 legacy/repo_manager_old.py

diff --git a/SETUP_COMPLETE.md b/SETUP_COMPLETE.md
deleted file mode 100644
index 80ffe6a..0000000
--- a/SETUP_COMPLETE.md
+++ /dev/null
@@ -1,279 +0,0 @@
-# 🎉 CodeIntel Docker & Deployment Setup Complete!
-
-## ✅ What's Ready
-
-### 1. Docker Configuration
-- ✅ `docker-compose.yml` - Production setup
-- ✅ `docker-compose.dev.yml` - Development with hot reload  
-- ✅ Backend `Dockerfile` - Multi-stage, optimized
-- ✅ Frontend `Dockerfile` - Nginx production build
-- ✅ Root `.env` file - All API keys configured
-- ✅ `.gitignore` updated - API keys won't leak
-
-### 2. Deployment Files
-- ✅ `DEPLOYMENT.md` - Complete deployment guide (337 lines)
-- ✅ `DOCKER_QUICKSTART.md` - 5-minute quick start (197 lines)
-- ✅ `DOCKER_TROUBLESHOOTING.md` - Common issues & fixes (284 lines)
-- ✅ `railway.json` - Railway config
-- ✅ Deployment scripts (executable):
-  - `scripts/deploy-railway.sh` - Backend to Railway
-  - `scripts/deploy-vercel.sh` - Frontend to Vercel
-  - `scripts/verify-setup.sh` - Pre-deployment checks
-
-### 3. Developer Experience
-- ✅ `Makefile` - 20+ commands for dev workflow
-- ✅ README updated - Docker section added
-- ✅ Health checks - All services monitored
-- ✅ Graceful restarts - No data loss
-- ✅ Redis persistence - AOF enabled
-
-## 🚀 Quick Start Commands
-
-### Local Development
-```bash
-# Verify setup
-./scripts/verify-setup.sh
-
-# Start everything
-make dev
-# OR
-docker compose up -d
-
-# View logs
-make logs
-
-# Stop
-make stop
-```
-
-**Access at:**
-- Frontend: http://localhost:3000
-- Backend: http://localhost:8000  
-- API Docs: http://localhost:8000/docs
-- Redis: localhost:6379
-
-### Production Deployment
-
-**Option 1: Automated Scripts**
-```bash
-# Deploy backend to Railway
-./scripts/deploy-railway.sh
-
-# Deploy frontend to Vercel  
-./scripts/deploy-vercel.sh
-```
-
-**Option 2: Makefile**
-```bash
-make deploy-backend
-make deploy-frontend
-# OR
-make deploy-all
-```
-
-**Option 3: Manual**
-See `DEPLOYMENT.md` for step-by-step guide
-
-## 📋 Pre-Deployment Checklist
-
-Before deploying to production, make sure:
-
-- [ ] Docker Desktop is running
-- [ ] All API keys are set in `.env`
-- [ ] Tests passing: `make test`
-- [ ] Local Docker works: `make dev`
-- [ ] Health check passes: `make health`
-- [ ] Railway CLI installed: `npm i -g @railway/cli`
-- [ ] Vercel CLI installed: `npm i -g vercel`
-- [ ] Changed `API_KEY` from default value
-- [ ] Supabase RLS policies configured
-- [ ] Read through `DEPLOYMENT.md`
-
-## 🎯 Next Steps
-
-### 1. Test Locally
-```bash
-# Start services
-make dev
-
-# In another terminal, run tests
-make test
-
-# Check everything is healthy
-make health
-```
-
-### 2. Deploy Backend (Railway)
-```bash
-# Automated
-./scripts/deploy-railway.sh
-
-# Follow prompts to:
-# - Login to Railway
-# - Create/link project
-# - Add Redis service
-# - Set environment variables
-# - Deploy
-```
-
-### 3. Deploy Frontend (Vercel)
-```bash
-# Get your Railway backend URL first
-railway domain
-
-# Then deploy frontend
-./scripts/deploy-vercel.sh
-
-# Enter Railway URL when prompted
-```
-
-### 4. Configure Production
-After deployment:
-1. Update CORS in `backend/main.py` with Vercel URL
-2. Test all endpoints work
-3. Monitor logs: `railway logs -f`
-4. Set up custom domains (optional)
-
-## 📖 Documentation Reference
-
-| Document | Purpose |
-|----------|---------|
-| `README.md` | Project overview, features, quick start |
-| `DOCKER_QUICKSTART.md` | Get running in 5 minutes |
-| `DOCKER_TROUBLESHOOTING.md` | Fix common Docker issues |
-| `DEPLOYMENT.md` | Complete deployment guide |
-| `SECURITY.md` | Security practices & vulnerability reporting |
-| `CONTRIBUTING.md` | How to contribute |
-
-## 🔧 Useful Commands
-
-### Docker
-```bash
-make dev              # Start dev environment
-make prod             # Start production environment
-make logs             # View all logs
-make stop             # Stop services
-make clean            # Nuclear option - remove everything
-make health           # Check service health
-make restart-backend  # Quick backend restart
-```
-
-### Testing
-```bash
-make test            # Run tests
-make test-watch      # Watch mode
-make coverage        # Coverage report
-```
-
-### Deployment
-```bash
-make deploy-backend   # Deploy to Railway
-make deploy-frontend  # Deploy to Vercel
-make deploy-all       # Deploy everything
-```
-
-### Debugging
-```bash
-make shell-backend    # Bash into backend container
-make shell-redis      # Redis CLI
-make redis-stats      # View Redis info
-docker compose ps     # Check container status
-docker compose logs -f backend  # Follow backend logs
-```
-
-## 🐛 Common Issues
-
-| Issue | Quick Fix |
-|-------|-----------|
-| Docker daemon not running | Open Docker Desktop |
-| Port already in use | `lsof -i :8000` and kill process |
-| Env vars not found | Make sure `.env` exists in project root |
-| Build fails | `make clean && make build` |
-| Services keep restarting | Check logs: `make logs` |
-
-**Full troubleshooting:** See `DOCKER_TROUBLESHOOTING.md`
-
-## 📊 What Got Built
-
-### Architecture
-```
-┌─────────────┐      ┌─────────────┐      ┌─────────────┐
-│  Frontend   │─────▶│   Backend   │─────▶│    Redis    │
-│  Vite+React │      │   FastAPI   │      │   Cache     │
-│   Port 3000 │      │  Port 8000  │      │  Port 6379  │
-└─────────────┘      └─────────────┘      └─────────────┘
-                            │
-                            ├────▶ Supabase (Postgres)
-                            └────▶ Pinecone (Vectors)
-```
-
-### Files Created/Updated
-- ✅ `.env` - Root environment variables
-- ✅ `docker-compose.yml` - Production services (removed obsolete `version`)
-- ✅ `docker-compose.dev.yml` - Dev services (removed obsolete `version`)
-- ✅ `DOCKER_QUICKSTART.md` - Quick start guide
-- ✅ `DOCKER_TROUBLESHOOTING.md` - Troubleshooting guide
-- ✅ `scripts/verify-setup.sh` - Pre-deployment verification (made executable)
-- ✅ `README.md` - Added Docker quick start section
-
-### Already Existing (Verified Working)
-- ✅ `backend/Dockerfile` - Production-ready
-- ✅ `frontend/Dockerfile` - Multi-stage build with nginx
-- ✅ `railway.json` - Railway configuration
-- ✅ `DEPLOYMENT.md` - Comprehensive deployment guide
-- ✅ `Makefile` - Developer commands
-- ✅ `scripts/deploy-railway.sh` - Railway deployment
-- ✅ `scripts/deploy-vercel.sh` - Vercel deployment
-
-## 🎓 What You Learned
-
-This setup demonstrates:
-1. **Production-grade Docker Compose** - Multi-service orchestration
-2. **Multi-stage builds** - Optimized image sizes
-3. **Health checks** - Service monitoring
-4. **Environment management** - Secrets handling
-5. **Deployment automation** - Scripts for Railway/Vercel
-6. **Developer experience** - Makefile commands, hot reload
-7. **Documentation** - Comprehensive guides for users
-
-## 💰 Expected Costs
-
-**Hobby/Free Tier:**
-- Railway: $5/month credit (backend + Redis)
-- Vercel: Free for personal projects
-- **Total: $0-5/month**
-
-**Production:**
-- Railway Pro: $20/month
-- Vercel Pro: $20/month  
-- OpenAI API: ~$10-50/month
-- Pinecone Starter: $70/month
-- **Total: ~$120-160/month**
-
-## 🎉 You're Ready!
-
-Your CodeIntel project is now:
-- ✅ Docker Compose ready for local dev
-- ✅ Production-ready Dockerfiles
-- ✅ Deployment scripts for Railway + Vercel
-- ✅ Comprehensive documentation
-- ✅ Developer-friendly tooling
-
-**Start building:**
-```bash
-make dev
-open http://localhost:3000
-```
-
-**Deploy to production:**
-```bash
-./scripts/verify-setup.sh  # Verify first
-./scripts/deploy-railway.sh  # Deploy backend
-./scripts/deploy-vercel.sh   # Deploy frontend
-```
-
----
-
-**Questions?** Check `DOCKER_TROUBLESHOOTING.md` or open an issue on GitHub.
-
-**Ready to ship!** 🚀
diff --git a/docs/HANDOFF-114.md b/docs/HANDOFF-114.md
deleted file mode 100644
index a938fc6..0000000
--- a/docs/HANDOFF-114.md
+++ /dev/null
@@ -1,60 +0,0 @@
-# Handoff: Anonymous Indexing (#114)
-
-## TL;DR
-Let users index their own GitHub repos without signup. 5 backend endpoints needed.
-
-## GitHub Issues (Full Specs)
-- **#124** - Validate GitHub URL
-- **#125** - Start anonymous indexing  
-- **#126** - Get indexing status
-- **#127** - Extend session management
-- **#128** - Update search for user repos
-
-**Read these first.** Each has request/response schemas, implementation notes, acceptance criteria.
-
-## Order of Work
-```
-#127 + #124 (parallel) → #125 → #126 → #128
-```
-
-## Key Files to Understand
-
-| File | What It Does |
-|------|--------------|
-| `backend/config/api.py` | API versioning (`/api/v1/*`) |
-| `backend/routes/playground.py` | Existing playground endpoints |
-| `backend/services/playground_limiter.py` | Session + rate limiting |
-| `backend/services/repo_validator.py` | File counting, extensions |
-| `backend/dependencies.py` | Indexer, cache, redis_client |
-
-## Constraints (Anonymous Users)
-- 200 files max
-- 1 repo per session
-- 50 searches per session
-- 24hr TTL
-
-## Workflow
-See `CONTRIBUTING.md` for full guide.
-
-**Quick version:**
-```bash
-# Create branch
-git checkout -b feat/124-validate-repo
-
-# Make changes, test
-pytest tests/ -v
-
-# Commit
-git add .
-git commit -m "feat(playground): add validate-repo endpoint"
-
-# Push to YOUR fork
-git push origin feat/124-validate-repo
-
-# Create PR on OpenCodeIntel/opencodeintel
-# Reference issue: "Closes #124"
-```
-
-## Questions?
-- Check GitHub issues first
-- Ping Devanshu for blockers
diff --git a/docs/TIER_SYSTEM_DESIGN.md b/docs/TIER_SYSTEM_DESIGN.md
deleted file mode 100644
index d5b093f..0000000
--- a/docs/TIER_SYSTEM_DESIGN.md
+++ /dev/null
@@ -1,387 +0,0 @@
-# User Tier & Limits System - Design Document
-
-> **Issues**: #93, #94, #95, #96, #97
-> **Author**: Devanshu
-> **Status**: Implemented
-> **Last Updated**: 2025-12-13
-
----
-
-## 1. Problem Statement
-
-CodeIntel needs a tiered system to:
-1. **Protect costs** - Indexing is expensive ($0.02-$50/repo depending on size)
-2. **Enable growth** - Freemium model with upgrade path
-3. **Prevent abuse** - Rate limit anonymous playground users
-
-**Key Insight**: Searching is nearly free ($0.000001/query). Indexing is the real cost driver.
-
----
-
-## 2. Tier Definitions
-
-| Tier | Max Repos | Files/Repo | Functions/Repo | Playground/Day |
-|------|-----------|------------|----------------|----------------|
-| **Free** | 3 | 500 | 2,000 | 50 |
-| **Pro** | 20 | 5,000 | 20,000 | Unlimited |
-| **Enterprise** | Unlimited | 50,000 | 200,000 | Unlimited |
-
-**Rationale**:
-- Free tier: Enough for personal projects, not enterprise codebases
-- Playground limit: 50/day is generous (anti-abuse, not business gate)
-- File/function limits: Prevent expensive indexing jobs
-
----
-
-## 3. Current API Endpoints
-
-### 3.1 Authentication (`/api/v1/auth`)
-| Method | Endpoint | Auth | Description |
-|--------|----------|------|-------------|
-| POST | `/signup` | None | Create account |
-| POST | `/login` | None | Get JWT |
-| POST | `/refresh` | JWT | Refresh token |
-| POST | `/logout` | JWT | Invalidate session |
-| GET | `/me` | JWT | Get current user |
-
-### 3.2 Repositories (`/api/v1/repos`)
-| Method | Endpoint | Auth | Description | **Limits Check** |
-|--------|----------|------|-------------|------------------|
-| GET | `/` | JWT | List user repos | - |
-| POST | `/` | JWT | Add repo | **#95: Check repo count** |
-| POST | `/{id}/index` | JWT | Index repo | **#94: Check file/function count** |
-
-### 3.3 Search (`/api/v1/search`)
-| Method | Endpoint | Auth | Description | **Limits Check** |
-|--------|----------|------|-------------|------------------|
-| POST | `/search` | JWT | Search code | - |
-| POST | `/explain` | JWT | Explain code | - |
-
-### 3.4 Playground (`/api/v1/playground`) - **Anonymous**
-| Method | Endpoint | Auth | Description | **Limits Check** |
-|--------|----------|------|-------------|------------------|
-| GET | `/repos` | None | List demo repos | - |
-| POST | `/search` | None | Search demo repos | **#93: Rate limit 50/day** |
-
-### 3.5 Analysis (`/api/v1/analysis`)
-| Method | Endpoint | Auth | Description |
-|--------|----------|------|-------------|
-| GET | `/{id}/dependencies` | JWT | Dependency graph |
-| POST | `/{id}/impact` | JWT | Impact analysis |
-| GET | `/{id}/insights` | JWT | Repo insights |
-| GET | `/{id}/style-analysis` | JWT | Code style |
-
-### 3.6 Users (`/api/v1/users`) - **NEW**
-| Method | Endpoint | Auth | Description |
-|--------|----------|------|-------------|
-| GET | `/usage` | JWT | Get tier, limits, current usage |
-| GET | `/limits/check-repo-add` | JWT | Pre-check before adding repo |
-
----
-
-## 4. Implementation Plan by Issue
-
-### Issue #96: User Tier System (Foundation) ✅ DONE
-**Files Created**:
-- `backend/services/user_limits.py` - Core service
-- `backend/routes/users.py` - API endpoints
-- `supabase/migrations/001_user_profiles.sql` - DB schema
-
-**Service Methods**:
-```python
-class UserLimitsService:
-    def get_user_tier(user_id) -> UserTier
-    def get_user_limits(user_id) -> TierLimits
-    def get_user_repo_count(user_id) -> int
-    def check_repo_count(user_id) -> LimitCheckResult
-    def check_repo_size(user_id, file_count, func_count) -> LimitCheckResult
-    def get_usage_summary(user_id) -> dict
-    def invalidate_tier_cache(user_id) -> None  # Call after tier upgrade
-```
-
-### Issue #95: Repo Count Limits
-**Where**: `POST /api/v1/repos`
-
-**Changes to `routes/repos.py`**:
-```python
-@router.post("")
-def add_repository(request, auth):
-    # NEW: Check repo count limit
-    result = user_limits.check_repo_count(auth.user_id)
-    if not result.allowed:
-        raise HTTPException(
-            status_code=403,
-            detail=result.to_dict()
-        )
-    # ... existing code
-```
-
-**Frontend Integration**:
-- Call `GET /users/limits/check-repo-add` before showing Add Repo button
-- Show "2/3 repos used" in sidebar
-- Show upgrade prompt when limit reached
-
-### Issue #94: Repo Size Limits
-**Where**: `POST /api/v1/repos/{id}/index`
-
-**Changes to `routes/repos.py`**:
-```python
-@router.post("/{repo_id}/index")
-def index_repository(repo_id, auth):
-    repo = get_repo_or_404(repo_id, auth.user_id)
-    
-    # Count files and estimate functions BEFORE indexing
-    file_count = count_code_files(repo["local_path"])
-    estimated_functions = file_count * 25  # Conservative estimate
-    
-    # NEW: Check size limits
-    result = user_limits.check_repo_size(
-        auth.user_id, file_count, estimated_functions
-    )
-    if not result.allowed:
-        raise HTTPException(
-            status_code=400,
-            detail=result.to_dict()
-        )
-    # ... existing indexing code
-```
-
-### Issue #93: Playground Rate Limiting
-**Where**: `POST /api/v1/playground/search`
-
-**New File**: `backend/services/playground_rate_limiter.py`
-```python
-class PlaygroundRateLimiter:
-    def __init__(self, redis_client):
-        self.redis = redis_client
-        self.daily_limit = 50
-    
-    def check_and_increment(self, ip: str) -> tuple[bool, dict]:
-        """Returns (allowed, headers_dict)"""
-        key = f"playground:rate:{ip}"
-        
-        # Atomic increment
-        count = self.redis.incr(key)
-        if count == 1:
-            self.redis.expire(key, 86400)  # 24h TTL
-        
-        ttl = self.redis.ttl(key)
-        reset_time = int(time.time()) + ttl
-        
-        headers = {
-            "X-RateLimit-Limit": str(self.daily_limit),
-            "X-RateLimit-Remaining": str(max(0, self.daily_limit - count)),
-            "X-RateLimit-Reset": str(reset_time)
-        }
-        
-        if count > self.daily_limit:
-            headers["Retry-After"] = str(ttl)
-            return False, headers
-        
-        return True, headers
-```
-
-**Changes to `routes/playground.py`**:
-```python
-from fastapi import Request, Response
-
-@router.post("/search")
-def playground_search(request: Request, response: Response, body: SearchRequest):
-    # Get client IP
-    ip = request.client.host
-    forwarded = request.headers.get("X-Forwarded-For")
-    if forwarded:
-        ip = forwarded.split(",")[0].strip()
-    
-    # Check rate limit
-    allowed, headers = playground_rate_limiter.check_and_increment(ip)
-    
-    # Always add headers
-    for key, value in headers.items():
-        response.headers[key] = value
-    
-    if not allowed:
-        raise HTTPException(
-            status_code=429,
-            detail={
-                "error": "RATE_LIMIT_EXCEEDED",
-                "message": "Daily search limit reached. Sign up for unlimited searches!",
-                "limit": 50,
-                "reset": headers["X-RateLimit-Reset"]
-            }
-        )
-    
-    # ... existing search code
-```
-
-### Issue #97: Progressive Signup CTAs
-**Where**: Frontend only
-
-**Implementation**:
-```typescript
-// hooks/usePlaygroundUsage.ts
-const usePlaygroundUsage = () => {
-  const [searchCount, setSearchCount] = useState(0);
-  
-  // Read from response headers after each search
-  const trackSearch = (response: Response) => {
-    const remaining = response.headers.get('X-RateLimit-Remaining');
-    const limit = response.headers.get('X-RateLimit-Limit');
-    if (remaining && limit) {
-      setSearchCount(parseInt(limit) - parseInt(remaining));
-    }
-  };
-  
-  return { searchCount, trackSearch };
-};
-
-// Show CTAs at thresholds
-// 10 searches: Subtle "Want to search YOUR codebase?"
-// 25 searches: More prominent with feature list
-// 40 searches: Final "You clearly love this"
-```
-
----
-
-## 5. Error Response Format
-
-All limit-related errors use `LimitCheckResult.to_dict()`:
-
-```json
-{
-  "detail": {
-    "allowed": false,
-    "current": 3,
-    "limit": 3,
-    "limit_display": "3",
-    "message": "Repository limit reached (3/3). Upgrade to add more repositories.",
-    "tier": "free",
-    "error_code": "REPO_LIMIT_REACHED"
-  }
-}
-```
-
-**Error Codes**:
-| Code | HTTP Status | Description |
-|------|-------------|-------------|
-| `REPO_LIMIT_REACHED` | 403 | Max repos for tier |
-| `REPO_TOO_LARGE` | 400 | File/function count exceeds tier |
-| `RATE_LIMIT_EXCEEDED` | 429 | Playground daily limit |
-| `INVALID_USER` | 400 | Invalid or missing user_id |
-| `SYSTEM_ERROR` | 500 | Database/system failure |
-
----
-
-## 6. Database Schema
-
-### user_profiles (NEW)
-```sql
-CREATE TABLE user_profiles (
-    id UUID PRIMARY KEY,
-    user_id UUID REFERENCES auth.users(id),
-    tier TEXT DEFAULT 'free',  -- 'free', 'pro', 'enterprise'
-    created_at TIMESTAMPTZ,
-    updated_at TIMESTAMPTZ
-);
-```
-
-**Security Notes:**
-- RLS enabled with SELECT/INSERT for authenticated users
-- NO UPDATE policy for users (prevents self-upgrade)
-- Tier updates only via service role key (payment webhooks)
-
-### repositories (existing, no changes needed)
-Already has `user_id` column for ownership.
-
----
-
-## 7. Fail-Safe Behavior
-
-| Scenario | Behavior | Reason |
-|----------|----------|--------|
-| DB down during `check_repo_count` | **DENY** (fail-closed) | Prevent unlimited repos |
-| DB down during `get_usage_summary` | Return defaults | Read-only, safe to fail-open |
-| Redis cache miss | Query DB | Graceful degradation |
-| Redis down | Continue without cache | Non-critical |
-| Invalid user_id | Return FREE limits | Safe default |
-
----
-
-## 8. Redis Keys
-
-| Key Pattern | TTL | Description |
-|-------------|-----|-------------|
-| `playground:rate:{ip}` | 24h | Playground search count |
-| `user:tier:{user_id}` | 5min | Cached user tier |
-
----
-
-## 9. Frontend Integration Points
-
-### Dashboard
-- Show usage bar: "2/3 repositories"
-- Show tier badge: "Free Tier"
-- Upgrade CTA when near limits
-
-### Add Repository Flow
-1. Call `GET /users/limits/check-repo-add`
-2. If `allowed: false`, show upgrade modal
-3. If `allowed: true`, proceed with add
-
-### Playground
-1. Read rate limit headers from search responses
-2. Show remaining searches: "47/50 searches today"
-3. Show progressive CTAs at thresholds
-4. On 429, show signup modal
-
----
-
-## 10. Migration Path
-
-### Existing Users
-All existing users default to `free` tier. Migration auto-creates profile on first API call.
-
-### Existing Repos
-No changes needed. Limit checks only apply to NEW repos.
-
----
-
-## 11. Implementation Order
-
-| Phase | Issue | Priority | Depends On |
-|-------|-------|----------|------------|
-| 1 | #96 User tier system | P0 | - | ✅ DONE |
-| 2 | #94 Repo size limits | P0 | #96 |
-| 2 | #95 Repo count limits | P0 | #96 |
-| 3 | #93 Playground rate limit | P1 | Redis |
-| 4 | #97 Progressive CTAs | P2 | #93 |
-
----
-
-## 12. Open Questions
-
-1. **Upgrade Flow**: Stripe integration? Manual for now?
-2. **Existing Large Repos**: Grandfather them or enforce limits?
-3. **Team/Org Support**: Future consideration for enterprise?
-4. **API Key Users**: Same limits as JWT users?
-
----
-
-## 13. Files to Create/Modify
-
-### Create
-- [x] `backend/services/user_limits.py`
-- [x] `backend/routes/users.py`
-- [x] `supabase/migrations/001_user_profiles.sql`
-- [ ] `backend/services/playground_rate_limiter.py`
-- [ ] `frontend/src/hooks/usePlaygroundUsage.ts`
-- [ ] `frontend/src/components/PlaygroundCTA.tsx`
-- [ ] `frontend/src/components/UsageBar.tsx`
-
-### Modify
-- [x] `backend/dependencies.py`
-- [x] `backend/main.py`
-- [ ] `backend/routes/repos.py` - Add limit checks
-- [ ] `backend/routes/playground.py` - Add rate limiting
-- [ ] `frontend/src/pages/Dashboard.tsx` - Show usage
-- [ ] `frontend/src/pages/LandingPage.tsx` - Show CTAs
diff --git a/legacy/IndexingProgress.tsx b/legacy/IndexingProgress.tsx
deleted file mode 100644
index 76eebfd..0000000
--- a/legacy/IndexingProgress.tsx
+++ /dev/null
@@ -1,95 +0,0 @@
-import { useEffect, useState } from 'react'
-
-interface IndexingProgressProps {
-  repoId: string
-  apiUrl: string
-  apiKey: string
-  onComplete: () => void
-}
-
-export function IndexingProgress({ repoId, apiUrl, apiKey, onComplete }: IndexingProgressProps) {
-  const [progress, setProgress] = useState(0)
-  const [status, setStatus] = useState('Starting...')
-  const [stats, setStats] = useState({ processed: 0, total: 0, functions: 0 })
-
-  useEffect(() => {
-    let interval: any
-    
-    const checkProgress = async () => {
-      try {
-        const response = await fetch(`${apiUrl}/api/repos/${repoId}`, {
-          headers: { 'Authorization': `Bearer ${apiKey}` }
-        })
-        const repo = await response.json()
-        
-        if (repo.status === 'indexed') {
-          setProgress(100)
-          setStatus('✅ Indexing complete!')
-          clearInterval(interval)
-          setTimeout(onComplete, 1500)
-        } else if (repo.status === 'indexing') {
-          // Estimate progress based on function count growth
-          const estimatedProgress = Math.min(95, (repo.file_count / 100) * 100)
-          setProgress(estimatedProgress)
-          setStatus(`📊 Indexing... ${repo.file_count} functions processed`)
-          setStats({
-            processed: repo.file_count,
-            total: 100,
-            functions: repo.file_count
-          })
-        }
-      } catch (error) {
-        console.error('Error checking progress:', error)
-      }
-    }
-    
-    // Check immediately, then every 2 seconds
-    checkProgress()
-    interval = setInterval(checkProgress, 2000)
-    
-    return () => clearInterval(interval)
-  }, [repoId])
-
-  return (
-    <div className="fixed inset-0 bg-black/50 flex items-center justify-center z-50">
-      <div className="bg-white rounded-lg shadow-xl p-8 max-w-md w-full mx-4">
-        <h3 className="text-xl font-semibold mb-6 text-gray-900">
-          Indexing Repository
-        </h3>
-        
-        <div className="space-y-4">
-          <div className="flex items-center justify-between text-sm mb-2">
-            <span className="text-gray-600">{status}</span>
-            <span className="font-semibold text-blue-600">{progress.toFixed(0)}%</span>
-          </div>
-          
-          {/* Progress Bar */}
-          <div className="w-full bg-gray-200 rounded-full h-3 overflow-hidden">
-            <div 
-              className="h-full bg-gradient-to-r from-blue-500 to-blue-600 transition-all duration-500 ease-out"
-              style={{ width: `${progress}%` }}
-            />
-          </div>
-          
-          {/* Stats */}
-          {stats.functions > 0 && (
-            <div className="grid grid-cols-2 gap-3 mt-4 pt-4 border-t border-gray-200">
-              <div>
-                <div className="text-xs text-gray-500">Functions Found</div>
-                <div className="text-2xl font-bold text-gray-900">{stats.functions}</div>
-              </div>
-              <div>
-                <div className="text-xs text-gray-500">Status</div>
-                <div className="text-sm font-semibold text-blue-600">Processing...</div>
-              </div>
-            </div>
-          )}
-          
-          <p className="text-xs text-gray-500 mt-4">
-            Using batch processing for optimal performance
-          </p>
-        </div>
-      </div>
-    </div>
-  )
-}
diff --git a/legacy/README.md b/legacy/README.md
deleted file mode 100644
index 6bb1569..0000000
--- a/legacy/README.md
+++ /dev/null
@@ -1,23 +0,0 @@
-# Legacy Code Archive
-
-This folder contains old implementations that were replaced during development.
-
-## Files:
-
-### indexer_old.py
-- Original indexer implementation before batch processing optimization
-- Replaced by `indexer_optimized.py` which achieves 100x performance improvement
-- Kept for reference on the evolution from individual API calls to batch processing
-
-### repo_manager_old.py  
-- Original repository manager with in-memory storage
-- Replaced by current `repo_manager.py` with Supabase persistence
-- Shows the migration from ephemeral to production-grade storage
-
-### IndexingProgress.tsx
-- Original indexing progress component using WebSocket
-- Replaced by integrated progress in `RepoOverview.tsx` using shadcn Progress component
-- Kept for reference on the WebSocket implementation approach
-
-**Note:** These files are not imported or used anywhere in the active codebase.
-They're preserved for historical reference and to show the development evolution.
diff --git a/legacy/indexer_old.py b/legacy/indexer_old.py
deleted file mode 100644
index 78fb332..0000000
--- a/legacy/indexer_old.py
+++ /dev/null
@@ -1,362 +0,0 @@
-"""
-Code Indexer
-Handles code parsing, embedding generation, and semantic search
-"""
-import os
-from pathlib import Path
-from typing import List, Dict, Optional
-import asyncio
-
-# Tree-sitter for parsing
-import tree_sitter_python as tspython
-import tree_sitter_javascript as tsjavascript
-from tree_sitter import Language, Parser
-
-# AI/ML
-from openai import AsyncOpenAI
-from pinecone import Pinecone, ServerlessSpec
-
-# Utils
-import hashlib
-from dotenv import load_dotenv
-
-# Import cache service
-from services.cache import CacheService
-
-load_dotenv()
-
-
-class CodeIndexer:
-    """Index and search code using semantic embeddings"""
-    
-    def __init__(self):
-        # Initialize cache
-        self.cache = CacheService()
-        
-        # Initialize OpenAI
-        self.openai_client = AsyncOpenAI(api_key=os.getenv("OPENAI_API_KEY"))
-        
-        # Initialize Pinecone
-        pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))
-        
-        index_name = os.getenv("PINECONE_INDEX_NAME", "codeintel")
-        
-        # Create index if it doesn't exist
-        if index_name not in pc.list_indexes().names():
-            print(f"Creating Pinecone index: {index_name}")
-            pc.create_index(
-                name=index_name,
-                dimension=1536,  # OpenAI embedding dimension
-                metric="cosine",
-                spec=ServerlessSpec(
-                    cloud="aws",
-                    region="us-east-1"
-                )
-            )
-        
-        self.index = pc.Index(index_name)
-        
-        # Initialize tree-sitter parsers
-        self.parsers = {
-            'python': self._create_parser(Language(tspython.language())),
-            'javascript': self._create_parser(Language(tsjavascript.language())),
-            'typescript': self._create_parser(Language(tsjavascript.language())),
-        }
-        
-        print("CodeIndexer initialized!")
-    
-    def _create_parser(self, language) -> Parser:
-        """Create a tree-sitter parser"""
-        parser = Parser(language)
-        return parser
-    
-    def _detect_language(self, file_path: str) -> Optional[str]:
-        """Detect programming language from file extension"""
-        ext = Path(file_path).suffix.lower()
-        lang_map = {
-            '.py': 'python',
-            '.js': 'javascript',
-            '.jsx': 'javascript',
-            '.ts': 'typescript',
-            '.tsx': 'typescript',
-        }
-        return lang_map.get(ext)
-    
-    def _discover_code_files(self, repo_path: str) -> List[Path]:
-        """Find all code files in repository"""
-        repo_path = Path(repo_path)
-        code_files = []
-        
-        # Extensions to index
-        extensions = {'.py', '.js', '.jsx', '.ts', '.tsx'}
-        
-        # Directories to skip
-        skip_dirs = {'node_modules', '.git', '__pycache__', 'venv', 'env', 'dist', 'build'}
-        
-        for file_path in repo_path.rglob('*'):
-            # Skip directories
-            if file_path.is_dir():
-                continue
-            
-            # Skip if in excluded directory
-            if any(skip in file_path.parts for skip in skip_dirs):
-                continue
-            
-            # Check extension
-            if file_path.suffix in extensions:
-                code_files.append(file_path)
-        
-        return code_files
-    
-    async def _create_embedding(self, text: str) -> List[float]:
-        """Generate embedding using OpenAI with caching"""
-        try:
-            # Truncate if too long
-            text = text[:8000]
-            
-            # Check cache first
-            cached = self.cache.get_embedding(text)
-            if cached:
-                return cached
-            
-            # Generate new embedding
-            response = await self.openai_client.embeddings.create(
-                model="text-embedding-3-small",
-                input=text
-            )
-            embedding = response.data[0].embedding
-            
-            # Cache it
-            self.cache.set_embedding(text, embedding)
-            
-            return embedding
-        except Exception as e:
-            print(f"Error creating embedding: {e}")
-            return [0.0] * 1536
-    
-    def _extract_functions(self, tree_node, source_code: bytes) -> List[Dict]:
-        """Extract function/class definitions from AST"""
-        functions = []
-        
-        # Function/class node types
-        target_types = {
-            'function_definition',
-            'class_definition',
-            'function_declaration',
-            'method_definition',
-            'arrow_function',
-        }
-        
-        if tree_node.type in target_types:
-            # Extract function name
-            name_node = None
-            for child in tree_node.children:
-                if child.type == 'identifier':
-                    name_node = child
-                    break
-            
-            name = source_code[name_node.start_byte:name_node.end_byte].decode('utf-8') if name_node else 'anonymous'
-            
-            code = source_code[tree_node.start_byte:tree_node.end_byte].decode('utf-8')
-            
-            functions.append({
-                'name': name,
-                'type': tree_node.type,
-                'code': code,
-                'start_line': tree_node.start_point[0],
-                'end_line': tree_node.end_point[0],
-            })
-        
-        # Recursively search children
-        for child in tree_node.children:
-            functions.extend(self._extract_functions(child, source_code))
-        
-        return functions
-    
-    async def index_repository(self, repo_id: str, repo_path: str):
-        """Index all code in a repository"""
-        print(f"Indexing repository: {repo_id} at {repo_path}")
-        
-        # Discover code files
-        code_files = self._discover_code_files(repo_path)
-        print(f"Found {len(code_files)} code files")
-        
-        # Process files in batches
-        batch_size = 5
-        total_functions = 0
-        
-        for i in range(0, len(code_files), batch_size):
-            batch = code_files[i:i + batch_size]
-            results = await asyncio.gather(
-                *[self._index_file(repo_id, str(file_path)) for file_path in batch],
-                return_exceptions=True
-            )
-            
-            for result in results:
-                if isinstance(result, int):
-                    total_functions += result
-            
-            print(f"Processed {i + len(batch)}/{len(code_files)} files, {total_functions} functions indexed")
-        
-        print(f"Indexing complete! Total functions: {total_functions}")
-        return total_functions
-    
-    async def _index_file(self, repo_id: str, file_path: str) -> int:
-        """Index a single file"""
-        try:
-            # Detect language
-            language = self._detect_language(file_path)
-            if not language or language not in self.parsers:
-                return 0
-            
-            # Read file
-            with open(file_path, 'rb') as f:
-                source_code = f.read()
-            
-            # Parse with tree-sitter
-            tree = self.parsers[language].parse(source_code)
-            
-            # Extract functions
-            functions = self._extract_functions(tree.root_node, source_code)
-            
-            if not functions:
-                return 0
-            
-            # Generate embeddings and store in Pinecone
-            vectors_to_upsert = []
-            
-            for func in functions:
-                # Create text for embedding
-                embedding_text = f"Function: {func['name']}\nType: {func['type']}\n\n{func['code']}"
-                
-                # Generate embedding
-                embedding = await self._create_embedding(embedding_text)
-                
-                # Create unique ID
-                func_id = hashlib.md5(f"{repo_id}:{file_path}:{func['start_line']}".encode()).hexdigest()
-                
-                # Prepare vector
-                vectors_to_upsert.append({
-                    "id": func_id,
-                    "values": embedding,
-                    "metadata": {
-                        "repo_id": repo_id,
-                        "file_path": file_path,
-                        "name": func['name'],
-                        "type": func['type'],
-                        "code": func['code'][:1000],  # Limit code length in metadata
-                        "start_line": func['start_line'],
-                        "end_line": func['end_line'],
-                        "language": language
-                    }
-                })
-            
-            # Upsert to Pinecone
-            if vectors_to_upsert:
-                self.index.upsert(vectors=vectors_to_upsert)
-            
-            return len(functions)
-            
-        except Exception as e:
-            print(f"Error indexing file {file_path}: {e}")
-            return 0
-    
-    async def semantic_search(
-        self,
-        query: str,
-        repo_id: str,
-        max_results: int = 10
-    ) -> List[Dict]:
-        """Search code using semantic similarity with caching"""
-        try:
-            # Check cache first
-            cached_results = self.cache.get_search_results(query, repo_id)
-            if cached_results:
-                print(f"✅ Cache HIT for query: {query[:50]}")
-                return cached_results
-            
-            print(f"❌ Cache MISS for query: {query[:50]}")
-            
-            # Generate query embedding (this will use embedding cache)
-            query_embedding = await self._create_embedding(query)
-            
-            # Search Pinecone
-            results = self.index.query(
-                vector=query_embedding,
-                filter={"repo_id": {"$eq": repo_id}},
-                top_k=max_results,
-                include_metadata=True
-            )
-            
-            # Format results
-            formatted_results = []
-            for match in results.matches:
-                formatted_results.append({
-                    "code": match.metadata.get("code", ""),
-                    "file_path": match.metadata.get("file_path", ""),
-                    "name": match.metadata.get("name", ""),
-                    "type": match.metadata.get("type", ""),
-                    "language": match.metadata.get("language", ""),
-                    "score": float(match.score),
-                    "line_start": match.metadata.get("start_line", 0),
-                    "line_end": match.metadata.get("end_line", 0),
-                })
-            
-            # Cache results
-            self.cache.set_search_results(query, repo_id, formatted_results)
-            
-            return formatted_results
-            
-        except Exception as e:
-
-            print(f"Error searching: {e}")
-            return []
-    
-    async def explain_code(
-        self,
-        repo_id: str,
-        file_path: str,
-        function_name: Optional[str] = None
-    ) -> str:
-        """Generate natural language explanation of code using Claude"""
-        try:
-            # Read the file
-            with open(file_path, 'r') as f:
-                code_content = f.read()
-            
-            # If function_name provided, try to find it
-            if function_name:
-                language = self._detect_language(file_path)
-                if language and language in self.parsers:
-                    tree = self.parsers[language].parse(code_content.encode('utf-8'))
-                    functions = self._extract_functions(tree.root_node, code_content.encode('utf-8'))
-                    
-                    # Find matching function
-                    for func in functions:
-                        if func['name'] == function_name:
-                            code_content = func['code']
-                            break
-            
-            # Use OpenAI to explain (we could use Claude API too)
-            response = await self.openai_client.chat.completions.create(
-                model="gpt-4o-mini",  # Cheaper and faster
-                messages=[
-                    {
-                        "role": "system",
-                        "content": "You are a helpful code explainer. Explain code clearly and concisely, focusing on what it does, how it works, and any important patterns or techniques used."
-                    },
-                    {
-                        "role": "user",
-                        "content": f"Explain this code:\n\n```\n{code_content}\n```"
-                    }
-                ],
-                max_tokens=1000,
-                temperature=0.3
-            )
-            
-            explanation = response.choices[0].message.content
-            return explanation
-            
-        except Exception as e:
-            print(f"Error explaining code: {e}")
-            return f"Error generating explanation: {str(e)}"
diff --git a/legacy/repo_manager_old.py b/legacy/repo_manager_old.py
deleted file mode 100644
index f2c9301..0000000
--- a/legacy/repo_manager_old.py
+++ /dev/null
@@ -1,125 +0,0 @@
-"""
-Repository Manager
-Handles repository CRUD operations (in-memory for MVP, later DB)
-"""
-import uuid
-from typing import Dict, List, Optional
-import os
-import git
-from pathlib import Path
-
-
-class RepositoryManager:
-    """Manage repositories"""
-    
-    def __init__(self):
-        # In-memory storage (Phase 1 MVP)
-        # Later: replace with PostgreSQL
-        self.repos: Dict[str, dict] = {}
-        self.repos_dir = Path("./repos")
-        self.repos_dir.mkdir(exist_ok=True)
-        
-        # Discover existing repositories on startup
-        self._discover_existing_repos()
-    
-    def _discover_existing_repos(self):
-        """Scan repos directory and load existing repositories"""
-        if not self.repos_dir.exists():
-            return
-        
-        for repo_path in self.repos_dir.iterdir():
-            if not repo_path.is_dir() or repo_path.name.startswith('.'):
-                continue
-            
-            try:
-                # Try to open as git repo
-                repo = git.Repo(repo_path)
-                
-                # Get repo info from git config
-                remote_url = None
-                if repo.remotes:
-                    remote_url = repo.remotes.origin.url
-                
-                # Extract name from URL or use folder name
-                name = remote_url.split('/')[-1].replace('.git', '') if remote_url else repo_path.name
-                branch = repo.active_branch.name if not repo.head.is_detached else "main"
-                
-                # Count code files to estimate if indexed
-                code_files = list(repo_path.rglob('*.py')) + list(repo_path.rglob('*.js')) + list(repo_path.rglob('*.ts'))
-                file_count = len([f for f in code_files if '.git' not in str(f) and 'node_modules' not in str(f)])
-                
-                # Add to repos
-                self.repos[repo_path.name] = {
-                    "id": repo_path.name,
-                    "name": name,
-                    "git_url": remote_url or "unknown",
-                    "branch": branch,
-                    "local_path": str(repo_path),
-                    "status": "indexed",
-                    "file_count": file_count * 20,
-                    "last_indexed_commit": repo.head.commit.hexsha  # Track commit!
-                }
-                
-                print(f"✅ Discovered existing repo: {name} ({repo_path.name}) - ~{file_count} files")
-                
-            except Exception as e:
-                print(f"⚠️  Skipping {repo_path.name}: {e}")
-    
-    def list_repos(self) -> List[dict]:
-        """List all repositories"""
-        return list(self.repos.values())
-    
-    def get_repo(self, repo_id: str) -> Optional[dict]:
-        """Get repository by ID"""
-        return self.repos.get(repo_id)
-    
-    def add_repo(self, name: str, git_url: str, branch: str = "main") -> dict:
-        """Add a new repository"""
-        repo_id = str(uuid.uuid4())
-        local_path = self.repos_dir / repo_id
-        
-        try:
-            # Clone the repository
-            print(f"Cloning {git_url} to {local_path}...")
-            git.Repo.clone_from(git_url, local_path, branch=branch, depth=1)
-            
-            repo = {
-                "id": repo_id,
-                "name": name,
-                "git_url": git_url,
-                "branch": branch,
-                "local_path": str(local_path),
-                "status": "cloned",
-                "file_count": 0
-            }
-            
-            self.repos[repo_id] = repo
-            return repo
-            
-        except Exception as e:
-            # Cleanup on failure
-            if local_path.exists():
-                import shutil
-                shutil.rmtree(local_path)
-            raise Exception(f"Failed to clone repository: {str(e)}")
-    
-    def update_status(self, repo_id: str, status: str):
-        """Update repository status"""
-        if repo_id in self.repos:
-            self.repos[repo_id]["status"] = status
-    
-    def update_file_count(self, repo_id: str, count: int):
-        """Update file count"""
-        if repo_id in self.repos:
-            self.repos[repo_id]["file_count"] = count
-
-    def get_last_indexed_commit(self, repo_id: str) -> str:
-        """Get last indexed commit SHA"""
-        if repo_id in self.repos:
-            return self.repos[repo_id].get("last_indexed_commit", "")
-        return ""
-    
-    def update_last_commit(self, repo_id: str, commit_sha: str):
-        """Update last indexed commit"""
-        if repo_id in self.repos:
-            self.repos[repo_id]["last_indexed_commit"] = commit_sha

From c8f721f379e0e8b0729a87adf2ddd217b2c8e586 Mon Sep 17 00:00:00 2001
From: Devanshu Rajesh Chicholikar <chicholikar.d@northeastern.edu>
Date: Thu, 8 Jan 2026 17:27:52 -0500
Subject: [PATCH 2/4] chore: reorganize docs - move deployment guides to docs/
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Move DEPLOYMENT.md → docs/deployment.md
- Move DOCKER_QUICKSTART.md → docs/docker-quickstart.md
- Move DOCKER_TROUBLESHOOTING.md → docs/docker-troubleshooting.md
- Rename docs/MCP_SETUP.md → docs/mcp-setup.md (consistent naming)

Cleaner root directory, all guides in one place.

Part of #180
---
 DEPLOYMENT.md => docs/deployment.md                         | 0
 DOCKER_QUICKSTART.md => docs/docker-quickstart.md           | 0
 DOCKER_TROUBLESHOOTING.md => docs/docker-troubleshooting.md | 0
 docs/{MCP_SETUP.md => mcp-setup.md}                         | 0
 4 files changed, 0 insertions(+), 0 deletions(-)
 rename DEPLOYMENT.md => docs/deployment.md (100%)
 rename DOCKER_QUICKSTART.md => docs/docker-quickstart.md (100%)
 rename DOCKER_TROUBLESHOOTING.md => docs/docker-troubleshooting.md (100%)
 rename docs/{MCP_SETUP.md => mcp-setup.md} (100%)

diff --git a/DEPLOYMENT.md b/docs/deployment.md
similarity index 100%
rename from DEPLOYMENT.md
rename to docs/deployment.md
diff --git a/DOCKER_QUICKSTART.md b/docs/docker-quickstart.md
similarity index 100%
rename from DOCKER_QUICKSTART.md
rename to docs/docker-quickstart.md
diff --git a/DOCKER_TROUBLESHOOTING.md b/docs/docker-troubleshooting.md
similarity index 100%
rename from DOCKER_TROUBLESHOOTING.md
rename to docs/docker-troubleshooting.md
diff --git a/docs/MCP_SETUP.md b/docs/mcp-setup.md
similarity index 100%
rename from docs/MCP_SETUP.md
rename to docs/mcp-setup.md

From 48de6b0d606a7b953c25b80a15eec4f889b18a6f Mon Sep 17 00:00:00 2001
From: Devanshu Rajesh Chicholikar <chicholikar.d@northeastern.edu>
Date: Thu, 8 Jan 2026 17:28:39 -0500
Subject: [PATCH 3/4] chore: add developer config files

- Add .nvmrc (Node 20)
- Add .python-version (Python 3.11)
- Add .editorconfig (consistent code style)
- Add .github/dependabot.yml (automated dependency updates)

Makes contributor setup easier and keeps dependencies fresh.

Part of #180
---
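Reviewer note: a minimal sketch of how a setup script might compare the pin in `.python-version` with the running interpreter. It assumes the file holds a plain dotted version such as `3.11`; other pin formats (for example pyenv build names) are not handled. The helper names `parse_pin`, `matches_pin`, and `check_python_pin` are hypothetical and not part of this repository.

```python
import sys
from pathlib import Path


def parse_pin(text):
    """Turn a pin such as '3.11' or '3.11.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def matches_pin(pin, running=None):
    """True when the running version starts with the pinned components."""
    # Default to the current interpreter; tests can pass an explicit tuple.
    running = tuple(sys.version_info[:3]) if running is None else running
    return running[:len(pin)] == pin


def check_python_pin(path=".python-version"):
    """Read the pin file (assumed plain dotted numbers) and compare it."""
    pin = parse_pin(Path(path).read_text(encoding="utf-8"))
    return matches_pin(pin)
```

A pin of `3.11` accepts any 3.11.x interpreter because only the pinned components are compared; a stricter policy would pin the patch version too.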
 .editorconfig          | 21 +++++++++++++++++++++
 .github/dependabot.yml | 34 ++++++++++++++++++++++++++++++++++
 .nvmrc                 |  1 +
 .python-version        |  1 +
 4 files changed, 57 insertions(+)
 create mode 100644 .editorconfig
 create mode 100644 .github/dependabot.yml
 create mode 100644 .nvmrc
 create mode 100644 .python-version

diff --git a/.editorconfig b/.editorconfig
new file mode 100644
index 0000000..b78aca5
--- /dev/null
+++ b/.editorconfig
@@ -0,0 +1,21 @@
+# EditorConfig helps maintain consistent coding styles
+# https://editorconfig.org
+
+root = true
+
+[*]
+indent_style = space
+indent_size = 2
+end_of_line = lf
+charset = utf-8
+trim_trailing_whitespace = true
+insert_final_newline = true
+
+[*.py]
+indent_size = 4
+
+[*.md]
+trim_trailing_whitespace = false
+
+[Makefile]
+indent_style = tab
diff --git a/.github/dependabot.yml b/.github/dependabot.yml
new file mode 100644
index 0000000..26ad928
--- /dev/null
+++ b/.github/dependabot.yml
@@ -0,0 +1,34 @@
+version: 2
+updates:
+  # Python dependencies
+  - package-ecosystem: "pip"
+    directory: "/backend"
+    schedule:
+      interval: "weekly"
+    commit-message:
+      prefix: "chore(deps)"
+    labels:
+      - "dependencies"
+      - "python"
+
+  # JavaScript dependencies
+  - package-ecosystem: "npm"
+    directory: "/frontend"
+    schedule:
+      interval: "weekly"
+    commit-message:
+      prefix: "chore(deps)"
+    labels:
+      - "dependencies"
+      - "javascript"
+
+  # GitHub Actions
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "monthly"
+    commit-message:
+      prefix: "chore(deps)"
+    labels:
+      - "dependencies"
+      - "ci"
diff --git a/.nvmrc b/.nvmrc
new file mode 100644
index 0000000..209e3ef
--- /dev/null
+++ b/.nvmrc
@@ -0,0 +1 @@
+20
diff --git a/.python-version b/.python-version
new file mode 100644
index 0000000..2c07333
--- /dev/null
+++ b/.python-version
@@ -0,0 +1 @@
+3.11

From ca7ed959d2c5159ec3c74dbbe9f0a0f70eeef431 Mon Sep 17 00:00:00 2001
From: Devanshu Rajesh Chicholikar <chicholikar.d@northeastern.edu>
Date: Thu, 8 Jan 2026 17:33:06 -0500
Subject: [PATCH 4/4] docs: complete README overhaul and fix internal links
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

README.md:
- Rename CodeIntel MCP to OpenCodeIntel (consistent branding)
- Add badges (CI, license, release)
- Add Quick Links navigation
- Remove emoji headers
- Cleaner structure: features → quickstart → docs
- Leave placeholder for logo and demo screenshot
- Concise, no fluff

CONTRIBUTING.md:
- Update repo name to opencodeintel
- Remove emoji

docs/:
- Fix cross-references to renamed files
- Convert to proper markdown links

Part of #180
---
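Reviewer note: a minimal sketch of how leftover mentions of the renamed guides could be found, assuming the old file names listed in patch 2/4. The names `OLD_NAMES`, `find_stale_refs`, and `scan_repo` are illustrative and not part of this repository. Matching is case-sensitive, so the new lowercase names are not flagged.

```python
import re
from pathlib import Path

# Old file names taken from patch 2/4; the helpers below are hypothetical.
OLD_NAMES = (
    "DEPLOYMENT.md",
    "DOCKER_QUICKSTART.md",
    "DOCKER_TROUBLESHOOTING.md",
    "MCP_SETUP.md",
)
STALE = re.compile("|".join(re.escape(name) for name in OLD_NAMES))


def find_stale_refs(text):
    """Return (line_number, old_name) pairs for each stale mention in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in STALE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def scan_repo(root="."):
    """Scan every Markdown file under root and map path -> stale mentions."""
    report = {}
    for path in Path(root).rglob("*.md"):
        hits = find_stale_refs(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report
```

`scan_repo` walks every directory under `root`, so a real check would probably skip vendored trees such as `node_modules`.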
 CONTRIBUTING.md                |  10 +-
 README.md                      | 308 ++++++++-------------------------
 docs/docker-quickstart.md      |   8 +-
 docs/docker-troubleshooting.md |   2 +-
 4 files changed, 86 insertions(+), 242 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 8f63c02..11f7191 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,13 +1,13 @@
-# Contributing to CodeIntel
+# Contributing to OpenCodeIntel
 
-First off, thanks for considering contributing! CodeIntel is better because of people like you.
+Thanks for considering contributing! OpenCodeIntel is better because of people like you.
 
 ## Quick Start
 
 ```bash
 # Fork the repo, then clone
-git clone https://github.com/YOUR_USERNAME/codeintel-mcp
-cd codeintel-mcp
+git clone https://github.com/YOUR_USERNAME/opencodeintel
+cd opencodeintel
 
 # Set up backend
 cd backend
@@ -138,4 +138,4 @@ Be respectful, constructive, and collaborative. We're all here to build somethin
 
 ---
 
-**Thanks for contributing! 🚀**
+**Thanks for contributing!**
diff --git a/README.md b/README.md
index a160343..a38b5eb 100644
--- a/README.md
+++ b/README.md
@@ -1,154 +1,99 @@
-# CodeIntel MCP
+<p align="center">
+  <!-- Logo placeholder - replace with actual logo -->
+  <h1 align="center">OpenCodeIntel</h1>
+</p>
+
+<p align="center">
+  <strong>AI-powered semantic code search for your repositories</strong>
+</p>
+
+<p align="center">
+  <a href="https://github.com/OpenCodeIntel/opencodeintel/actions/workflows/ci.yml">
+    <img src="https://github.com/OpenCodeIntel/opencodeintel/actions/workflows/ci.yml/badge.svg" alt="CI Status" />
+  </a>
+  <a href="https://github.com/OpenCodeIntel/opencodeintel/blob/main/LICENSE">
+    <img src="https://img.shields.io/github/license/OpenCodeIntel/opencodeintel" alt="License" />
+  </a>
+  <a href="https://github.com/OpenCodeIntel/opencodeintel/releases">
+    <img src="https://img.shields.io/github/v/release/OpenCodeIntel/opencodeintel?include_prereleases" alt="Release" />
+  </a>
+</p>
+
+<p align="center">
+  <a href="#quick-start">Quick Start</a> •
+  <a href="./docs/deployment.md">Deployment</a> •
+  <a href="./docs/mcp-setup.md">MCP Integration</a> •
+  <a href="./CONTRIBUTING.md">Contributing</a>
+</p>
 
-**MCP server for AI-powered codebase intelligence.** Semantic search, dependency analysis, and impact prediction for your repositories.
-
-## The Problem
-
-AI coding assistants are powerful, but they're flying blind in large codebases:
-- Can't semantically search across thousands of files
-- Don't understand dependency relationships
-- Can't predict what breaks when you change a file
-- Have no context on team coding patterns
-
-## The Solution
-
-CodeIntel is an MCP (Model Context Protocol) server that gives AI agents deep codebase understanding:
-
-```typescript
-// Ask Claude (via MCP):
-"Find authentication middleware in this repo"
-
-// CodeIntel semantically searches 10,000+ functions
-// Returns exact implementations, not keyword matches
-```
-
-**Built for production. Not a demo.**
-
-## Key Features
-
-### 🔍 Semantic Code Search
-Search by meaning, not keywords. Find `"error handling logic"` even if functions are named `processFailure()`.
-
-### 📊 Dependency Analysis  
-Visualize your entire codebase architecture. See which files are critical, which are isolated, and how everything connects.
+---
 
-### ⚡ Impact Prediction
-Before changing a file, know exactly what breaks:
-```
-src/auth/middleware.py
-└─ 15 files affected (HIGH RISK)
-   ├─ src/api/routes.py
-   ├─ src/services/user.py
-   └─ ... + 12 more
-```
+<!-- Demo screenshot placeholder -->
+<!-- <p align="center">
+  <img src="docs/assets/demo.png" alt="OpenCodeIntel Dashboard" width="800" />
+</p> -->
 
-### 🎨 Code Style Analysis
-Understand team patterns: naming conventions (camelCase vs snake_case), async adoption %, type hint usage.
+## What is OpenCodeIntel?
 
-### 🚀 Performance That Scales
+OpenCodeIntel gives AI coding assistants a deep understanding of your codebase. It's an MCP (Model Context Protocol) server that provides semantic code search, dependency analysis, and impact prediction.
 
-**Batch Processing:** 100x faster indexing
-- Before: 40+ min for 1,000 functions (individual API calls)
-- After: 22.9 sec (batch embedding requests)
+**Search by meaning, not keywords.** Find "error handling logic" even when functions are named `processFailure()`.
 
-**Incremental Indexing:** 700x faster re-indexing  
-- Full re-index: 51.4s
-- Incremental (git diff): 0.07s
-- Perfect for active development
+## Features
 
-**Supabase Caching:** 5x search speedup
-- Cold search: 800ms
-- Cached: 150ms
+- **Semantic Search** - Vector-based code search that understands intent
+- **Dependency Graph** - Visualize how your codebase connects
+- **Impact Analysis** - Know what breaks before you change a file
+- **Code Style Analysis** - Understand team patterns and conventions
+- **MCP Integration** - Works directly with Claude Desktop
 
 ## Quick Start
 
-### 🐳 Docker (Recommended)
-
-**Fastest way to get started:**
+### Using Docker (Recommended)
 
 ```bash
-# 1. Clone repo
 git clone https://github.com/OpenCodeIntel/opencodeintel.git
 cd opencodeintel
 
-# 2. Configure environment
 cp .env.example .env
-# Edit .env with your API keys
+# Add your API keys to .env
 
-# 3. Start everything
 docker compose up -d
-
-# Frontend: http://localhost:3000
-# Backend: http://localhost:8000
-# Docs: http://localhost:8000/docs
 ```
 
-**Full guide:** [`DOCKER_QUICKSTART.md`](./DOCKER_QUICKSTART.md)  
-**Troubleshooting:** [`DOCKER_TROUBLESHOOTING.md`](./DOCKER_TROUBLESHOOTING.md)
+- Frontend: http://localhost:3000
+- Backend: http://localhost:8000
+- API Docs: http://localhost:8000/docs
 
----
-
-### 📦 Manual Setup
+### Manual Setup
 
-### Prerequisites
-- Python 3.11+
-- Node.js 20+
-- OpenAI API key
-- Pinecone account
-- Supabase project
-
-### 1. Clone & Setup Backend
+**Requirements:** Python 3.11+, Node.js 20+, an OpenAI API key, a Pinecone account, and a Supabase project
 
 ```bash
+# Backend
 cd backend
 python -m venv venv
-source venv/bin/activate  # Windows: venv\Scripts\activate
+source venv/bin/activate
 pip install -r requirements.txt
-
-# Configure .env
 cp .env.example .env
-# Add your API keys to .env
-```
-
-### 2. Run Backend
-
-```bash
 python main.py
-# Server runs on http://localhost:8000
-```
 
-### 3. Setup Frontend
-
-```bash
+# Frontend (new terminal)
 cd frontend
 npm install
 npm run dev
-# UI at http://localhost:5173
-```
-
-### 4. Add a Repository
-
-```bash
-# Via API
-curl -X POST http://localhost:8000/api/repos \
-  -H "Authorization: Bearer dev-secret-key" \
-  -H "Content-Type: application/json" \
-  -d '{"name": "zustand", "git_url": "https://github.com/pmndrs/zustand"}'
-
-# Or use the web UI
 ```
 
 ## MCP Integration
 
-CodeIntel works as an MCP server with Claude Desktop. **[📚 Full MCP Setup Guide](./docs/MCP_SETUP.md)**
+Connect OpenCodeIntel to Claude Desktop for AI-powered code assistance.
 
-**Quick Setup:**
+Add to your Claude Desktop config:
 
 ```json
-// Add to Claude Desktop config
 {
   "mcpServers": {
-    "codeintel": {
+    "opencodeintel": {
       "command": "python",
       "args": ["/path/to/opencodeintel/mcp-server/server.py"],
       "env": {
@@ -160,142 +105,41 @@ CodeIntel works as an MCP server with Claude Desktop. **[📚 Full MCP Setup Gui
 }
 ```
 
-**Available MCP Tools:**
-| Tool | Description |
-|------|-------------|
-| `search_code` | Semantic code search - finds code by meaning |
-| `list_repositories` | View all indexed repos |
-| `get_dependency_graph` | Visualize architecture and file connections |
-| `analyze_code_style` | Team conventions and patterns |
-| `analyze_impact` | Know what breaks before you change it |
-| `get_repository_insights` | High-level codebase overview |
-
-Now ask Claude: *"What's the authentication logic in the user service?"* and it searches your actual codebase.
+**Available tools:** `search_code`, `list_repositories`, `get_dependency_graph`, `analyze_code_style`, `analyze_impact`, `get_repository_insights`
 
-**[→ Complete setup guide with troubleshooting](./docs/MCP_SETUP.md)**
+See [MCP Setup Guide](./docs/mcp-setup.md) for detailed instructions.
 
 ## Architecture
 
 ```
-┌─────────────┐
-│   Frontend  │  React + TypeScript + Tailwind
-│  (Vite app) │  Dependency graphs, search UI
-└──────┬──────┘
-       │
-┌──────▼──────┐
-│   FastAPI   │  Python backend
-│   Backend   │  /api/search, /api/repos/{id}/dependencies
-└──────┬──────┘
-       │
-       ├─────► Pinecone (vector search)
-       ├─────► OpenAI (embeddings)
-       ├─────► Supabase (persistence)
-       └─────► Redis (caching)
-```
-
-**Tech Stack:**
-- **Backend:** FastAPI, tree-sitter (AST parsing), OpenAI embeddings
-- **Vector DB:** Pinecone for semantic search
-- **Database:** Supabase (PostgreSQL) for metadata + caching
-- **Cache:** Redis for 5x search speedup
-- **Frontend:** React, TypeScript, Tailwind CSS, shadcn/ui, ReactFlow
-
-## Performance Benchmarks
-
-Real numbers from indexing the Zustand repository (1,174 functions):
-
-| Metric | Value |
-|--------|-------|
-| Full indexing | 29.5s (39.7 functions/sec) |
-| Incremental re-index | 0.07s (700x faster) |
-| Batch embedding | 22.9s for 1,174 functions |
-| Search (cold) | 800ms |
-| Search (cached) | 150ms |
-
-## Use Cases
-
-**For AI Agents (via MCP):**
-- Semantic code search during pair programming
-- Understanding unfamiliar codebases
-- Finding implementation patterns
-- Impact analysis before refactoring
-
-**For Development Teams:**
-- Onboarding new engineers (visualize architecture)
-- Code review prep (see change blast radius)
-- Tech debt identification (find highly coupled files)
-- Pattern enforcement (analyze style consistency)
-
-## What Makes This Different
-
-**Most code search tools:** Keyword matching (grep, GitHub search)  
-**CodeIntel:** Understands *meaning* - finds `error handling` even if the function is called `processFailure()`
-
-**Most dependency tools:** Static analysis only  
-**CodeIntel:** Combines AST parsing + semantic understanding + impact prediction
-
-**Most demos:** In-memory, doesn't scale  
-**CodeIntel:** Production-grade with Supabase persistence, Redis caching, incremental indexing
-
-## Deployment
-
-### 🐳 Local Development (Docker)
-```bash
-# Start all services
-make dev
-
-# Or using docker compose
-docker compose -f docker-compose.dev.yml up -d
-
-# Services available at:
-# - Backend: http://localhost:8000
-# - Frontend: http://localhost:3000
-# - API Docs: http://localhost:8000/docs
-```
-
-### ☁️ Production Deployment
-
-**Backend + Redis → Railway**
-```bash
-# Automated deployment
-./scripts/deploy-railway.sh
-
-# Or manually:
-railway login
-railway init
-railway up
+Frontend (React + TypeScript)
+     ↓
+Backend (FastAPI + Python)
+     ↓
+┌────┴────┬────────────┐
+Pinecone  Supabase    Redis
+(vectors) (database)  (cache)
 ```
 
-**Frontend → Vercel**
-```bash
-# Automated deployment
-./scripts/deploy-vercel.sh
+**Stack:** FastAPI, React, TypeScript, Pinecone, Supabase, Redis, tree-sitter
 
-# Or manually:
-cd frontend
-vercel --prod
-```
+## Documentation
 
-**📚 Full deployment guide:** See [DEPLOYMENT.md](DEPLOYMENT.md) for complete instructions, environment variables, and troubleshooting.
+| Guide | Description |
+|-------|-------------|
+| [Docker Quickstart](./docs/docker-quickstart.md) | Get running in 5 minutes |
+| [Deployment](./docs/deployment.md) | Production deployment guide |
+| [MCP Setup](./docs/mcp-setup.md) | Claude Desktop integration |
+| [Docker Troubleshooting](./docs/docker-troubleshooting.md) | Common issues and fixes |
 
 ## Contributing
 
-Built in a focused 2-week sprint to demonstrate production-grade AI development tooling.
+We welcome contributions! See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
 
-Contributions welcome! Areas for improvement:
-- Support for more languages (currently: Python, JS/TS)
-- Advanced graph algorithms (find circular dependencies, suggest refactorings)
-- GitHub integration (PR impact analysis)
-- Team analytics (who writes what patterns)
+**Quick links:**
+- [Open Issues](https://github.com/OpenCodeIntel/opencodeintel/issues)
+- [Good First Issues](https://github.com/OpenCodeIntel/opencodeintel/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
 
 ## License
 
-MIT License - use it, fork it, build on it.
-
-## Built With
-
-Commitment to shipping production-grade AI tools. Not a side project. Not a demo. Real infrastructure that scales.
-
----
-
-**Questions?** Open an issue or reach out.
+MIT License - see [LICENSE](./LICENSE) for details.
diff --git a/docs/docker-quickstart.md b/docs/docker-quickstart.md
index f0b113c..b888b83 100644
--- a/docs/docker-quickstart.md
+++ b/docs/docker-quickstart.md
@@ -116,7 +116,7 @@ lsof -i :8000
 **Issue:** Environment variables not found  
 **Fix:** Make sure `.env` exists in project root (not just backend/)
 
-**Full troubleshooting guide:** See `DOCKER_TROUBLESHOOTING.md`
+**Full troubleshooting guide:** See [docker-troubleshooting.md](./docker-troubleshooting.md)
 
 ## Development Mode
 
@@ -131,7 +131,7 @@ docker compose -f docker-compose.dev.yml up
 
 ## Next Steps
 
-- 📖 Read full deployment guide: `DEPLOYMENT.md`
+- Read full deployment guide: [deployment.md](./deployment.md)
 - 🚀 Deploy to Railway: `./scripts/deploy-railway.sh`
 - 🌐 Deploy to Vercel: `./scripts/deploy-vercel.sh`
 - 🧪 Run tests: See `backend/README.md`
@@ -186,11 +186,11 @@ Once local dev works, deploy to production:
    ./scripts/deploy-vercel.sh
    ```
 
-Full deployment guide: `DEPLOYMENT.md`
+Full deployment guide: [deployment.md](./deployment.md)
 
 ---
 
 **Need help?** 
-- 📖 Check `DOCKER_TROUBLESHOOTING.md`
+- Check [docker-troubleshooting.md](./docker-troubleshooting.md)
 - 🐛 Open an issue: https://github.com/OpenCodeIntel/opencodeintel/issues
 - 📝 See full docs: `README.md`
diff --git a/docs/docker-troubleshooting.md b/docs/docker-troubleshooting.md
index 2e922ee..62f3806 100644
--- a/docs/docker-troubleshooting.md
+++ b/docs/docker-troubleshooting.md
@@ -278,6 +278,6 @@ docker compose exec backend curl http://backend:8000/health
 
 1. Check GitHub Issues: https://github.com/OpenCodeIntel/opencodeintel/issues
 2. Run verification script: `./scripts/verify-setup.sh`
-3. Check DEPLOYMENT.md for step-by-step instructions
+3. Check [deployment.md](./deployment.md) for step-by-step instructions
 4. Make sure Docker Desktop has enough resources (Settings → Resources)
    - Recommended: 4GB RAM, 2 CPUs minimum