Skip to content

Commit d04b8fb

Browse files
committed
docs: update README with Docker quick start and project overview
- Add Docker quick start section at top of README - Link to DOCKER_QUICKSTART.md and DOCKER_TROUBLESHOOTING.md - Document complete feature set and performance benchmarks - Add use cases for both AI agents and development teams - Include production deployment section - Highlight production-grade architecture and scaling Key metrics showcased: - 100x faster batch indexing (40min → 22.9sec) - 700x faster incremental updates (51.4s → 0.07s) - 5x search speedup with caching (800ms → 150ms) Positions project as production-ready, not a demo or toy project Perfect for HN launch with technical depth and real-world metrics
1 parent 08143a6 commit d04b8fb

2 files changed

Lines changed: 292 additions & 2 deletions

File tree

LICENSE

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
MIT License
22

3-
Copyright (c) 2025 OpenCodeIntel
3+
Copyright (c) 2025 Devanshu Sharma
44

55
Permission is hereby granted, free of charge, to any person obtaining a copy
66
of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 291 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1,291 @@
1-
# opencodeintel
1+
# CodeIntel MCP
2+
3+
**MCP server for AI-powered codebase intelligence.** Semantic search, dependency analysis, and impact prediction for your repositories.
4+
5+
## The Problem
6+
7+
AI coding assistants are powerful, but they're flying blind in large codebases:
8+
- Can't semantically search across thousands of files
9+
- Don't understand dependency relationships
10+
- Can't predict what breaks when you change a file
11+
- Have no context on team coding patterns
12+
13+
## The Solution
14+
15+
CodeIntel is an MCP (Model Context Protocol) server that gives AI agents deep codebase understanding:
16+
17+
```typescript
18+
// Ask Claude (via MCP):
19+
"Find authentication middleware in this repo"
20+
21+
// CodeIntel semantically searches 10,000+ functions
22+
// Returns exact implementations, not keyword matches
23+
```
24+
25+
**Built for production. Not a demo.**
26+
27+
## Key Features
28+
29+
### 🔍 Semantic Code Search
30+
Search by meaning, not keywords. Find `"error handling logic"` even if functions are named `processFailure()`.
31+
32+
### 📊 Dependency Analysis
33+
Visualize your entire codebase architecture. See which files are critical, which are isolated, and how everything connects.
34+
35+
### ⚡ Impact Prediction
36+
Before changing a file, know exactly what breaks:
37+
```
38+
src/auth/middleware.py
39+
└─ 15 files affected (HIGH RISK)
40+
├─ src/api/routes.py
41+
├─ src/services/user.py
42+
└─ ... + 12 more
43+
```
44+
45+
### 🎨 Code Style Analysis
46+
Understand team patterns: naming conventions (camelCase vs snake_case), async adoption %, type hint usage.
47+
48+
### 🚀 Performance That Scales
49+
50+
**Batch Processing:** 100x faster indexing
51+
- Before: 40+ min for 1,000 functions (individual API calls)
52+
- After: 22.9 sec (batch embedding requests)
53+
54+
**Incremental Indexing:** 700x faster re-indexing
55+
- Full re-index: 51.4s
56+
- Incremental (git diff): 0.07s
57+
- Perfect for active development
58+
59+
**Supabase Caching:** 5x search speedup
60+
- Cold search: 800ms
61+
- Cached: 150ms
62+
63+
## Quick Start
64+
65+
### 🐳 Docker (Recommended)
66+
67+
**Fastest way to get started:**
68+
69+
```bash
70+
# 1. Clone repo
71+
git clone https://github.com/DevanshuNEU/v1--codeintel.git
72+
cd v1--codeintel
73+
74+
# 2. Configure environment
75+
cp .env.example .env
76+
# Edit .env with your API keys
77+
78+
# 3. Start everything
79+
docker compose up -d
80+
81+
# Frontend: http://localhost:3000
82+
# Backend: http://localhost:8000
83+
# Docs: http://localhost:8000/docs
84+
```
85+
86+
**Full guide:** [`DOCKER_QUICKSTART.md`](./DOCKER_QUICKSTART.md)
87+
**Troubleshooting:** [`DOCKER_TROUBLESHOOTING.md`](./DOCKER_TROUBLESHOOTING.md)
88+
89+
---
90+
91+
### 📦 Manual Setup
92+
93+
### Prerequisites
94+
- Python 3.11+
95+
- Node.js 20+
96+
- OpenAI API key
97+
- Pinecone account
98+
- Supabase project
99+
100+
### 1. Clone & Setup Backend
101+
102+
```bash
103+
cd backend
104+
python -m venv venv
105+
source venv/bin/activate # Windows: venv\Scripts\activate
106+
pip install -r requirements.txt
107+
108+
# Configure .env
109+
cp .env.example .env
110+
# Add your API keys to .env
111+
```
112+
113+
### 2. Run Backend
114+
115+
```bash
116+
python main.py
117+
# Server runs on http://localhost:8000
118+
```
119+
120+
### 3. Setup Frontend
121+
122+
```bash
123+
cd frontend
124+
npm install
125+
npm run dev
126+
# UI at http://localhost:5173
127+
```
128+
129+
### 4. Add a Repository
130+
131+
```bash
132+
# Via API
133+
curl -X POST http://localhost:8000/api/repos \
134+
-H "Authorization: Bearer dev-secret-key" \
135+
-H "Content-Type: application/json" \
136+
-d '{"name": "zustand", "git_url": "https://github.com/pmndrs/zustand"}'
137+
138+
# Or use the web UI
139+
```
140+
141+
## MCP Integration
142+
143+
CodeIntel works as an MCP server with Claude Desktop:
144+
145+
```json
146+
// Add to Claude Desktop config (~/.config/claude/config.json)
147+
{
148+
"mcpServers": {
149+
"codeintel": {
150+
"command": "python",
151+
"args": ["/path/to/pebble/mcp-server/server.py"]
152+
}
153+
}
154+
}
155+
```
156+
157+
**Available MCP Tools:**
158+
- `search_code` - Semantic code search
159+
- `list_repositories` - View indexed repos
160+
- `get_dependency_graph` - Analyze architecture
161+
- `analyze_code_style` - Team patterns
162+
- `analyze_impact` - Change impact prediction
163+
- `get_repository_insights` - Comprehensive metrics
164+
165+
Now ask Claude: *"What's the authentication logic in the user service?"* and it searches your actual codebase.
166+
167+
## Architecture
168+
169+
```
170+
┌─────────────┐
171+
│ Frontend │ React + TypeScript + Tailwind
172+
│ (Vite app) │ Dependency graphs, search UI
173+
└──────┬──────┘
174+
175+
┌──────▼──────┐
176+
│ FastAPI │ Python backend
177+
│ Backend │ /api/search, /api/repos/{id}/dependencies
178+
└──────┬──────┘
179+
180+
├─────► Pinecone (vector search)
181+
├─────► OpenAI (embeddings)
182+
├─────► Supabase (persistence)
183+
└─────► Redis (caching)
184+
```
185+
186+
**Tech Stack:**
187+
- **Backend:** FastAPI, tree-sitter (AST parsing), OpenAI embeddings
188+
- **Vector DB:** Pinecone for semantic search
189+
- **Database:** Supabase (PostgreSQL) for metadata + caching
190+
- **Cache:** Redis for 5x search speedup
191+
- **Frontend:** React, TypeScript, Tailwind CSS, shadcn/ui, ReactFlow
192+
193+
## Performance Benchmarks
194+
195+
Real numbers from indexing the Zustand repository (1,174 functions):
196+
197+
| Metric | Value |
198+
|--------|-------|
199+
| Full indexing | 29.5s (39.7 functions/sec) |
200+
| Incremental re-index | 0.07s (700x faster) |
201+
| Batch embedding | 22.9s for 1,174 functions |
202+
| Search (cold) | 800ms |
203+
| Search (cached) | 150ms |
204+
205+
## Use Cases
206+
207+
**For AI Agents (via MCP):**
208+
- Semantic code search during pair programming
209+
- Understanding unfamiliar codebases
210+
- Finding implementation patterns
211+
- Impact analysis before refactoring
212+
213+
**For Development Teams:**
214+
- Onboarding new engineers (visualize architecture)
215+
- Code review prep (see change blast radius)
216+
- Tech debt identification (find highly coupled files)
217+
- Pattern enforcement (analyze style consistency)
218+
219+
## What Makes This Different
220+
221+
**Most code search tools:** Keyword matching (grep, GitHub search)
222+
**CodeIntel:** Understands *meaning* - finds `error handling` even if the function is called `processFailure()`
223+
224+
**Most dependency tools:** Static analysis only
225+
**CodeIntel:** Combines AST parsing + semantic understanding + impact prediction
226+
227+
**Most demos:** In-memory, doesn't scale
228+
**CodeIntel:** Production-grade with Supabase persistence, Redis caching, incremental indexing
229+
230+
## Deployment
231+
232+
### 🐳 Local Development (Docker)
233+
```bash
234+
# Start all services
235+
make dev
236+
237+
# Or using docker compose
238+
docker compose -f docker-compose.dev.yml up -d
239+
240+
# Services available at:
241+
# - Backend: http://localhost:8000
242+
# - Frontend: http://localhost:3000
243+
# - API Docs: http://localhost:8000/docs
244+
```
245+
246+
### ☁️ Production Deployment
247+
248+
**Backend + Redis → Railway**
249+
```bash
250+
# Automated deployment
251+
./scripts/deploy-railway.sh
252+
253+
# Or manually:
254+
railway login
255+
railway init
256+
railway up
257+
```
258+
259+
**Frontend → Vercel**
260+
```bash
261+
# Automated deployment
262+
./scripts/deploy-vercel.sh
263+
264+
# Or manually:
265+
cd frontend
266+
vercel --prod
267+
```
268+
269+
**📚 Full deployment guide:** See [DEPLOYMENT.md](DEPLOYMENT.md) for complete instructions, environment variables, and troubleshooting.
270+
271+
## Contributing
272+
273+
Built in a focused 2-week sprint to demonstrate production-grade AI development tooling.
274+
275+
Contributions welcome! Areas for improvement:
276+
- Support for more languages (currently: Python, JS/TS)
277+
- Advanced graph algorithms (find circular dependencies, suggest refactorings)
278+
- GitHub integration (PR impact analysis)
279+
- Team analytics (who writes what patterns)
280+
281+
## License
282+
283+
MIT License - use it, fork it, build on it.
284+
285+
## Built With
286+
287+
Commitment to shipping production-grade AI tools. Not a side project. Not a demo. Real infrastructure that scales.
288+
289+
---
290+
291+
**Questions?** Open an issue or reach out.

0 commit comments

Comments
 (0)