From 49afb645e313582b652d1126f5694616ebe7b724 Mon Sep 17 00:00:00 2001 From: Devanshu Rajesh Chicholikar Date: Mon, 11 May 2026 10:38:00 -0400 Subject: [PATCH 1/2] chore: per-folder CLAUDE.md, OCI PR template, OCI issue forms Wires the OCI pipeline scaffolding into the repo so future sessions read stack-specific rules and contributors get OCI-grounded issue and PR templates by default. backend/CLAUDE.md - Python / FastAPI / Supabase / Pinecone, the 7 documented bugs, off-limits files (auth.py, startup_checks.py). frontend/CLAUDE.md - Bun, shadcn, React Query, Tailwind, design language, critical-files list, approved file-size exceptions. mcp-server/CLAUDE.md - FastMCP, dual transport, mcp pin, MCP_API_KEY env var, Bearer header space rule, structured-error policy, 500ms agent-loop budget. .github/PULL_REQUEST_TEMPLATE.md - replaces the generic OSS template with an OCI-specific one. Summary, type, ADR adherence checklist, how-to-test, verification, deployment notes (env vars, migrations, off-limits files, mcp version, /api/v1 contract), Paired-Ship Protocol tradeoff, risk and rollback. No screenshots, no generic checklist. .github/ISSUE_TEMPLATE/oci-issue.yml - internal work tracker. Wave / type / stack / priority / ADR-required as dropdowns. Why, what-ships, acceptance-criteria as required textareas. .github/ISSUE_TEMPLATE/dogfood-finding.yml - friction-while-using-OCI capture. Fast: what-I-tried / what-happened / what-expected / repro / tool. .github/ISSUE_TEMPLATE/ddia-finding.yml - OCI flaws spotted while reading DDIA. Chapter, concept, OCI surface, what-is-wrong (DDIA-grounded), three-pillar impact ratings. bug_report.yml and feature_request.yml unchanged - those serve external contributors and stay generic. --- .github/ISSUE_TEMPLATE/ddia-finding.yml | 122 ++++++++++++++++++++ .github/ISSUE_TEMPLATE/dogfood-finding.yml | 84 ++++++++++++++ .github/ISSUE_TEMPLATE/oci-issue.yml | 127 +++++++++++++++++++++ .github/PULL_REQUEST_TEMPLATE.md | 105 ++++++++++++----- backend/CLAUDE.md | 61 ++++++++++ frontend/CLAUDE.md | 83 ++++++++++++++ mcp-server/CLAUDE.md | 66 +++++++++++ 7 files changed, 619 insertions(+), 29 deletions(-) create mode 100644 .github/ISSUE_TEMPLATE/ddia-finding.yml create mode 100644 .github/ISSUE_TEMPLATE/dogfood-finding.yml create mode 100644 .github/ISSUE_TEMPLATE/oci-issue.yml create mode 100644 backend/CLAUDE.md create mode 100644 frontend/CLAUDE.md create mode 100644 mcp-server/CLAUDE.md diff --git a/.github/ISSUE_TEMPLATE/ddia-finding.yml b/.github/ISSUE_TEMPLATE/ddia-finding.yml new file mode 100644 index 0000000..b725af7 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/ddia-finding.yml @@ -0,0 +1,122 @@ +name: DDIA Finding +description: An OCI flaw spotted while reading Designing Data-Intensive Applications. Routes reading-time insight into shipped improvements. +title: "[ddia Ch _] " +labels: ["ddia-finding", "state/backlog"] +body: + - type: markdown + attributes: + value: | + Closes the loop between DDIA reading and OCI improvements. Different shape from `dogfood-finding`: + - `dogfood-finding` = "I used OCI's tools and they failed me" + - `ddia-finding` = "I read DDIA and realized OCI does X wrong" + + Be precise. Recruiters reading these issues should see clean DDIA fluency. + + - type: input + id: chapter + attributes: + label: DDIA chapter + description: Which chapter triggered this. Update the title prefix to match (e.g. `[ddia Ch 5]`). + placeholder: "Ch 5 - Replication" + validations: + required: true + + - type: input + id: concept + attributes: + label: Specific concept + description: Which section or term inside that chapter. Be precise. + placeholder: "Read-after-write consistency under async replication" + validations: + required: true + + - type: input + id: surface + attributes: + label: OCI surface affected + description: Which file, service, or design decision is wrong. Absolute path from repo root preferred. + placeholder: "backend/services/search.py, Pinecone upsert path" + validations: + required: true + + - type: checkboxes + id: stack + attributes: + label: Stack + options: + - label: backend + - label: frontend + - label: mcp-server + - label: cross-cutting + + - type: textarea + id: whats-wrong + attributes: + label: What is wrong (DDIA-grounded) + description: 2-4 sentences. Lead with the DDIA principle, then how OCI violates it. No vibes. + placeholder: | + DDIA Ch 5 argues that asynchronous replication breaks read-after-write within a session unless reads are pinned to the leader for a brief window. OCI's search endpoint writes to Pinecone (async upsert) then can re-read in the same request via the agent loop. Users see stale results immediately after indexing, even though the API returned 200. The right shape is reading metadata from Supabase (authoritative) within the staleness window, or pinning reads after writes. + validations: + required: true + + - type: textarea + id: fix-sketch + attributes: + label: Proposed fix sketch + description: 2-3 sentences OR write "needs ADR via /oci-design". Do not overdesign here. + placeholder: needs ADR via /oci-design - the right consistency boundary is non-trivial + + - type: textarea + id: acceptance + attributes: + label: DDIA-aligned acceptance criteria + description: Each criterion ties back to a DDIA principle. This is what makes the fix verifiably correct, not just "looks better." + placeholder: | + - [ ] Read-after-write within the same request returns the freshly indexed result (DDIA Ch 5 RYW guarantee) + - [ ] Cross-session reads document staleness window in the ADR + + - type: dropdown + id: reliability + attributes: + label: Reliability impact + description: How badly does the current state hurt reliability (DDIA Ch 1 pillar)? + options: + - none + - minor + - major + - critical + validations: + required: true + + - type: dropdown + id: scalability + attributes: + label: Scalability impact + description: How badly does the current state hurt scalability? + options: + - none + - minor + - major + - critical + validations: + required: true + + - type: dropdown + id: maintainability + attributes: + label: Maintainability impact + description: How badly does the current state hurt maintainability? + options: + - none + - minor + - major + - critical + validations: + required: true + + - type: input + id: notes + attributes: + label: Chapter notes file + description: Path to your chapter notes for cross-reference. + placeholder: "learning/ddia/chapter-5.md" diff --git a/.github/ISSUE_TEMPLATE/dogfood-finding.yml b/.github/ISSUE_TEMPLATE/dogfood-finding.yml new file mode 100644 index 0000000..e550798 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/dogfood-finding.yml @@ -0,0 +1,84 @@ +name: Dogfood Finding +description: A friction you hit while using OCI's own MCP tools to build OCI. Capture in 60 seconds. Triage later. +title: "[dogfood] " +labels: ["dogfood-finding", "state/backlog", "type/fix"] +body: + - type: markdown + attributes: + value: | + Dogfooding is the forcing function. When OCI's own tools fail you mid-build, file it here in 60 seconds and keep coding. Triage happens during the Sunday `/oci-dogfood` review. + + - type: textarea + id: tried + attributes: + label: What I tried + description: The exact tool call or workflow that triggered the friction. + placeholder: | + codeintel:search_code({ + "query": "where authentication happens", + "repo_id": "78aa181e-2bbb-438b-97ee-9ffd494c4815" + }) + validations: + required: true + + - type: textarea + id: happened + attributes: + label: What happened + description: Wrong result, slow result, error, or surprising behavior. Paste output if useful. + placeholder: Returned 50 results across non-auth files. The actual auth code at `backend/middleware/auth.py` was not in the top 20. + validations: + required: true + + - type: textarea + id: expected + attributes: + label: What I expected + description: What would have been useful. + placeholder: "`backend/middleware/auth.py` in the top 3 results, since it has 'auth' in the path and contains the auth functions." + validations: + required: true + + - type: textarea + id: repro + attributes: + label: Repro steps + description: Minimal steps for someone else (or future-you) to reproduce. + placeholder: | + 1. cd into OCI repo, ensure MCP server is connected + 2. Call codeintel:search_code with the query above + 3. Note auth.py is not in the top results + validations: + required: true + + - type: dropdown + id: tool + attributes: + label: Which OCI tool + description: The MCP tool that misbehaved. + options: + - codeintel:search_code + - codeintel:get_codebase_dna + - codeintel:analyze_impact + - codeintel:get_dependencies + - other (specify in body) + validations: + required: true + + - type: checkboxes + id: stack + attributes: + label: Stack the fix likely lives in + description: Best guess. Reviewer-arch will refine during triage. + options: + - label: backend + - label: frontend + - label: mcp-server + - label: cross-cutting + + - type: textarea + id: hypothesis + attributes: + label: Hypothesis (optional) + description: If you have a guess about the root cause, capture it so future-you does not re-derive. + placeholder: Search reranker weights "auth" too literally and misses semantic matches. Or include_paths filter is dropping middleware/. diff --git a/.github/ISSUE_TEMPLATE/oci-issue.yml b/.github/ISSUE_TEMPLATE/oci-issue.yml new file mode 100644 index 0000000..bdd59c8 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/oci-issue.yml @@ -0,0 +1,127 @@ +name: OCI Issue (internal) +description: Track an OCI work item (feat / refactor / chore / docs). For external bug reports use Bug Report; for external feature requests use Feature Request. +title: "[OCI] " +labels: ["state/backlog"] +body: + - type: markdown + attributes: + value: | + Internal OCI work tracker. Every issue here is a candidate for the OCI pipeline (`/oci-design` → `/oci-implement` → `/oci-pipeline` → `/oci-defend` → merge → `/oci-merge-done`). + + Be specific. Vague issues do not earn ADRs and they do not ship. + + - type: textarea + id: why + attributes: + label: Why this exists + description: What forces this work? Tie to a Wave priority, a dogfood-finding, a DDIA-finding, a customer report, or a security/perf bound that has been hit. One paragraph. No "we want to improve X" - name what changes if we do not ship this. + placeholder: | + Search latency is 3.6s p95 (OPE-121). MCP tools used in agent loops have a 500ms budget. Without fixing this, the search MCP tool is unusable inline and Greptile / Solid integrations are dead on arrival. + validations: + required: true + + - type: textarea + id: what + attributes: + label: What ships + description: Concrete deliverables, one per bullet. Each bullet is ONE file change or ONE user-visible behavior. Do not bundle. + placeholder: | + - Replace Cohere reranker with the on-prem cross-encoder in `backend/services/search/rerank.py` + - Add `RERANKER_BACKEND` env var to `backend/config/startup_checks.py` (warning only) + - Update `mcp-server/api_client.py` to surface reranker name in tool response metadata + validations: + required: true + + - type: textarea + id: acceptance + attributes: + label: Acceptance criteria + description: Testable outcomes only. Either a unit test exists OR a manual repro is documented. "Looks good" is not a criterion. "p95 latency under 500ms on /api/v1/search" is. + placeholder: | + - [ ] `/api/v1/search` p95 < 500ms measured over 100 requests on the OCI-on-OCI corpus + - [ ] Reranker name appears in MCP tool response metadata + - [ ] Backend tests pass: `cd backend && pytest tests/ -v` + validations: + required: true + + - type: dropdown + id: wave + attributes: + label: Wave + description: Which OCI thesis wave does this serve? `pre-thesis` if filed before `oci/strategy.md` is locked. + options: + - pre-thesis + - "1" + - "2" + - "3" + validations: + required: true + + - type: dropdown + id: type + attributes: + label: Type + description: Matches commit-message type. Apply the corresponding label after filing. + options: + - feat + - fix + - refactor + - perf + - docs + - test + - chore + validations: + required: true + + - type: checkboxes + id: stack + attributes: + label: Stack scope + description: Which stacks does this touch? Tick all that apply. + options: + - label: backend + - label: frontend + - label: mcp-server + - label: cross-cutting (ADR / vault / docs / multi-stack) + + - type: dropdown + id: priority + attributes: + label: Priority + options: + - urgent + - high + - medium + - low + default: 2 + validations: + required: true + + - type: dropdown + id: adr + attributes: + label: ADR required? + description: Yes for any feat or non-trivial change. No for small fixes / docs / chore. The OCI pipeline refuses to start `/oci-implement` if `adr-required` is set and no ADR exists. + options: + - "yes - run /oci-design before /oci-implement" + - "no - under ADR threshold" + validations: + required: true + + - type: textarea + id: dogfooding + attributes: + label: Dogfooding signal + description: Did using OCI on OCI surface this need? If yes, link the moment or describe the failure. If no, write "no - filed from {plan / brainstorm / customer / regression / DDIA reading}". + placeholder: "yes - codeintel:search_code returned irrelevant results when I queried 'auth fallback path' while debugging Bug #1" + + - type: textarea + id: related + attributes: + label: Related + description: Wave milestone (when thesis locks), prior ADRs, linked issues, external references. + placeholder: | + - Wave: pre-thesis (Phase 1 pipeline work) + - Prior ADRs: n/a + - Linked: # + - External: DDIA Ch 1 percentile-vs-average discussion diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 23e1e90..1c649b2 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -1,43 +1,90 @@ -## Description +## Summary - + -## Motivation - - -## Type of Change +## Type of change -- [ ] Bug fix (non-breaking change that fixes an issue) -- [ ] New feature (non-breaking change that adds functionality) -- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) -- [ ] Documentation update -- [ ] Performance improvement -- [ ] Code refactoring +- [ ] feat (new user-visible behavior) +- [ ] fix (bug fix) +- [ ] refactor (internal restructure, no behavior change) +- [ ] perf (measured performance improvement) +- [ ] docs (documentation only) +- [ ] test (test additions, no behavior change) +- [ ] chore (tooling, deps, CI) -## Testing +## Closes / implements - + -- [ ] Added new tests -- [ ] All existing tests pass -- [ ] Tested manually with the following scenarios: +- Closes # +- ADR: `oci/decisions/` (or `n/a - under ADR threshold`) +- Stack(s): backend / frontend / mcp-server (delete what does not apply) -## Checklist +## What changed -- [ ] My code follows the project's code style -- [ ] I have commented my code where necessary -- [ ] I have updated the documentation accordingly -- [ ] My changes generate no new warnings -- [ ] I have added tests that prove my fix/feature works -- [ ] New and existing tests pass locally -- [ ] I have checked my code for security issues + -## Screenshots (if applicable) +**Backend:** +- - +**Frontend:** +- -## Additional Context +**MCP server:** +- - +**Tests / docs:** +- + +## ADR adherence + + + +- [ ] Data model matches ADR +- [ ] Consistency guarantee matches ADR (strong / read-after-write / eventual) +- [ ] Latency target honored, measured: p50 ___ ms / p95 ___ ms (ADR target: ___) +- [ ] Blast radius is what the ADR predicted +- [ ] All ADR acceptance criteria pass + +## How to test + + + +1. +2. + +Expected: + +## Verification + +- [ ] `/oci-pipeline` ran clean (BLOCKING: 0) +- [ ] CodeRabbit comments addressed (or explicit dismiss with reason) +- [ ] `/oci-defend` 7/7 passed if this is a feat / non-trivial change +- [ ] Backend tests pass: `cd backend && pytest tests/ -v` +- [ ] Frontend tests pass: `cd frontend && bun run typecheck && bun run test && bun run build` +- [ ] MCP tests pass: `cd mcp-server && pytest tests/ -v` (or `n/a` if mcp-server untouched) + +## Deployment notes + + + +- [ ] No new env vars (or list them and where they go: Railway backend / Railway mcp-server / Vercel) +- [ ] No DB migration (or paste the migration filename) +- [ ] No off-limits files modified (`backend/middleware/auth.py`, `backend/config/startup_checks.py`, `bun.lockb`) +- [ ] No `mcp` package version change (still `>=1.25.0,<2.0.0`) +- [ ] No breaking change to `/api/v1/*` contracts + +## Tradeoff (Paired-Ship Protocol) + + + + + +## Risk and rollback + +- **Symptom if wrong:** +- **Blast radius:** +- **Rollback path:** revert this PR / feature flag / migration revert / config change +- **Time-to-rollback:** < 5 min / < 30 min / requires migration revert diff --git a/backend/CLAUDE.md b/backend/CLAUDE.md new file mode 100644 index 0000000..8c21197 --- /dev/null +++ b/backend/CLAUDE.md @@ -0,0 +1,61 @@ +# backend/ Rules + +Python 3.11+, FastAPI, async I/O. Supabase (PostgreSQL + RLS), Pinecone (vector store), Redis (cache), tree-sitter (AST), OpenAI (embeddings). + +Root rules in `../CLAUDE.md` apply. This file adds backend-specific rules. + +## Code + +- Type hints on every function signature. No `Any` without a comment explaining why. +- PEP 8, max 120 char lines, snake_case for functions and variables. +- `async def` for any I/O. Sync only for pure CPU work. +- Inside `async def`, never block on sync I/O. Use `httpx.AsyncClient` (not `requests`) and `asyncio.to_thread` for CPU-bound work. +- Prefer files under 200 lines. Existing exceptions documented in root CLAUDE.md. + +## Architecture + +- Layer order: `routes/` calls `services/`. `services/` may call `middleware/auth.py` via `Depends(require_auth)` or `Depends(public_auth)`. Routes never call Pinecone / Supabase / Redis directly. +- All endpoints under `/api/v1/`. Use `APIRouter(prefix="/api/v1/...")` consistently. +- New services follow the singleton pattern. Canonical example: `services/dependency_analyzer.py`. +- Domain exceptions live in `middleware/auth.py` (auth) or each service module. Use `HTTPException` for API surface errors. Log before re-raise. + +## Graceful degradation (documented bug class) + +- Wrap optional dependency imports in `try / except ModuleNotFoundError`. Hard imports crash startup. See Bug #3 (tree-sitter-typescript) and Bug #4 (Pinecone init). +- External client init: never run at module scope. Lazy-init or wrap in `try / except` so an unreachable service does not take down the API. + +## Auth + +- `middleware/auth.py` is off-limits. Read it; never modify it. +- JWT validation MUST return `None` on `AuthenticationError`, never raise. Raising blocks the API-key fallback. See Bug #1. +- API keys: raw key (`ci_*` prefix) returned to user once on creation. Database stores SHA-256 hash only. See Bug #7. + +## Supabase + +- Row Level Security enabled on every table. New tables get an explicit `user_id = auth.uid()` policy or admin-only policy. +- Never bypass RLS via service-role client unless the route is explicitly admin-scoped (and documented as such). +- IDs: `UUID DEFAULT gen_random_uuid()`. Timestamps: `TIMESTAMPTZ`. No bigint IDs, no naive datetimes. +- Cascade deletes on parent-child relationships to prevent orphan rows. + +## Pinecone + +- Async upserts. No read-after-write guarantee within a request for the same vector. If a route writes then reads, use Supabase metadata as the authoritative store. +- Connection-refused at module init is a Bug #4 trap. Use a lazy connection. + +## include_paths and path filters + +- Use `Path.parts` tuple comparison, NEVER `str.startswith`. See Bug #5. +- `[]` is falsy in Python. Use `is not None` to test "configured", not truthiness. See Bug #6. + +## Tests + +- pytest in `tests/`. Run: `cd backend && pytest tests/ -v`. 392+ tests at HEAD. +- Mock fidelity matters more than coverage: a `Mock(return_value="result")` for Pinecone passes the test but does not reflect production. Reviewer-qa flags this. +- Every authed endpoint needs both a "JWT valid" test AND a "JWT invalid -> API key fallback works" test. The Bug #1 gap was the missing fallback test. + +## Off-limits files in backend/ + +- `middleware/auth.py` +- `config/startup_checks.py` + +If an ADR requires changing either, STOP and amend the ADR first. diff --git a/frontend/CLAUDE.md b/frontend/CLAUDE.md new file mode 100644 index 0000000..f6b143e --- /dev/null +++ b/frontend/CLAUDE.md @@ -0,0 +1,83 @@ +# frontend/ Rules + +React 18, TypeScript, Vite, Bun. Tailwind, shadcn/ui, TanStack Query, Vitest. + +Root rules in `../CLAUDE.md` apply. This file adds frontend-specific rules. + +## Package manager + +- **Bun only.** Never npm, never yarn. +- `bun install`, `bun run dev`, `bun run build`, `bun run typecheck`, `bun run test`. +- Never delete `bun.lock`. Never create `package-lock.json` or `yarn.lock`. + +## React patterns + +- Functional components with hooks. `ErrorBoundary` is the ONLY allowed class component (React requires class for error boundaries). +- Define components at module scope. Never inside another component (causes remounts on every parent render). +- Derive state during render. Avoid `useEffect` for state syncing. +- Stable keys (`item.id`), never array indices, on lists that mutate. +- Colocate state with the component that uses it. Lift only when needed. + +## Data fetching + +- All server data goes through TanStack Query (`useQuery` / `useMutation`). +- Never raw `fetch` in `useEffect`. +- Set `staleTime` based on freshness needs. `useQuery` without `staleTime` refetches on every focus. +- Custom hooks in `src/hooks/` with `use` prefix. Canonical example: `useUserUsage`. + +## Components and styling + +- Prefer shadcn/ui components over custom. `bun run shadcn add ` for new ones. +- Tailwind only. No CSS files, no styled-components. +- `cn()` from `@/lib/utils` for conditional class merging. Never template literals. +- Lists over 100 items need virtualization (`@tanstack/react-virtual` or shadcn equivalent). + +## Types + +- Shared interfaces in `src/types.ts`. Never duplicate inline. +- Prefer `interface` over `type` for object shapes (better error messages). +- Strict `tsconfig`. `any` requires a comment explaining why. + +## Design language + +- Dark theme. +- Subtle borders: `border-border/60`. +- Low-opacity card backgrounds: `bg-card/40`. +- Tier-colored accents: amber for enterprise, indigo for pro, zinc for free. +- JetBrains Mono for code, system font stack for text. +- Full-width layouts. No `max-w-4xl` constraints unless a specific page genuinely needs it. +- Never generate "AI-looking" generic designs (no purple gradients on white). + +## Tests + +- Vitest + happy-dom. Tests live in `src/test/` with `@/` alias imports. +- Run: `cd frontend && bun run typecheck && bun run test && bun run build`. +- happy-dom has limits: no real navigation, limited Canvas/WebGL. WebGL graph tests must mock Sigma.js. +- Every component test that uses TanStack Query must wrap with a fresh `QueryClientProvider` per test. +- Async UI assertions: `waitFor`, not `setTimeout`. + +## Critical files (most dependents, change carefully) + +These have the most dependents per the latest AGENTS.md run. Understand before modifying: + +- `components/ui/Skeleton.tsx` +- `components/ui/button.tsx` +- `components/ui/dropdown-menu.tsx` +- `components/landing/GitHubStars.tsx` +- `components/landing/ThemeToggle.tsx` +- `components/docs/DocsSearch.tsx` +- `components/dashboard/DashboardLayout.tsx` +- `components/dashboard/DashboardHome.tsx` + +## Approved file-size exceptions + +Most files stay under 200 lines. These are approved larger due to logical cohesion: + +- `components/DependencyGraph/GraphView/index.tsx` +- `components/DependencyGraph/MatrixView/index.tsx` + +## Off-limits files in frontend/ + +- `bun.lockb` + +Never delete or hand-edit. Let Bun manage it. diff --git a/mcp-server/CLAUDE.md b/mcp-server/CLAUDE.md new file mode 100644 index 0000000..4af2f9e --- /dev/null +++ b/mcp-server/CLAUDE.md @@ -0,0 +1,66 @@ +# mcp-server/ Rules + +FastMCP, dual transport (stdio + streamable-http), deployed to Railway as `marvelous-endurance` at `mcp.opencodeintel.com`. + +Root rules in `../CLAUDE.md` apply. This file adds MCP-server-specific rules. + +## Stack + +- `mcp>=1.25.0,<2.0.0` PINNED. Uses the private `_mcp_server` API. Do NOT upgrade to 2.x without a strategic ADR. +- Python 3.11+, type hints, async I/O. +- httpx for backend calls (`httpx.AsyncClient`). Never `requests`. +- pytest for tests. + +## File layout + +- `server.py` - entrypoint. Sets up dual transport. Wires handlers. Keep thin. +- `tools.py` - tool DEFINITIONS only (name, description, input schema). One source of truth for what tools exist. +- `handlers.py` - tool IMPLEMENTATION. Each handler maps to one tool definition. +- `api_client.py` - the HTTP client wrapping the backend's `/api/v1/*` API. All backend calls go through here. +- `formatters.py` - output formatting helpers. +- `config.py` - env validation, config dataclass. + +Never inline tool logic in `server.py`. The schema/handler split is the contract. + +## Transport + +- Dual transport: stdio (local Claude Desktop / Cursor / Cline) and streamable-http (production). +- Any change to handlers must work on BOTH. If a tool's response is large and uses streaming, document which transport supports it. +- stdio cannot stream cleanly. Tools that return async generators break local clients. + +## Auth and env + +- `MCP_API_KEY`, NEVER `API_KEY`. Railway shares env vars across services; backend uses `API_KEY=dev-secret-key` and an MCP-server `API_KEY` reference will collide. See Bug #2. +- `Authorization: Bearer {key}` header MUST include the space. `f"Bearer{key}"` is the PR #292 / commit `df958de` regression. Use f-strings carefully or extract to a helper. +- Every tool call goes through auth. No "trusted MCP context" that bypasses validation. + +## Tool design + +- Tool names are verbs (`search_code`, `analyze_impact`, `get_codebase_dna`), not nouns. Agent prompts read better. +- Every tool declares a JSON input schema with required fields, types, and enums. No `Any`. +- Tool output: deterministic string OR structured JSON. Never a generator unless the transport supports streaming AND the tool is documented as streaming-only. +- Idempotency: a tool called twice with the same input produces the same output (or the same structured error). Stateful tools must declare their stateful nature in the schema description. +- Tool composability: if tool A's output is intended as tool B's input, the schemas must align explicitly. Drift here breaks agent chaining. + +## Errors + +- Structured errors, never stack traces. A tool that raises `Exception` to the MCP transport leaks internals and crashes the agent. Catch, classify, return `{"error": "...", "code": "..."}`. +- Backend unreachable: return a structured error within 2s. Never hang on the default httpx 30s timeout. +- Rate-limited by backend: propagate the rate-limit signal to the agent (MCP-level retry hint), not a 500. + +## Latency budget + +- Tools used in agent loops (search, get_codebase_dna): target p95 < 500ms. +- Search currently inherits Cohere reranking (3.6s p95, OPE-121). Any new tool that wraps search inherits this problem until the bottleneck is fixed. +- Document the latency profile per tool in the schema description if it exceeds 500ms. + +## Tests + +- pytest in `tests/`. Run: `cd mcp-server && pytest tests/ -v`. 45+ tests at HEAD. +- Schema validation tests: every tool must have a test for malformed input that returns a structured error (not a crash). +- If a change touches transport code, test both stdio and streamable-http paths. +- Authorization header test: explicit regression for `Bearer {key}` with the space. The PR #292 bug must never recur. + +## Off-limits files in mcp-server/ + +None at the module level. Treat the `mcp>=1.25.0,<2.0.0` pin in `requirements.txt` as off-limits without an ADR. From cb05f2ee133018916202637c425a29e7e79e88dd Mon Sep 17 00:00:00 2001 From: Devanshu Rajesh Chicholikar Date: Mon, 11 May 2026 11:09:21 -0400 Subject: [PATCH 2/2] fix: review findings (MD037 underscores, bun.lock consistency) - PR template line 47: wrap latency placeholders in backticks so MD037 does not parse `___` as emphasis. - frontend/CLAUDE.md off-limits section: change `bun.lockb` to `bun.lock` to match the actual lockfile on disk and the rest of this file. Reviewer suggested the reverse, but verification against the working tree shows the text-format `bun.lock` is what Bun is using here. Skipped review finding: convert bold labels under "What changed" to ATX headings. Those labels are inline bold text, not Setext headings, so MD003 does not apply. Adding `###` headers per stack would add visual weight that contradicts the "lighter, less generic" direction this template was built for. Note: the root CLAUDE.md also has a `bun.lockb` reference inconsistency (off-limits section). Leaving that for a separate follow-up since it is not part of this PR. --- .github/PULL_REQUEST_TEMPLATE.md | 2 +- frontend/CLAUDE.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 1c649b2..9d0deb0 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -44,7 +44,7 @@ - [ ] Data model matches ADR - [ ] Consistency guarantee matches ADR (strong / read-after-write / eventual) -- [ ] Latency target honored, measured: p50 ___ ms / p95 ___ ms (ADR target: ___) +- [ ] Latency target honored, measured: p50 `` ms / p95 `` ms (ADR target: ``) - [ ] Blast radius is what the ADR predicted - [ ] All ADR acceptance criteria pass diff --git a/frontend/CLAUDE.md b/frontend/CLAUDE.md index f6b143e..b2eb357 100644 --- a/frontend/CLAUDE.md +++ b/frontend/CLAUDE.md @@ -78,6 +78,6 @@ Most files stay under 200 lines. These are approved larger due to logical cohesi ## Off-limits files in frontend/ -- `bun.lockb` +- `bun.lock` Never delete or hand-edit. Let Bun manage it.