
feat: add PageRank ranking, architecture summary, and token-budgeted responses #147

Open
maplenk wants to merge 5 commits into DeusData:main from maplenk:feat/pagerank-arch-summary

@maplenk commented Mar 26, 2026

Summary

Adds structural importance ranking (PageRank), a one-call architecture overview tool, and token-budgeted responses to prevent context window overflow.

New tools

  • get_architecture_summary — Structured markdown overview of the project: top files by connectivity, route→controller→service chains, Louvain clusters, high fan-in functions, entry points. Supports max_tokens for output size control and focus for narrowing to a specific area.

  • get_key_symbols — Returns top-K functions/classes ranked by PageRank. Enables "what are the most important functions in this codebase?" queries.

Enhanced tools

  • search_graph — New ranked parameter (default true). When enabled, results are sorted by PageRank score. PageRank included in response JSON.

  • trace_call_path — New ranked parameter. BFS results post-sorted by PageRank when enabled.

  • search_graph, trace_call_path, query_graph — New max_tokens parameter. Two-tier truncation: top 5 results in full detail, remainder as compact signatures. Emits truncated, total_results, shown metadata.
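The two-tier truncation described above could be sketched roughly as follows. This is an illustration, not the PR's code: the `Result` and `TruncMeta` types, the `render_budgeted` function, and the 4-characters-per-token estimator are all assumptions. Only the "top 5 in full detail, remainder compact" rule and the `truncated`, `total_results`, `shown` metadata names come from the description. The sketch also builds the full output first and checks the budget afterwards, so an in-budget response costs no extra pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FULL_DETAIL_COUNT 5 /* from the PR text: top 5 results in full detail */

/* Hypothetical result record: two pre-rendered forms of the same hit. */
typedef struct {
    const char *full;    /* full-detail rendering */
    const char *compact; /* compact signature */
} Result;

/* Metadata names mirror the PR description. */
typedef struct { int truncated; int total_results; int shown; } TruncMeta;

/* Assumed estimator: about 4 characters per token, rounded up. */
static size_t estimate_tokens(size_t chars) { return (chars + 3) / 4; }

/* Append s plus a newline; returns 0 if it would not fit in the buffer. */
static int append_line(char *out, size_t cap, size_t *len, const char *s) {
    size_t n = strlen(s);
    if (*len + n + 2 > cap) return 0; /* text + '\n' + terminator */
    memcpy(out + *len, s, n);
    out[*len + n] = '\n';
    *len += n + 1;
    out[*len] = '\0';
    return 1;
}

TruncMeta render_budgeted(const Result *r, int n, size_t max_tokens,
                          char *out, size_t cap) {
    TruncMeta meta = {0, n, 0};
    size_t len = 0;
    assert(out != NULL && cap > 0);
    out[0] = '\0';

    /* Pass 1: build everything in full, then check the budget. */
    int ok = 1;
    for (int i = 0; i < n && ok; i++)
        ok = append_line(out, cap, &len, r[i].full);
    if (ok && estimate_tokens(len) <= max_tokens) {
        meta.shown = n;
        return meta;
    }

    /* Pass 2: over budget, so rebuild with the two-tier layout. */
    meta.truncated = 1;
    len = 0;
    out[0] = '\0';
    for (int i = 0; i < n; i++) {
        const char *s = i < FULL_DETAIL_COUNT ? r[i].full : r[i].compact;
        if (estimate_tokens(len + strlen(s) + 1) > max_tokens) break;
        if (!append_line(out, cap, &len, s)) break;
        meta.shown++;
    }
    return meta;
}
```

A caller would pass results already sorted by rank, so whatever survives truncation is the highest-ranked portion of the result set.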

Implementation details

  • PageRank: standard iterative algorithm (d=0.85, 20 iterations) with dangling node handling. Persisted in node_scores table. Runs as pipeline post-processing step. Non-fatal on failure.
  • Architecture summary: SQL queries against existing graph — no new indexing. Hash table lookups for O(1) file resolution. yyjson route property extraction.
  • Token budget: build-then-check approach (zero overhead on happy path). Compact chain summaries (A → ... (3 more) → Z) for truncated traces.
  • WAL-mode fix: read-only query opens use immutable SQLite URIs (fixes corrupt DB misclassification).
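For readers unfamiliar with the PageRank step, a minimal in-memory sketch of the iterative algorithm with dangling-node handling is shown below. It is not taken from the PR: the `Edge` type and `pagerank` function are hypothetical, the graph is a plain edge array rather than SQLite rows, and spreading dangling mass uniformly over all nodes is one common choice that the PR may or may not use. The damping factor and iteration count are parameters so the description's `d=0.85` and 20 iterations can be passed in.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical edge record: a call or reference from src to dst. */
typedef struct { int src; int dst; } Edge;

/* Iterative PageRank over n nodes and m edges.
 * Nodes with no outgoing edges (dangling nodes) spread their rank
 * uniformly over all nodes, so the ranks keep summing to 1.
 * Writes n scores into rank; returns 0 on success, -1 on failure. */
int pagerank(int n, const Edge *edges, int m, double d, int iters,
             double *rank) {
    assert(rank != NULL);
    assert(edges != NULL || m == 0);
    if (n <= 0) return -1;

    int *outdeg = calloc((size_t)n, sizeof *outdeg);
    double *next = malloc((size_t)n * sizeof *next);
    if (!outdeg || !next) { free(outdeg); free(next); return -1; }

    /* Validate edges and count out-degrees. */
    for (int e = 0; e < m; e++) {
        if (edges[e].src < 0 || edges[e].src >= n ||
            edges[e].dst < 0 || edges[e].dst >= n) {
            free(outdeg); free(next);
            return -1;
        }
        outdeg[edges[e].src]++;
    }

    for (int i = 0; i < n; i++) rank[i] = 1.0 / n;

    for (int it = 0; it < iters; it++) {
        /* Rank held by dangling nodes in this iteration. */
        double dangling = 0.0;
        for (int i = 0; i < n; i++)
            if (outdeg[i] == 0) dangling += rank[i];

        /* Teleport term plus each node's share of the dangling mass. */
        double base = (1.0 - d) / n + d * dangling / n;
        for (int i = 0; i < n; i++) next[i] = base;

        /* Each node splits its rank evenly among its out-edges. */
        for (int e = 0; e < m; e++)
            next[edges[e].dst] +=
                d * rank[edges[e].src] / outdeg[edges[e].src];

        for (int i = 0; i < n; i++) rank[i] = next[i];
    }

    free(outdeg);
    free(next);
    return 0;
}
```

Calling `pagerank(n, edges, m, 0.85, 20, rank)` matches the parameters quoted above. A non-zero return can be treated as non-fatal by the caller, in the spirit of the pipeline behavior described in the list.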

Tests

  • test_store_arch.c: architecture summary (basic, focus, many_files, cluster_growth)
  • test_store_search.c: PageRank computation + ranking
  • test_mcp.c: get_key_symbols, ranked search, truncation for all 3 tools
  • test_pipeline.c: PageRank in pipeline
  • test_integration.c: live index tests

Motivation

AI coding agents consume 7–38% of context window per structural query. PageRank ranking ensures the most important results appear first. Token budgets let agents request "give me the answer in under 2000 tokens." Architecture summaries eliminate entire categories of exploratory queries — one call replaces 3–5 tool invocations.

Benchmarked on a 32K-node / 70K-edge production Laravel codebase.


Part 1 of a 4-PR series. PRs 2–4 build on this foundation.


Built with OpenAI Codex and Claude Code.

Naman Khator added 5 commits March 26, 2026 12:19
Account for optional signatures in the search_graph and trace_call_path size estimators, and improve compact trace chains to report omitted-node counts.

This also documents the normal-path output enrichment introduced with Task 4: search_graph results now include file_path, start_line, end_line, and signature, and trace_call_path hop items now include file_path, start_line, and signature.
maplenk added a commit to maplenk/codebase-memory-mcp that referenced this pull request Mar 26, 2026
All install paths, download URLs, self-update checks, CI workflows,
and documentation now reference maplenk/codebase-memory-mcp so the
fork can operate independently with its own releases while upstream
PRs (DeusData#147-DeusData#150) are pending. Upstream attribution in README fork
section and LICENSE preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@DeusData DeusData added the enhancement New feature or request label Mar 26, 2026
@DeusData (Owner) commented

Thanks @maplenk — PageRank for code ranking and architecture summaries is a great idea. Large PR — will review carefully.

