feat: add PageRank ranking, architecture summary, and token-budgeted responses #147
Open
maplenk wants to merge 5 commits into DeusData:main
added 5 commits
March 26, 2026 12:19
Account for optional signatures in the search_graph and trace_call_path size estimators, and improve compact trace chains to report omitted-node counts. This also documents the normal-path output enrichment introduced with Task 4: search_graph results now include file_path, start_line, end_line, and signature, and trace_call_path hop items now include file_path, start_line, and signature.
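The omitted-node count mentioned in this commit message can be pictured with a short C sketch. It is only an illustration under stated assumptions: the function name compact_chain, its signature, and the rule that chains longer than two nodes keep only their endpoints are hypothetical and are not taken from the PR's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: renders a call chain compactly into buf.
 * Chains of one or two nodes are printed whole. Longer chains keep only
 * the first and last node and report how many nodes were omitted
 * (assumed rule, chosen to reproduce the "A → ... (3 more) → Z" shape).
 * Returns the snprintf-style length, or -1 for an empty chain.
 */
int compact_chain(const char *const *names, size_t n, char *buf, size_t buflen)
{
    assert(names != NULL && buf != NULL);

    if (n == 0)
        return -1;
    if (n == 1)
        return snprintf(buf, buflen, "%s", names[0]);
    if (n == 2)
        return snprintf(buf, buflen, "%s → %s", names[0], names[1]);

    /* n - 2 interior nodes are dropped and counted instead. */
    return snprintf(buf, buflen, "%s → ... (%zu more) → %s",
                    names[0], n - 2, names[n - 1]);
}
```

For a five-node chain from A to Z, this sketch writes the endpoints and "(3 more)" for the three interior nodes, which matches the shape quoted in the PR description.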
maplenk added a commit to maplenk/codebase-memory-mcp that referenced this pull request on Mar 26, 2026
All install paths, download URLs, self-update checks, CI workflows, and documentation now reference maplenk/codebase-memory-mcp so the fork can operate independently with its own releases while upstream PRs (DeusData#147-DeusData#150) are pending. Upstream attribution in README fork section and LICENSE preserved. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Owner
Thanks @maplenk. PageRank for code ranking and architecture summaries is a great idea. Large PR, will review carefully.
Summary
Adds structural importance ranking (PageRank), a one-call architecture overview tool, and token-budgeted responses to prevent context window overflow.
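The PR description does not include its PageRank code. As a rough illustration of what structural importance ranking means, here is a minimal sketch of textbook power-iteration PageRank over a call graph in C. The Edge type, the pagerank function, the PR_MAX_NODES cap, the damping value, and the iteration count are all assumptions made for this example, not names or settings from the PR.

```c
#include <assert.h>
#include <stddef.h>

#define PR_MAX_NODES 64 /* assumption: small fixed cap to keep the sketch simple */

/* Hypothetical edge type: caller index -> callee index. */
typedef struct {
    int src;
    int dst;
} Edge;

/*
 * Textbook power-iteration PageRank.
 * rank[] receives one score per node, and the scores sum to 1.
 * Rank held by nodes with no outgoing edges is spread evenly over all nodes.
 * Returns 0 on success, -1 on invalid input.
 */
int pagerank(size_t n, const Edge *edges, size_t m,
             double damping, int iterations, double *rank)
{
    double next[PR_MAX_NODES];
    int outdeg[PR_MAX_NODES] = {0};

    assert(rank != NULL);
    if (n == 0 || n > PR_MAX_NODES)
        return -1;

    /* Validate edges and count outgoing edges per node. */
    for (size_t e = 0; e < m; e++) {
        if (edges[e].src < 0 || (size_t)edges[e].src >= n ||
            edges[e].dst < 0 || (size_t)edges[e].dst >= n)
            return -1;
        outdeg[edges[e].src]++;
    }

    for (size_t i = 0; i < n; i++)
        rank[i] = 1.0 / (double)n;

    for (int it = 0; it < iterations; it++) {
        double dangling = 0.0;
        for (size_t i = 0; i < n; i++)
            if (outdeg[i] == 0)
                dangling += rank[i];

        /* Teleport share plus the evenly spread dangling mass. */
        double base = (1.0 - damping) / (double)n
                    + damping * dangling / (double)n;
        for (size_t i = 0; i < n; i++)
            next[i] = base;

        /* Each caller passes its rank to its callees in equal parts. */
        for (size_t e = 0; e < m; e++)
            next[edges[e].dst] +=
                damping * rank[edges[e].src] / (double)outdeg[edges[e].src];

        for (size_t i = 0; i < n; i++)
            rank[i] = next[i];
    }
    return 0;
}
```

In this sketch a function called by many others accumulates a higher score than its callers, which is the property a ranked search result order would rely on.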
New tools
- get_architecture_summary: structured markdown overview of the project (top files by connectivity, route→controller→service chains, Louvain clusters, high fan-in functions, entry points). Supports max_tokens for output size control and focus for narrowing to a specific area.
- get_key_symbols: Returns top-K functions/classes ranked by PageRank. Enables "what are the most important functions in this codebase?" queries.
Enhanced tools
- search_graph: New ranked parameter (default true). When enabled, results are sorted by PageRank score. PageRank included in response JSON.
- trace_call_path: New ranked parameter. BFS results post-sorted by PageRank when enabled.
- search_graph, trace_call_path, query_graph: New max_tokens parameter. Two-tier truncation: top 5 results in full detail, remainder as compact signatures. Emits truncated, total_results, shown metadata.
Implementation details
- PageRank scores are stored in the node_scores table. Runs as a pipeline post-processing step. Non-fatal on failure.
- Compact trace chains report omitted-node counts (A → ... (3 more) → Z) for truncated traces.
Tests
- test_store_arch.c: architecture summary (basic, focus, many_files, cluster_growth)
- test_store_search.c: PageRank computation + ranking
- test_mcp.c: get_key_symbols, ranked search, truncation for all 3 tools
- test_pipeline.c: PageRank in pipeline
- test_integration.c: live index tests
Motivation
AI coding agents consume 7–38% of context window per structural query. PageRank ranking ensures the most important results appear first. Token budgets let agents request "give me the answer in under 2000 tokens." Architecture summaries eliminate entire categories of exploratory queries — one call replaces 3–5 tool invocations.
Benchmarked on a 32K-node / 70K-edge production Laravel codebase.
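To make the token-budget idea concrete, the following C sketch shows one way two-tier truncation could behave. It is a guess at the general shape, not the PR's implementation. The Result and TruncMeta types, budget_results, the four-characters-per-token estimate, and the choice to truncate only when the full-detail response exceeds max_tokens are all assumptions. Only the "top 5 in full, remainder compact" split and the truncated, total_results, and shown fields come from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FULL_DETAIL_COUNT 5 /* "top 5 results in full detail" per the PR text */
#define CHARS_PER_TOKEN 4   /* assumption: crude characters-per-token estimate */

/* Hypothetical result with two pre-rendered forms. */
typedef struct {
    const char *full;    /* full-detail rendering */
    const char *compact; /* compact signature */
} Result;

typedef struct {
    int truncated;
    size_t total_results;
    size_t shown;
} TruncMeta;

static size_t estimate_tokens(const char *s)
{
    return (strlen(s) + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN;
}

/*
 * Chooses a rendering for each result (already sorted by rank) and stores
 * the chosen strings in out[], which must have room for n pointers.
 * If everything fits in full detail, nothing is truncated. Otherwise the
 * first FULL_DETAIL_COUNT results stay full, later ones become compact,
 * and output stops once the budget would be exceeded.
 */
TruncMeta budget_results(const Result *results, size_t n,
                         size_t max_tokens, const char **out)
{
    TruncMeta meta = {0, n, 0};
    size_t full_cost = 0, used = 0;

    assert(out != NULL && (results != NULL || n == 0));

    for (size_t i = 0; i < n; i++)
        full_cost += estimate_tokens(results[i].full);
    if (full_cost <= max_tokens) {
        for (size_t i = 0; i < n; i++)
            out[i] = results[i].full;
        meta.shown = n;
        return meta;
    }

    meta.truncated = 1;
    for (size_t i = 0; i < n; i++) {
        const char *text = (i < FULL_DETAIL_COUNT) ? results[i].full
                                                   : results[i].compact;
        size_t cost = estimate_tokens(text);
        if (used + cost > max_tokens)
            break;
        out[meta.shown++] = text;
        used += cost;
    }
    return meta;
}
```

Because results are assumed to arrive sorted by PageRank, whatever gets cut in this sketch is always the lowest-ranked tail, and a caller can compare shown against total_results to decide whether a follow-up query is worthwhile.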
Part 1 of a 4-PR series. PRs 2–4 build on this foundation.
Built with OpenAI Codex and Claude Code.