refactor: update context system — goals layer, remove SQUAD.md injection #609
kokevidaurre merged 5 commits into develop
Conversation
Context loading changes:
- Removed L2 (SQUAD.md body injection). SQUAD.md is now metadata only for CLI routing (repo, agents, config), not injected into the prompt.
- Split the old L3 (priorities OR goals) into L2 (priorities.md) and L3 (goals.md) as separate layers loaded independently.
- Removed L7 (active-work.md) and L8 (briefs/) from context loading. These files still exist but no longer consume context budget.
- Renumbered: L6 = feedback, L7 = daily-briefing, L8 = cross-squad learnings.

Role-based access updated:
- scanner: L1-L5 (company, priorities, goals, agent, state)
- worker/verifier: L1-L6 (+ feedback)
- lead/coo: L1-L8 (+ daily briefing + cross-squad)

Role resolution:
- Direct match for the new schema (role: "lead" → lead, no scoring needed)
- Falls back to token scoring for legacy free-text roles

Agent prompt updated to reflect the new layer names.

Co-Authored-By: Claude <noreply@anthropic.com>
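The role-based access rules above can be sketched as a simple role-to-layer mapping. This is an illustrative sketch only: the type and function names (`ContextRole`, `layersFor`) and the layer identifiers are assumptions, not the actual CLI code.

```typescript
// Sketch of the role → layer mapping described in the commit message.
type ContextRole = 'scanner' | 'worker' | 'verifier' | 'lead' | 'coo';

const LAYERS = [
  'company',        // L1
  'priorities',     // L2
  'goals',          // L3
  'agent',          // L4
  'state',          // L5
  'feedback',       // L6
  'daily-briefing', // L7
  'cross-squad',    // L8
] as const;

function layersFor(role: ContextRole): string[] {
  const max =
    role === 'scanner' ? 5 :
    role === 'worker' || role === 'verifier' ? 6 :
    8; // lead and coo see all layers
  return LAYERS.slice(0, max);
}
```

Because the layers are ordered, each role simply extends the previous role's view rather than selecting an arbitrary subset.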
Summary of Changes

This pull request refactors the agent context system from a 10-section cascade to a more structured 8-layer approach, aiming for cleaner separation of concerns and more precise context injection for agents. Key changes include the removal of SQUAD.md content from prompts, the introduction of a distinct goals layer, and a more robust method for determining an agent's context role, which now leverages agent YAML frontmatter for direct role matching.
Code Review
This pull request effectively refactors the context system to use a layered approach, which improves clarity and structure. The removal of SQUAD.md from the context and the separation of priorities and goals are positive changes. However, I've identified a critical security vulnerability related to command injection when using the LLM-based role classifier. There are also a couple of other issues in the context assembly logic that should be addressed.
```typescript
const out = execSync(
  `claude --print --dangerously-skip-permissions --disable-slash-commands --model ${model} -- '${escapedPrompt}'`,
  { encoding: 'utf-8', timeout: 60_000, maxBuffer: 2 * 1024 * 1024 }
).trim().toLowerCase();
```
Using execSync here introduces a critical command injection vulnerability. The model variable, derived from an environment variable, is not sanitized before being used in the command string. An attacker who can control the SQUADS_CONTEXT_ROLE_LLM_MODEL environment variable could execute arbitrary commands. For example, setting it to foo; rm -rf / would be disastrous.
Additionally, the --dangerously-skip-permissions flag is highly concerning from a security perspective as it bypasses the tool's own safety mechanisms.
I strongly recommend replacing execSync with a secure alternative, like making a direct API call to the Claude service if a library is available. If you must use execSync, the model variable must be strictly validated against an allowlist of known model names to prevent injection.
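A minimal sketch of the reviewer's suggestion: validate `model` against an allowlist and pass arguments as an argv array via `execFileSync`, so nothing is shell-interpreted. The allowlist contents and the `classifyRole` name are assumptions for illustration, not the project's actual code.

```typescript
import { execFileSync } from 'node:child_process';

// Assumed allowlist of model names; the real set would come from the project.
const ALLOWED_MODELS = new Set(['haiku', 'sonnet', 'opus']);

function classifyRole(model: string, prompt: string): string {
  // Reject anything not on the allowlist before it reaches the command line.
  if (!ALLOWED_MODELS.has(model)) {
    throw new Error(`Refusing unknown model: ${model}`);
  }
  // execFileSync takes arguments as an array and does not invoke a shell,
  // so neither `model` nor `prompt` can inject commands.
  const out = execFileSync(
    'claude',
    ['--print', '--model', model, '--', prompt],
    { encoding: 'utf-8', timeout: 60_000, maxBuffer: 2 * 1024 * 1024 }
  );
  return out.trim().toLowerCase();
}
```

The key difference from the original is that `execFileSync` never concatenates user-controlled strings into a shell command, so `foo; rm -rf /` is just an invalid model name, not an executable payload.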
```typescript
if (text.length > cap) {
  text = text.substring(0, cap) + '\n...';
}
```
There's a bug in the logic for truncating context layers. When a layer's content is truncated, the added ellipsis ('\n...') can push the total size over the budget, causing the layer to be dropped entirely. The truncation logic should account for the length of the ellipsis to ensure the content fits within the budget.
```typescript
if (text.length > cap) {
  const ellipsis = '\n...';
  text = text.substring(0, Math.max(0, cap - ellipsis.length)) + ellipsis;
}
```

```typescript
} catch {
  return 'worker';
}
```
This try...catch block currently swallows any errors from the LLM-based role classification, failing silently and defaulting to the 'worker' role. This can make debugging issues with the classification process very difficult. It would be beneficial to log the error that was caught to provide visibility into why the fallback occurred.
```typescript
} catch (e) {
  writeLine(` ${colors.dim}warn: LLM-based role classification failed, falling back to 'worker'. Error: ${e instanceof Error ? e.message : String(e)}${RESET}`);
  return 'worker';
}
```

company.md, priorities.md, goals.md, and state.md all have YAML frontmatter for CLI metadata. LLMs don't need it, so strip it before injecting into the prompt. Saves ~80 tokens per run.

Also: DRYRUN_CONTEXT_MAX_CHARS is now configurable via the env var SQUADS_DRYRUN_MAX_CHARS for debugging full context output.

Co-Authored-By: Claude <noreply@anthropic.com>
New commands:
- squads catalog list — show all services grouped by type
- squads catalog show <service> — detailed service view
- squads catalog check [service] — run scorecard checks (all or one)
- squads release pre-check <service> — validate dependencies before deploy

New lib modules:
- lib/idp/types.ts — TypeScript interfaces matching the IDP YAML schema
- lib/idp/resolver.ts — find the IDP directory (env var → co-located → sibling → absolute)
- lib/idp/catalog-loader.ts — parse YAML catalog entries via gray-matter
- lib/idp/scorecard-engine.ts — evaluate services against quality checks

Scorecard sources: local filesystem, gh CLI, git log. Graceful degradation when gh is unavailable (shows "unknown" instead of failing). No new dependencies — YAML is parsed via gray-matter's engine.

Co-Authored-By: Claude <noreply@anthropic.com>
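The grouping behind `squads catalog list` can be sketched as a pure function over parsed catalog entries. The `CatalogEntry` shape here is assumed from the commit message, not copied from the real lib/idp/types.ts.

```typescript
// Hypothetical minimal entry shape; the real interface has more fields.
interface CatalogEntry {
  name: string;
  type: string; // e.g. 'service', 'library', 'job'
}

// Group catalog entries by their type, preserving insertion order within each group.
function groupByType(entries: CatalogEntry[]): Map<string, CatalogEntry[]> {
  const groups = new Map<string, CatalogEntry[]>();
  for (const entry of entries) {
    const bucket = groups.get(entry.type) ?? [];
    bucket.push(entry);
    groups.set(entry.type, bucket);
  }
  return groups;
}
```

A `Map` keeps the first-seen order of types, which gives the listing a stable grouping without an extra sort step.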
Dockerfile.fresh-user: clean Node 22 container, npm install -g squads-cli, empty git repo. No config, no .agents, nothing.

test-fresh-user.sh: 9-step automated test suite covering the complete first-run flow (version, help, init, status, list, catalog, doctor, unknown command). Current results: 4/9 pass. squads init is broken (#610).

Usage:
./test/docker/test-fresh-user.sh --auto  # automated
./test/docker/test-fresh-user.sh         # interactive

Co-Authored-By: Claude <noreply@anthropic.com>
Fixes TypeScript build error: TS2304 Cannot find name 'CatalogEntry' Co-Authored-By: Claude <noreply@anthropic.com>
Summary
- Direct role match for the new schema (role: "lead" → lead, without token scoring)

Context
This aligns the CLI with the structured schema migration applied to all 433 files in hq (SYSTEM.md, company.md, 19 priorities, 19 goals, 236 agents, 157 states). The context system now loads 8 layers instead of 10, with cleaner separation.
Test plan
- squads run research -a research-lead --verbose --dry-run: verify layers load correctly
- npm run build passes

🤖 Generated with Claude Code