refactor: update context system — goals layer, remove SQUAD.md injection #609

Merged
kokevidaurre merged 5 commits into develop from refactor/squad-context-system
Mar 24, 2026

@kokevidaurre

Summary

  • Removed SQUAD.md body injection from agent context — SQUAD.md is now metadata only (repo, agents, config)
  • Split priorities/goals into separate layers: L2 (priorities.md) and L3 (goals.md)
  • Removed active-work.md and briefs/ from context loading (reduce noise)
  • Added direct role matching for new structured schema (role: "lead" → lead without token scoring)
  • Updated agent prompt to reflect new layer names

Context

This aligns the CLI with the structured schema migration applied to all 433 files in hq (SYSTEM.md, company.md, 19 priorities, 19 goals, 236 agents, 157 states). The context system now loads 8 layers instead of 10, with cleaner separation.

Test plan

  • squads run research -a research-lead --verbose --dry-run — verify layers load correctly
  • Verify scanner gets L1-L5 only
  • Verify lead gets L1-L8
  • Verify goals.md appears as separate layer
  • Verify SQUAD.md body is NOT in context output
  • npm run build passes

🤖 Generated with Claude Code

Context loading changes:
- Removed L2 (SQUAD.md body injection) — SQUAD.md is now metadata
  only for CLI routing (repo, agents, config). Not injected into prompt.
- Split old L3 (priorities OR goals) into L2 (priorities.md) and
  L3 (goals.md) as separate layers loaded independently.
- Removed L7 (active-work.md) and L8 (briefs/) from context loading.
  These files still exist but no longer consume context budget.
- Renumbered: L6=feedback, L7=daily-briefing, L8=cross-squad learnings.

Role-based access updated:
- scanner: L1-L5 (company, priorities, goals, agent, state)
- worker/verifier: L1-L6 (+ feedback)
- lead/coo: L1-L8 (+ daily briefing + cross-squad)
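
The role-based access above can be read as a cutoff on an ordered list of layers. The following is a minimal sketch of that idea, not the CLI's actual code: the names `LAYERS`, `MAX_LAYER_BY_ROLE`, and `layersForRole` are hypothetical, and only the layer order and the per-role cutoffs come from this commit message.

```typescript
// Layer order L1..L8 as listed in this commit message.
const LAYERS = [
  "company",        // L1
  "priorities",     // L2
  "goals",          // L3
  "agent",          // L4
  "state",          // L5
  "feedback",       // L6
  "daily-briefing", // L7
  "cross-squad",    // L8
] as const;

type ContextRole = "scanner" | "worker" | "verifier" | "lead" | "coo";

// Highest layer each role may load (hypothetical table name).
const MAX_LAYER_BY_ROLE: Record<ContextRole, number> = {
  scanner: 5,
  worker: 6,
  verifier: 6,
  lead: 8,
  coo: 8,
};

// Returns the layer names a role is allowed to load, in order.
function layersForRole(role: ContextRole): string[] {
  return [...LAYERS.slice(0, MAX_LAYER_BY_ROLE[role])];
}
```

Under these assumptions, `layersForRole("scanner")` yields the five layers from company through state, and `layersForRole("lead")` yields all eight.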

Role resolution:
- Direct match for new schema (role: "lead" → lead, no scoring needed)
- Falls back to token scoring for legacy free-text roles
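
A rough sketch of the two-step resolution described above, under stated assumptions: the keyword table `ROLE_KEYWORDS` and the function name `resolveRole` are hypothetical, the scoring is a plain keyword count rather than whatever the CLI really uses, and the sketch defaults to `worker` when nothing scores instead of calling an LLM classifier.

```typescript
const KNOWN_ROLES = ["scanner", "worker", "verifier", "lead", "coo"] as const;
type ContextRole = (typeof KNOWN_ROLES)[number];

// Hypothetical keywords for legacy free-text role descriptions.
const ROLE_KEYWORDS: Record<ContextRole, string[]> = {
  scanner: ["scan", "scanner", "monitor"],
  worker: ["build", "write", "implement"],
  verifier: ["verify", "review", "audit"],
  lead: ["lead", "orchestrate", "coordinate"],
  coo: ["coo", "operations"],
};

function resolveRole(rawRole: string): ContextRole {
  const normalized = rawRole.trim().toLowerCase();

  // Step 1: direct match for the structured schema (role: "lead" -> lead).
  const direct = KNOWN_ROLES.find((role) => role === normalized);
  if (direct) return direct;

  // Step 2: token scoring for legacy free-text roles.
  const tokens = normalized.split(/[^a-z0-9]+/).filter(Boolean);
  let best: ContextRole = "worker"; // assumed default when nothing matches
  let bestScore = 0;
  for (const role of KNOWN_ROLES) {
    const score = tokens.filter((t) => ROLE_KEYWORDS[role].includes(t)).length;
    if (score > bestScore) {
      best = role;
      bestScore = score;
    }
  }
  return best;
}
```

The direct match runs first so that files already migrated to the structured schema never reach the scoring path.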

Agent prompt updated to reflect new layer names.

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions github-actions bot added the core label Mar 24, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the agent context system, moving from a 10-section cascade to a more structured 8-layer approach. This change aims to provide cleaner separation of concerns and more precise context injection for agents. Key improvements include the removal of SQUAD.md content from prompts, the introduction of a distinct goals layer, and a more robust method for determining an agent's context role, which now leverages agent YAML frontmatter for direct role matching.

Highlights

  • Context System Refactor: The agent context system has been refactored from a 10-section cascade to a more structured 8-layer approach, providing clearer separation and more precise context injection.
  • SQUAD.md Injection Removed: The body content of SQUAD.md is no longer injected into agent prompts; it now serves purely as metadata for repository, agents, and configuration.
  • Goals Layer Introduced: Priorities and goals are now split into separate layers: L2 for priorities.md and L3 for goals.md, allowing for distinct focus on current urgency and measurable targets.
  • Enhanced Role Resolution: A new mechanism for resolving agent context roles has been implemented, leveraging the agent's YAML frontmatter role: field for direct matching and including a token-based scoring system with an LLM fallback for ambiguous cases.
  • Context Noise Reduction: Context loading for active-work.md and agent/squad briefs has been removed to reduce unnecessary noise in agent prompts.
  • Updated Agent Prompt: The agent prompt has been streamlined to clearly list the new, structured context layers (SYSTEM, Company, Priorities, Goals, Agent, State) that agents should read.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively refactors the context system to use a layered approach, which improves clarity and structure. The removal of SQUAD.md from the context and the separation of priorities and goals are positive changes. However, I've identified a critical security vulnerability related to command injection when using the LLM-based role classifier. There are also a couple of other issues in the context assembly logic that should be addressed.

Comment on lines +374 to +377
    const out = execSync(
      `claude --print --dangerously-skip-permissions --disable-slash-commands --model ${model} -- '${escapedPrompt}'`,
      { encoding: 'utf-8', timeout: 60_000, maxBuffer: 2 * 1024 * 1024 }
    ).trim().toLowerCase();

security-critical critical

Using execSync here introduces a critical command injection vulnerability. The model variable, derived from an environment variable, is not sanitized before being used in the command string. An attacker who can control the SQUADS_CONTEXT_ROLE_LLM_MODEL environment variable could execute arbitrary commands. For example, setting it to foo; rm -rf / would be disastrous.

Additionally, the --dangerously-skip-permissions flag is highly concerning from a security perspective as it bypasses the tool's own safety mechanisms.

I strongly recommend replacing execSync with a secure alternative, like making a direct API call to the Claude service if a library is available. If you must use execSync, the model variable must be strictly validated against an allowlist of known model names to prevent injection.
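
One possible shape for the allowlist approach, offered as a sketch rather than the fix that was applied: it validates the model name against a caller-supplied allowlist and uses Node's `execFileSync`, which passes arguments as an array and does not spawn a shell. The helper names `buildClassifierArgs` and `classifyRole` and the `allowedModels` parameter are hypothetical; the CLI flags are copied unchanged from the reviewed snippet, including `--dangerously-skip-permissions`, which this sketch does not address.

```typescript
import { execFileSync } from "node:child_process";

// Builds the argv for the classifier call, rejecting any model name that is
// not in the allowlist. No shell quoting is needed because nothing here is
// ever interpreted by a shell.
function buildClassifierArgs(
  model: string,
  prompt: string,
  allowedModels: ReadonlySet<string>,
): string[] {
  if (!allowedModels.has(model)) {
    throw new Error(`Unsupported model: ${JSON.stringify(model)}`);
  }
  return [
    "--print",
    "--dangerously-skip-permissions",
    "--disable-slash-commands",
    "--model",
    model,
    "--",
    prompt,
  ];
}

// Runs the classifier without a shell; options mirror the reviewed code.
function classifyRole(
  model: string,
  prompt: string,
  allowedModels: ReadonlySet<string>,
): string {
  const args = buildClassifierArgs(model, prompt, allowedModels);
  return execFileSync("claude", args, {
    encoding: "utf-8",
    timeout: 60_000,
    maxBuffer: 2 * 1024 * 1024,
  })
    .trim()
    .toLowerCase();
}
```

Because no shell is involved, a value such as `foo; rm -rf /` would reach the CLI as a single literal argument even without the allowlist; the allowlist is kept as a second layer and produces a clear error early.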

Comment on lines 451 to 453
    if (text.length > cap) {
      text = text.substring(0, cap) + '\n...';
    }

high

There's a bug in the logic for truncating context layers. When a layer's content is truncated, the added ellipsis ('\n...') can push the total size over the budget, causing the layer to be dropped entirely. The truncation logic should account for the length of the ellipsis to ensure the content fits within the budget.

    if (text.length > cap) {
      const ellipsis = '\n...';
      text = text.substring(0, Math.max(0, cap - ellipsis.length)) + ellipsis;
    }

Comment on lines +385 to +387
  } catch {
    return 'worker';
  }

medium

This try...catch block currently swallows any errors from the LLM-based role classification, failing silently and defaulting to the 'worker' role. This can make debugging issues with the classification process very difficult. It would be beneficial to log the error that was caught to provide visibility into why the fallback occurred.

  } catch (e) {
    writeLine(`  ${colors.dim}warn: LLM-based role classification failed, falling back to 'worker'. Error: ${e instanceof Error ? e.message : String(e)}${RESET}`);
    return 'worker';
  }

Jorge Vidaurre and others added 2 commits March 23, 2026 23:47
company.md, priorities.md, goals.md, state.md all have YAML
frontmatter for CLI metadata. LLMs don't need it — strip before
injecting into prompt. Saves ~80 tokens per run.

Also: DRYRUN_CONTEXT_MAX_CHARS now configurable via env var
SQUADS_DRYRUN_MAX_CHARS for debugging full context output.
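
The frontmatter stripping described in this commit could look roughly like the sketch below. The function name `stripFrontmatter` is hypothetical and the regex-based approach is an assumption; the commit does not show how the CLI actually does it.

```typescript
// Removes a leading YAML frontmatter block (delimited by '---' lines) so that
// only the Markdown body is injected into the prompt. Input without
// frontmatter is returned unchanged.
function stripFrontmatter(markdown: string): string {
  // The block must open on the very first line and close at the next line
  // that consists only of '---'.
  const match = /^---\r?\n(?:[\s\S]*?\r?\n)?---[ \t]*(?:\r?\n|$)/.exec(markdown);
  if (!match) return markdown;
  // Drop the block plus any blank lines that directly follow it.
  return markdown.slice(match[0].length).replace(/^\s+/, "");
}
```

A `---` horizontal rule later in the body is left alone, since only a block that starts on the first line is treated as frontmatter.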

Co-Authored-By: Claude <noreply@anthropic.com>
New commands:
- squads catalog list — show all services grouped by type
- squads catalog show <service> — detailed service view
- squads catalog check [service] — run scorecard checks (all or one)
- squads release pre-check <service> — validate dependencies before deploy

New lib modules:
- lib/idp/types.ts — TypeScript interfaces matching IDP YAML schema
- lib/idp/resolver.ts — find IDP directory (env var → co-located → sibling → absolute)
- lib/idp/catalog-loader.ts — parse YAML catalog entries via gray-matter
- lib/idp/scorecard-engine.ts — evaluate services against quality checks

Scorecard sources: local filesystem, gh CLI, git log. Graceful
degradation when gh is unavailable (shows "unknown" vs failing).
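
The graceful degradation mentioned above can be sketched as a three-state check result. The names `CheckStatus`, `CheckResult`, and `runCheck` are hypothetical and are not taken from lib/idp/scorecard-engine.ts; the only detail from the commit is that an unavailable source reports "unknown" instead of failing.

```typescript
type CheckStatus = "pass" | "fail" | "unknown";

interface CheckResult {
  name: string;
  status: CheckStatus;
  detail?: string;
}

// Runs one scorecard check. If the probe throws (for example because an
// external tool is missing), the check is reported as "unknown" rather than
// failing the whole run.
function runCheck(name: string, probe: () => boolean): CheckResult {
  try {
    return { name, status: probe() ? "pass" : "fail" };
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { name, status: "unknown", detail };
  }
}
```

Keeping "unknown" distinct from "fail" lets a report separate services that actually miss a requirement from services that could not be evaluated.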

No new dependencies — YAML parsed via gray-matter's engine.

Co-Authored-By: Claude <noreply@anthropic.com>
Dockerfile.fresh-user: clean Node 22 container, npm install -g
squads-cli, empty git repo. No config, no .agents, nothing.

test-fresh-user.sh: 9-step automated test suite covering the
complete first-run flow (version, help, init, status, list,
catalog, doctor, unknown command).

Current results: 4/9 pass. squads init is broken (#610).

Usage:
  ./test/docker/test-fresh-user.sh --auto    # automated
  ./test/docker/test-fresh-user.sh           # interactive

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions github-actions bot added the tests label Mar 24, 2026
Fixes TypeScript build error: TS2304 Cannot find name 'CatalogEntry'

Co-Authored-By: Claude <noreply@anthropic.com>
@kokevidaurre kokevidaurre merged commit 3653a7e into develop Mar 24, 2026
11 checks passed
@kokevidaurre kokevidaurre deleted the refactor/squad-context-system branch March 24, 2026 22:11