36 changes: 31 additions & 5 deletions skills/skill-creator/SKILL.md
@@ -137,10 +137,34 @@ Explain the reasoning behind constraints rather than issuing bare imperatives.
effective than "ALWAYS run with -race" because the model can generalize the
reasoning to situations the skill author didn't anticipate.

**Progressive disclosure** — keep SKILL.md navigable:
- Summary in frontmatter, workflow in body, deep reference in `references/`
- If SKILL.md exceeds ~500 lines, move detailed catalogs to reference files
- Reference files clearly linked from SKILL.md with guidance on when to read them
**Progressive disclosure** — SKILL.md is the routing target, not the reference
library. It stays lean so it loads fast when Claude considers invoking it; Claude
then reads `references/` on demand as phases execute. See
`references/progressive-disclosure.md` for the full model, economics, and
extraction decision tree.

Key rules:
- SKILL.md: brief overview, phase structure with gates, one-line pointers to
reference files, error handling
- `references/`: checklists, rubrics, agent dispatch prompts, report templates,
pattern catalogs, example collections — anything only needed at execution time
- If SKILL.md exceeds **500 lines** after writing, extract detailed content to
`references/` before proceeding
- If SKILL.md exceeds **700 lines**, extraction is mandatory — it is carrying
reference content that should not be loaded on every routing decision

**Maximizing skill effectiveness:**

| More of this → better skill | Why |
|-----------------------------|-----|
| Rich `references/` content | Depth available at execution; zero cost at routing time |
| Deterministic `scripts/` | Consistency, token savings, independent testability |
| Bundled `agents/` prompts | Specialized dispatch without routing system overhead |

The most effective complex skills in this toolkit (`comprehensive-review`,
`sapcc-review`, `voice-writer`) have SKILL.md under 600 lines and put all
operational depth in `references/` and `agents/`. See
`references/progressive-disclosure.md` for the real numbers.

### Bundled scripts

@@ -352,9 +376,11 @@ skill. Read them when you need to spawn the relevant subagent.

## Reference files

- `references/progressive-disclosure.md` — The disclosure model: economics, size
gates, what to extract, real examples from the toolkit, script and agent patterns
- `references/skill-template.md` — Complete SKILL.md template with all sections
- `references/artifact-schemas.md` — JSON schemas for eval artifacts (evals.json,
grading.json, benchmark.json, comparison.json, timing.json, metrics.json)
- `references/skill-template.md` — Complete SKILL.md template with all sections
- `references/complexity-tiers.md` — Skill examples by complexity tier
- `references/workflow-patterns.md` — Reusable phase structures and gate patterns
- `references/error-catalog.md` — Common skill creation errors with solutions
217 changes: 217 additions & 0 deletions skills/skill-creator/references/progressive-disclosure.md
@@ -0,0 +1,217 @@
# Progressive Disclosure Model

How to structure skills so they load fast when Claude considers them and deliver
full depth when Claude executes them.

---

## The Core Model

```
SKILL.md ← always loaded when Claude considers invoking the skill
references/ ← loaded on demand as the skill executes
scripts/ ← deterministic CLI tools, called from SKILL.md phases
agents/ ← specialized subagent prompts, dispatched from SKILL.md
```

**SKILL.md** is the routing target. It stays lean so it loads fast; Claude then
reads reference files on demand as phases execute.

**`references/`** holds deep content: checklists, rubrics, templates, patterns,
agent dispatch prompts, scoring systems, example collections. Loaded only when
the skill is actually running and reaches the phase that needs them.

**`scripts/`** holds deterministic CLI tools. If an operation is repeatable and
doesn't require LLM judgment, it should be a Python script — not inline
instructions that the model reinvents each run.

**`agents/`** holds specialized subagent prompts for skills that dispatch
parallel reviewers, graders, or domain specialists. Each agent file contains
the full prompt for one specialized role.

---

## The Economics

| Moment | What loads | Token cost |
|--------|------------|------------|
| Claude considers invoking the skill | SKILL.md only | Low (300–400 lines) |
| Skill executes Phase 1 | SKILL.md + Phase 1 reference | Medium |
| Skill executes all phases | SKILL.md + all referenced files | Full depth |

A 300-line SKILL.md with 5 reference files totaling 800 lines loads **300 lines
to consider** and **1100 lines when executing**. A 1100-line SKILL.md loads all
1100 lines on every routing decision, whether or not the skill gets invoked.
Line counts are a proxy here; token cost scales with them.

This is the key asymmetry. Keep SKILL.md lean.
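
The asymmetry can be made concrete with a little arithmetic. The sketch below uses line counts as a proxy for context cost and assumes a hypothetical workload in which the skill is considered 20 times and executed twice; the function name and the workload numbers are illustrative, not measurements from the toolkit.

```python
def context_lines_loaded(skill_md_lines: int, reference_lines: int,
                         considerations: int, executions: int) -> int:
    """Total lines loaded across a workload.

    Every consideration loads SKILL.md; only executions load the references.
    Assumes each execution reads every reference file (the worst case).
    """
    return considerations * skill_md_lines + executions * reference_lines


# Hypothetical workload: considered 20 times, actually executed twice.
lean = context_lines_loaded(300, 800, considerations=20, executions=2)
monolith = context_lines_loaded(1100, 0, considerations=20, executions=2)

print(lean)      # 20*300 + 2*800 = 7600
print(monolith)  # 20*1100 = 22000
```

The gap widens as the share of considerations that never lead to execution grows, which is the case the lean layout is designed for.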

---

## Size Gates

| SKILL.md length | Action |
|-----------------|--------|
| Under 400 lines | Fine — no extraction needed |
| 400–500 lines | Consider extracting if there are obvious deep-content sections |
| Over 500 lines | Should extract detailed catalogs to `references/` |
| Over 700 lines | Must extract — SKILL.md is carrying reference content |

After writing a SKILL.md, check its length. If it exceeds 500 lines, identify
the heaviest sections (checklists, rubrics, pattern catalogs, agent prompts,
example collections) and move them to `references/`.
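
Because the length check is mechanical, it could itself be a small script. The following is a minimal sketch of the gate table above; `size_gate` and `check_skill_file` are hypothetical names, not scripts that ship with the toolkit.

```python
from pathlib import Path


def size_gate(line_count: int) -> str:
    """Map a SKILL.md line count to the action from the size-gate table."""
    if line_count > 700:
        return "must extract"
    if line_count > 500:
        return "should extract"
    if line_count >= 400:
        return "consider extracting"
    return "fine"


def check_skill_file(path: Path) -> str:
    """Count the lines in a SKILL.md file and report the gate it falls into."""
    line_count = len(path.read_text(encoding="utf-8").splitlines())
    return f"{path}: {line_count} lines, {size_gate(line_count)}"
```

For example, `size_gate(564)` returns `"should extract"`, while `size_gate(269)` returns `"fine"`.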

---

## What to Extract to `references/`

**Extract these** — they are deep content that only matters when the skill runs:

- Detailed checklists and rubrics (e.g., severity classification tables, joy-check
rubric, grading criteria)
- Agent dispatch prompts (e.g., the 10 specialist prompts in `sapcc-review`, wave
agent prompts in `comprehensive-review`)
- Report and output templates (e.g., the structured markdown template for
`sapcc-review` findings)
- Domain-specific pattern catalogs (e.g., Go anti-patterns with before/after
examples, common error patterns)
- Validation criteria and scoring systems
- Example collections (realistic input/output pairs, prompt examples)
- Phase-specific deep guides (e.g., "how to run the voice extraction phase")

**Keep in SKILL.md** — these guide routing and orchestration:

- Frontmatter (name, description, routing — never extracted)
- Brief overview (2-3 sentences)
- Phase/step structure with gates
- One-line pointers to reference files ("See `references/X.md` for...")
- Error handling (cause/solution pairs for common failures)
- Brief examples showing trigger context

---

## Real Examples from This Toolkit

These skills were built following this model. Use them as reference.

| Skill | SKILL.md | `references/` | Total | What's in references |
|-------|----------|----------------|-------|----------------------|
| `comprehensive-review` | 564 lines | 765 lines (5 files) | 1329 | Wave-specific agent prompts |
| `create-voice` | 444 lines | 426 lines (4 files) | 870 | Phase-specific deep guides |
| `pr-pipeline` | 417 lines | 365 lines (4 files) | 782 | Checklist, templates, loop details |
| `sapcc-review` | 269 lines | 323 lines (2 files) | 592 | 10 agent dispatch prompts, report template |
| `systematic-code-review` | 301 lines | 252 lines (3 files) | 553 | Severity rules, Go patterns, feedback guide |
| `voice-writer` | 307 lines | 462 lines (6 files) | 769 | Rubrics, checklists, joy-check criteria, schemas |

Notice that the complex skills (`voice-writer`, `comprehensive-review`,
`sapcc-review`) have the *smallest* SKILL.md-to-total ratios: about 40%, 42%,
and 45%, while the other three sit between 51% and 54%. Their operational depth
lives in `references/` and `agents/`, loaded only when the skill executes.
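
The SKILL.md-to-total ratios can be checked directly from the table. This sketch copies the line counts from the rows above; the dictionary and function names are illustrative only.

```python
# (SKILL.md lines, references/ lines), copied from the table above.
LINE_COUNTS = {
    "comprehensive-review": (564, 765),
    "create-voice": (444, 426),
    "pr-pipeline": (417, 365),
    "sapcc-review": (269, 323),
    "systematic-code-review": (301, 252),
    "voice-writer": (307, 462),
}


def skill_md_share(skill_md_lines: int, reference_lines: int) -> float:
    """Fraction of the skill's total lines that live in SKILL.md."""
    return skill_md_lines / (skill_md_lines + reference_lines)


# Rank skills from the smallest SKILL.md share to the largest.
ranked = sorted(LINE_COUNTS, key=lambda name: skill_md_share(*LINE_COUNTS[name]))
for name in ranked:
    print(f"{name:<24} {skill_md_share(*LINE_COUNTS[name]):.0%}")
```

The denominator is the table's Total column, so any bundled files not counted there are left out of the ratio.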

### Pattern: Agent Dispatch Prompts in `agents/`

`sapcc-review` dispatches 10 parallel domain-specialist agents. Their prompts
live in `agents/` (one file per specialist). SKILL.md says:

```
Spawn 10 parallel subagents, each loaded with their agent prompt from agents/:
- agents/error-handling-reviewer.md
- agents/api-contracts-reviewer.md
...
```

SKILL.md stays at 269 lines. The 10 agent prompts are only loaded when the
skill actually runs.

### Pattern: Wave Prompts in `references/`

`comprehensive-review` runs 4 waves of parallel review. Each wave's agent
prompts are in a separate reference file (`references/wave1-agents.md`, etc.).
SKILL.md describes the structure; the actual prompts are loaded per-wave.

### Pattern: Checklist Extraction

`pr-pipeline` has a pre-PR checklist that would bulk out SKILL.md. It lives in
`references/pre-pr-checklist.md`. SKILL.md says: "Before creating the PR, work
through `references/pre-pr-checklist.md`."

---

## Deterministic Script Principle

If an operation is repeatable and doesn't require LLM judgment, it **should** be
a Python CLI script in `scripts/`, not inline instructions that the model
reinvents on each invocation.

Scripts:
- Save tokens — the model calls a script rather than reasoning through the same
steps from scratch each time
- Ensure consistency — the same input produces the same output every run
- Can be tested independently — unit tests for scripts, not for model reasoning
- Are version-controlled and reviewable — changes are explicit diffs
- Have predictable outputs — scripts fail deterministically; model reasoning fails
silently

**Good candidates for scripts:**
- Validation (voice validation, format checking, lint)
- Metric extraction (line counts, token counts, benchmark aggregation)
- Template rendering (fill a report template with data)
- Link checking, path resolution, file discovery
- Format conversion (CSV to JSON, markdown to HTML)
- API calls with structured output (GitHub, Linear, Slack)

**Keep as SKILL.md instructions** — things that require judgment:
- Deciding what to review and how deeply
- Interpreting ambiguous outputs
- Adapting approach to context

The right split: `scripts/` for mechanical operations, SKILL.md for orchestration
and judgment.
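
As one illustration of a mechanical operation, here is a sketch of a path-resolution check: it collects the backticked `references/`, `scripts/`, and `agents/` pointers in a SKILL.md and reports any that do not exist on disk. The function names and the pointer regex are assumptions made for this example, not an existing toolkit script.

```python
import re
from pathlib import Path

# Matches backticked pointers such as `references/pre-pr-checklist.md`,
# `scripts/main.py`, or `agents/grader.md` (assumed pointer convention).
POINTER_RE = re.compile(r"`((?:references|scripts|agents)/[\w./-]+\.\w+)`")


def find_pointers(skill_md_text: str) -> list[str]:
    """Return the unique bundled-file paths mentioned in the text, in order."""
    seen: dict[str, None] = {}
    for match in POINTER_RE.finditer(skill_md_text):
        seen.setdefault(match.group(1), None)
    return list(seen)


def missing_pointers(skill_dir: Path) -> list[str]:
    """List pointers in skill_dir/SKILL.md whose target file does not exist."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    return [p for p in find_pointers(text) if not (skill_dir / p).is_file()]
```

The same input always yields the same list, so the check can be unit-tested once instead of being re-reasoned by the model on every run.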

---

## Bundled Agents

For skills that dispatch subagents with specialized roles, bundle agent prompts
in `agents/`. These are not registered in the routing system — they are internal
to the skill's workflow, loaded only when the skill dispatches them.

```
skill-name/
├── SKILL.md
├── agents/
│ ├── security-reviewer.md # Prompt for the security specialist
│ ├── arch-reviewer.md # Prompt for the architecture specialist
│ └── grader.md # Prompt for output grading
├── scripts/
└── references/
```

SKILL.md references them with a dispatch instruction:
```
Spawn a subagent using the prompt in agents/security-reviewer.md.
Pass it: the diff, the package list, and the Wave 1 findings.
```

When to bundle vs. use repo-level agents:

| Scenario | Where |
|----------|-------|
| Agent only used by this skill | Bundle in `agents/` |
| Agent shared across multiple skills | Repo `agents/` directory |
| Agent needs to appear in routing | Repo `agents/` directory |
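
The placement table reduces to a two-input decision. A minimal sketch, with a hypothetical function name and return strings chosen for this example:

```python
def agent_location(skills_using_agent: int, needs_routing: bool) -> str:
    """Decide where an agent prompt should live, following the table above."""
    if needs_routing or skills_using_agent > 1:
        # Shared or routable agents belong at the repository level.
        return "repo agents/ directory"
    # An agent used by a single skill stays internal to that skill.
    return "bundle in the skill's agents/ directory"
```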

---

## Applying This Model When Creating a New Skill

1. **Write SKILL.md first** — get the workflow right without worrying about length
2. **Check length** — if over 500 lines, identify extraction candidates
3. **Extract** — move checklists, rubrics, agent prompts, templates to `references/`
4. **Replace with pointers** — each extracted section becomes one line in SKILL.md:
`"See references/X.md for the full checklist."`
5. **Identify deterministic operations** — anything the model would reinvent each
run is a script candidate; write `scripts/X.py` and replace with a `Run:` line
6. **Identify specialized roles** — if the skill dispatches agents with distinct
expertise, write their prompts in `agents/` and reference from SKILL.md

The result: a lean SKILL.md that orchestrates, and a rich `references/` + `scripts/`
+ `agents/` that delivers depth on demand.
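
Steps 3 and 4 (extract, then replace with a pointer) are mechanical enough to sketch in code. The helper below is hypothetical: it assumes the section to extract is a level-2 (`## `) heading and skips headings that appear inside fenced code blocks.

```python
def extract_section(skill_md: str, heading: str,
                    reference_path: str) -> tuple[str, str]:
    """Move one `## heading` section out of SKILL.md text.

    Returns (new_skill_md, reference_body). The section runs from its heading
    to the next level-1 or level-2 heading, or to the end of the file.
    """
    lines = skill_md.splitlines()
    start = end = None
    in_fence = False
    for i, line in enumerate(lines):
        if line.startswith("```"):
            in_fence = not in_fence  # headings inside fences are not real
            continue
        if in_fence:
            continue
        if start is None:
            if line.strip() == f"## {heading}":
                start = i
        elif line.startswith("## ") or line.startswith("# "):
            end = i
            break
    if start is None:
        raise ValueError(f"heading not found: {heading!r}")
    if end is None:
        end = len(lines)

    body = lines[start + 1:end]
    pointer = f"See `{reference_path}` for the full {heading.lower()}."
    # Keep the heading in SKILL.md, but replace its body with a one-line pointer.
    new_lines = lines[:start + 1] + ["", pointer, ""] + lines[end:]
    reference = "\n".join([f"# {heading}", *body]).strip() + "\n"
    return "\n".join(new_lines).strip() + "\n", reference
```

A caller would write the second return value to `reference_path` and the first back to SKILL.md, then review the diff before committing.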
63 changes: 26 additions & 37 deletions skills/skill-creator/references/skill-template.md
@@ -147,64 +147,53 @@ skill-name/

Bundled agents are referenced from SKILL.md: "Spawn a subagent using the prompt in `agents/grader.md`". They don't appear in the routing system — they're internal to the skill's workflow.

## Operator Context Section
## Instructions Section

Constraints belong **inline** within the workflow step where they apply, not in a
separate `## Operator Context` block. If a constraint matters during Phase 2, put
it in Phase 2 — not in a preamble 200 lines above where the model encounters it.
Explain the reasoning alongside each constraint (see "Motivation over Mandate" below).

```markdown
## Operator Context
## Instructions

This skill operates as an operator for [workflow], configuring Claude's behavior for [automation context].
### Overview

### Hardcoded Behaviors (Always Apply)
- **CLAUDE.md Compliance**: Read and follow repository CLAUDE.md files
- **Over-Engineering Prevention**: Only implement what's directly requested
- [Domain-specific constraint]
[2-3 sentences: what this skill does and how it works end-to-end]

### Default Behaviors (ON unless disabled)
- **Communication**: Show complete output, never summarize
- **Temp File Cleanup**: Remove iteration files at completion
- [Workflow-specific default]
### Phase 1: [First Phase Name]

### Optional Behaviors (OFF unless enabled)
- [Capability available on request]
[What to do here — goal and actions]

## What This Skill CAN Do
- [Explicit capability 1]
- [Explicit capability 2]
Run: `python3 ~/.claude/scripts/main.py --input {input_file}`
Expect: [Specific output format]

## What This Skill CANNOT Do
- [Limitation 1]: [Reason]
- [Limitation 2]: [Reason]
```
Gate: [Condition that must be true before moving to Phase 2]
— because [reason the gate exists]

## Instructions Section
### Phase 2: [Second Phase Name]

```markdown
## Instructions
[What to do here]

### Step 1: [First Action]
Run: `python3 ~/.claude/scripts/main.py --input {input_file}`
Expect: [Specific output format]
Validate: [How to verify success]
Constraint: [Domain-specific rule that applies HERE]
— because [why this matters in this context]

### Step 2: [Next Action]
[Continue with explicit steps]
> If SKILL.md exceeds 500 lines: extract detailed content to `references/`
> and add a one-liner here: "See `references/X.md` for the full [checklist/rubric/template]."

## Examples
### Phase 3: [Output Phase]

### Example 1: [Common Scenario]
User says: "[trigger phrase]"
Actions:
1. [Step]
2. [Step]
Result: [Concrete outcome]
[Produce the output artifact]

## Error Handling

**Error: "[Error message]"**
- Cause: [Why it happens]
- Solution: [How to fix]

## Reference Files
- `references/examples.md`: [Purpose]
- `references/examples.md`: [Purpose — loaded only when this skill executes]
- `references/checklist.md`: [Phase 2 checklist — deep content extracted from SKILL.md]
```

### Best Practices for Instructions