docs/PHILOSOPHY.md
25 additions & 12 deletions
@@ -54,7 +54,7 @@ The solution: only pull the relevant information for the specific task. This is
54
54
Three mechanisms enforce this:
55
55
- **Agents**: specialized instruction files tailored to specific domains, loaded only when their triggers match
56
56
- **Skills**: workflow methodologies that invoke deterministic scripts (Python CLIs, validation tools) rather than relying on LLM judgment alone, activated only when their workflow applies
57
-
- **Progressive Disclosure**: summary in the main file, details in `references/` subdirectory. Right context at the right time, not everything at once
57
+
- **Progressive Disclosure**: SKILL.md contains the workflow orchestration and tells the model *when* to load deep context. Detailed catalogs, agent rosters, specification tables, and output templates live in `references/` and are loaded only when the current workflow phase needs them. A skill with 26 chart types keeps the selection logic in SKILL.md and each chart's parameter spec in its own reference file, so the model loads only the spec for the chart it selected. A review skill with 4 waves keeps the orchestration in SKILL.md and each wave's agent roster in a separate reference file, so Wave 2 agents don't consume tokens during Wave 1
58
58
59
59
## Tokens Are Cheap, Quality Is Expensive
60
60
@@ -221,26 +221,38 @@ Everything a skill needs lives inside the skill directory. Scripts, viewer templ
221
221
222
222
```
223
223
skills/my-skill/
224
-
├── SKILL.md # The workflow
224
+
├── SKILL.md # The orchestrator — workflow + when to load references
225
225
├── agents/ # Subagent prompts used only by this skill
226
226
├── scripts/ # Deterministic CLI tools this skill invokes
227
227
├── assets/ # Templates, HTML viewers, static files
228
228
└── references/ # Deep context loaded on demand
229
229
```
230
230
231
+
**The orchestrator pattern:** SKILL.md is a thin workflow orchestrator, not a monolithic document. It tells the model *what to do* (phases, gates, decisions) and *when to load deep context* (reference files). The heavy content — detailed catalogs, agent dispatch prompts, output templates, specification tables — lives in `references/` and gets loaded only when the current phase needs it.
232
+
233
+
This is the difference between a skill that works and a skill that works *efficiently*:
234
+
235
+
| Approach | Token Cost | Quality |
236
+
|----------|-----------|---------|
237
+
| Everything in SKILL.md | High — full content loaded on every invocation | Good but wasteful |
238
+
| Thin SKILL.md, no references | Low — but missing context | Degraded — lost domain knowledge |
239
+
| **Orchestrator + references** | **Proportional to task**: load what the phase needs | **Best: full knowledge, minimal waste** |
240
+
241
+
Making a skill shorter by deleting content is not progressive disclosure — it's content loss. Progressive disclosure means the content still exists, organized so only the relevant slice enters the context window at any given phase.
242
+
243
+
**Example:** A review skill with 4 waves of agents keeps the wave orchestration logic in SKILL.md (~500 lines) and puts each wave's agent roster and dispatch prompts in separate reference files (`references/wave-1-foundation.md`, `references/wave-2-deep-dive.md`). When executing Wave 1, only the Wave 1 reference is loaded. Wave 2's agents don't consume tokens until Wave 2 begins.
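A minimal Python sketch of this loading pattern, under stated assumptions: the `PHASE_REFERENCES` mapping, the phase keys, and `build_context` are hypothetical names used for illustration and are not part of the toolkit; only the two reference file names come from the example above. It shows how the context for one phase could be assembled from SKILL.md plus that phase's reference file and nothing else.

```python
from pathlib import Path

# Hypothetical mapping from workflow phase to the reference file it needs.
# The file names come from the example above; the phase keys are assumed.
PHASE_REFERENCES = {
    "wave-1": "references/wave-1-foundation.md",
    "wave-2": "references/wave-2-deep-dive.md",
}


def build_context(skill_dir: Path, phase: str) -> str:
    """Return SKILL.md plus only the reference file for the current phase."""
    parts = [(skill_dir / "SKILL.md").read_text(encoding="utf-8")]
    reference = PHASE_REFERENCES.get(phase)
    if reference is not None:
        # Only this phase's reference is read; other waves stay on disk.
        parts.append((skill_dir / reference).read_text(encoding="utf-8"))
    return "\n\n".join(parts)
```

Calling `build_context(skill_dir, "wave-1")` would return the orchestration text and the Wave 1 roster without ever reading the Wave 2 file. In practice the model decides when to read a reference by following the SKILL.md instructions, so this only illustrates the effect on the context window.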
244
+
231
245
**Why this matters:** A skill that depends on scripts scattered across the repo is fragile to move, hard to test, and impossible to evaluate in isolation. When everything is bundled, the skill can be:
232
246
- Copied to another project and it works
233
247
- Tested via `run_eval.py` against its own workspace
234
248
- Reviewed as a single unit — all the tooling is visible in one tree
235
249
- Deleted without orphaning dependencies elsewhere
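One way to check the "copied to another project and it works" property is to verify that every file path SKILL.md mentions resolves inside the skill directory. The sketch below is an assumption-laden illustration, not an existing toolkit script: `unbundled_paths` and `PATH_PATTERN` are hypothetical names, and it assumes paths appear in SKILL.md as backtick-quoted relative paths.

```python
import re
from pathlib import Path

# Assumed convention: file paths appear in backticks, e.g. `scripts/check.py`.
# Directory mentions with a trailing slash (`references/`) are not matched.
PATH_PATTERN = re.compile(r"`([\w.\-]+(?:/[\w.\-]+)+)`")


def unbundled_paths(skill_dir: Path) -> list[str]:
    """List paths mentioned in SKILL.md that are missing or outside the skill."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    root = skill_dir.resolve()
    problems = []
    for mentioned in PATH_PATTERN.findall(text):
        target = (root / mentioned).resolve()
        # A bundled dependency must exist and live under the skill directory.
        if not (target.is_relative_to(root) and target.exists()):
            problems.append(mentioned)
    return problems
```

An empty result suggests the skill can be moved or deleted as a unit; any entry is either a missing file or a dependency that lives elsewhere in the repo. `Path.is_relative_to` requires Python 3.9 or newer.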
236
250
237
-
**The exception:** Shared patterns (`shared-patterns/anti-rationalization-core.md`) are referenced across skills. These stay shared. But skill-specific scripts, assets, and agents are always bundled.
238
-
239
251
**Repo-level `scripts/`** is reserved for toolkit-wide operations (learning-db.py, sync-to-user-claude.py, INDEX generation) — tools that operate on the system as a whole, not on a single skill's workflow.
240
252
241
253
## Workflow First, Constraints Inline
242
254
243
-
Skill documents place the workflow (Instructions/Phases) immediately after the frontmatter. Constraints appear inline within the phases they govern, not in a separate upfront section.
255
+
Skill documents place the workflow (Instructions/Phases) immediately after the frontmatter. Constraints appear inline within the phases they govern, with reasoning attached ("because X"), not in a separate upfront section.
244
256
245
257
**Measured result:** In A/B/C testing on Go code generation, workflow-first ordering (C) beat constraints-first ordering (B) 3-0 across simple, medium, and complex prompts. Blind agent reviewers consistently scored workflow-first higher on testing depth, Go idioms, and benchmark coverage.
246
258
@@ -249,18 +261,19 @@ Skill documents place the workflow (Instructions/Phases) immediately after the f
249
261
```
250
262
1. YAML frontmatter (What + When)
251
263
2. Brief overview (How — one paragraph)
252
-
3. Instructions/Phases (The actual workflow, with inline constraints)
253
-
4. Benchmark/Commands Guide (Reference material)
264
+
3. Instructions/Phases (The workflow, constraints inline with reasoning)
265
+
4. Reference Material (Commands, guides — or pointers to references/)
254
266
5. Error Handling (Failure context)
255
-
6. Anti-Patterns (What went wrong before)
256
-
7. References (Pointers to deep context)
267
+
6. References (Pointers to bundled files)
257
268
```
258
269
259
-
**Why it works:** The model encounters the task structure before the constraint framework. Constraints appear at the decision point where they apply — "use table-driven tests because they make adding cases trivial" inside the testing phase, not in a separate Hardcoded Behaviors section 200 lines earlier. The model spends attention on understanding the task, not parsing a constraint taxonomy.
270
+
**Why it works:** The model encounters the task structure before any constraint framework. Constraints appear at the decision point where they apply — "use table-driven tests because they make adding cases trivial" inside the testing phase, not in a separate Hardcoded Behaviors section 200 lines earlier. Attaching reasoning ("because X") lets the model generalize constraints to situations the skill author didn't anticipate.
271
+
272
+
**What was removed:** Operator Context sections (Hardcoded/Default/Optional taxonomy), standalone Anti-Patterns sections, Anti-Rationalization tables, and Capabilities & Limitations boilerplate. These were structural overhead that separated constraints from the workflow steps where they apply.
260
273
261
-
**What moves:** The Operator Context section (Hardcoded/Default/Optional behaviors) decomposes. Each constraint migrates to the phase where it applies. "Run with -race for concurrent code" belongs in Phase 3 (RUN), not in a behavior table.
274
+
**Where the content went:** Every constraint was distributed inline to the workflow step where it matters. Anti-pattern wisdom became reasoning attached to the relevant instruction. Nothing was deleted; it was reorganized to sit at the point of use.
262
275
263
-
**What stays:** Error Handling, Anti-Patterns, and References remain at the end as context that's consulted when things go wrong, not before the model has understood what "going right" looks like.
276
+
**Progressive disclosure completes the picture:** Workflow-first ordering keeps SKILL.md navigable. For skills exceeding ~500 lines, detailed catalogs, agent rosters, and specification tables move to `references/` files. The SKILL.md workflow tells the model when to load each reference, for example: "Read `references/wave-1-foundation.md` for the agent list and dispatch prompts." The model gets the orchestration logic upfront and loads deep context only when the current phase needs it.
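The two conditions in this paragraph (a SKILL.md length budget, and every reference being named by the workflow) can be checked mechanically. The following is a rough sketch rather than an existing toolkit script: `audit_skill` and `MAX_SKILL_LINES` are hypothetical names, the 500-line figure is the approximate threshold from the text, and it assumes references are Markdown files directly under `references/`.

```python
from pathlib import Path

MAX_SKILL_LINES = 500  # Approximate threshold from the text; tune per project.


def audit_skill(skill_dir: Path) -> dict:
    """Report SKILL.md length and reference files that SKILL.md never mentions."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    line_count = len(text.splitlines())
    references_dir = skill_dir / "references"
    reference_files = sorted(references_dir.glob("*.md")) if references_dir.is_dir() else []
    # A reference the workflow never names has no trigger for being loaded.
    unmentioned = [
        f"references/{path.name}"
        for path in reference_files
        if f"references/{path.name}" not in text
    ]
    return {
        "line_count": line_count,
        "over_budget": line_count > MAX_SKILL_LINES,
        "unmentioned_references": unmentioned,
    }
```

An `over_budget` result hints that content should move into `references/`, while a non-empty `unmentioned_references` list points at files the workflow never tells the model to load.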