
Commit 92de98e

feat: additive include filters, external prompts, and XML output format (#20)
* fix: make --include-dir and --include-files work additively

  Previously, when both --include-dir and --include-files were specified, includeFiles would clear the includeDirs patterns, making them mutually exclusive. Now they combine additively - files from both filters are included in the output.

* test: add E2E tests for combined --include-dir and --include-files

  - Add integration tests verifying additive behavior of includeDirs + includeFiles
  - Add sample-rust-project in playground/ for manual testing
  - Tests verify: tree output, prompt, file inclusion, ignores working correctly

* fix: support external prompt file paths and prepend prompts without placeholder

  - External paths (containing / or \) are now used directly instead of being nested under codefetch/prompts/
  - Prompts without {{CURRENT_CODEBASE}} placeholder are prepended to the codebase content instead of replacing it

* feat: add XML tags for structured output sections

  - Wrap prompts in <task>...</task> tags
  - Wrap file tree in <filetree>...</filetree> tags
  - Wrap source code in <source_code>...</source_code> tags

  This provides better structure for AI models to understand the different sections of the codebase output.

* docs: update README and changelogs for v2.2.0

  - Document XML-structured output format with <task>, <filetree>, <source_code> tags
  - Document additive --include-dir and --include-files behavior
  - Document external prompt file path support
  - Update all changelogs (root, cli, sdk) for v2.2.0

* chore: update pnpm-lock.yaml

* docs: add pnpm lockfile troubleshooting to HOW-TO-RELEASE

* fix: use workspace:* for codefetch-sdk dependency
1 parent 6bad976 commit 92de98e

28 files changed

Lines changed: 485 additions & 48 deletions

CHANGELOG.md

Lines changed: 15 additions & 0 deletions
@@ -1,5 +1,20 @@
 # Changelog
 
+## 2.2.0
+
+### Added
+- **XML-structured output format** - Output now uses semantic XML tags for better AI parsing:
+  - `<task>...</task>` - Wraps the prompt/instructions
+  - `<filetree>...</filetree>` - Wraps the project tree structure
+  - `<source_code>...</source_code>` - Wraps all source code files
+- **Additive `--include-dir` and `--include-files`** - These options now work together additively instead of being mutually exclusive. Use both to include specific directories PLUS specific files.
+- **External prompt file support** - Prompt files with paths (e.g., `-p docs/arch/prompt.md`) are now correctly resolved from the project root instead of requiring them to be in `codefetch/prompts/`
+
+### Fixed
+- Fixed `--include-dir` and `--include-files` being mutually exclusive - now they combine additively
+- Fixed external prompt file paths not being found when containing directory separators
+- Fixed prompts without `{{CURRENT_CODEBASE}}` placeholder not including the codebase content
+
 ## 2.1.2
 
 ### Fixed
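
Reviewer note: the additive behavior described in this changelog amounts to taking the union of the directory patterns and the file patterns rather than letting one list overwrite the other. The sketch below illustrates that idea only. `buildIncludePatterns` and `IncludeOptions` are hypothetical names, and the `dir/**/*` glob form is an assumption; the SDK's actual `collectFiles` logic is not shown in this diff.

```typescript
// Hypothetical helper for illustration; not the codefetch-sdk implementation.
interface IncludeOptions {
  includeDirs?: string[];
  includeFiles?: string[];
}

function buildIncludePatterns({
  includeDirs = [],
  includeFiles = [],
}: IncludeOptions): string[] {
  // Assumption: each directory is expanded to a recursive glob.
  // Trailing slashes are stripped so "src/utils/" and "src/utils" match.
  const dirPatterns = includeDirs.map(
    (dir) => `${dir.replace(/\/+$/, "")}/**/*`
  );
  // Union of both lists: neither option clears the other.
  // The Set drops exact duplicates while keeping insertion order.
  return [...new Set([...dirPatterns, ...includeFiles])];
}

const patterns = buildIncludePatterns({
  includeDirs: ["crates/core/src"],
  includeFiles: ["crates/engine/src/lib.rs"],
});
console.log(patterns);
```

With the inputs above, the result contains one recursive pattern for `crates/core/src` followed by the single `lib.rs` path, which mirrors the README example added in this commit.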

HOW-TO-RELEASE.md

Lines changed: 7 additions & 0 deletions
@@ -160,6 +160,13 @@ Delete the GitHub Release if needed: `gh release delete vX.Y.Z`
 - `npm ERR! code E403` or auth failures: run `npm login` and retry
 - `gh` failures: `gh auth status`; ensure `repo` scope exists
 - Tag push rejected: pull/rebase or fast-forward `main`, then rerun
+- **CI fails with `ERR_PNPM_OUTDATED_LOCKFILE`**: The lockfile is out of sync with `package.json`. This happens when dependencies change (e.g., `workspace:*` → `^2.1.0`). Fix it locally:
+```bash
+pnpm install --no-frozen-lockfile
+git add pnpm-lock.yaml
+git commit -m "chore: update pnpm-lock.yaml"
+git push
+```
 
 ## Release Frequency Suggestions
 
README.md

Lines changed: 52 additions & 3 deletions
@@ -129,6 +129,10 @@ npx codefetch --include-files "src/components/AgentPanel.tsx,src/lib/llm/**/*" -
 
 # Include src directory, exclude test files
 npx codefetch --include-dir src --exclude-files "*.test.ts" -o src-no-tests.md
+
+# Combine --include-dir and --include-files (additive!)
+# This includes ALL files from crates/core/src PLUS the specific lib.rs file
+npx codefetch --include-dir crates/core/src --include-files "crates/engine/src/lib.rs" -o combined.md
 ```
 
 Dry run (only output to console)
@@ -289,10 +293,20 @@ Inline prompts are automatically appended with the codebase content.
 
 #### Custom Prompt Files
 
-Create custom prompts in `codefetch/prompts/` directory:
+You can use custom prompt files in two ways:
 
-1. Create a markdown file (e.g., `codefetch/prompts/my-prompt.md`)
-2. Use it with `--prompt my-prompt.md`
+**1. External prompt files (anywhere in your project):**
+```bash
+# Use a prompt file from anywhere in your project
+npx codefetch -p docs/arch/review-prompt.md
+npx codefetch --prompt ./prompts/security-audit.txt
+```
+
+**2. Prompt files in `codefetch/prompts/` directory:**
+```bash
+# Create codefetch/prompts/my-prompt.md, then use:
+npx codefetch --prompt my-prompt.md
+```
 
 You can also set a default prompt in your `codefetch.config.mjs`:
 
@@ -344,6 +358,41 @@ Codefetch uses a set of default ignore patterns to exclude common files and dire
 
 You can view the complete list of default patterns in [default-ignore.ts](packages/sdk/src/default-ignore.ts).
 
+## Output Format
+
+Codefetch generates structured output using semantic XML tags to help AI models better understand the different sections:
+
+```xml
+<task>
+Your prompt or instructions here...
+</task>
+
+<filetree>
+Project Structure:
+└── src
+    ├── index.ts
+    └── utils
+        └── helpers.ts
+</filetree>
+
+<source_code>
+src/index.ts
+```typescript
+// Your code here
+```
+
+src/utils/helpers.ts
+```typescript
+// More code here
+```
+</source_code>
+```
+
+The XML structure provides:
+- `<task>` - Contains your prompt/instructions (from `-p` flag)
+- `<filetree>` - Contains the project tree visualization (from `-t` flag)
+- `<source_code>` - Contains all the source code files with their paths
+
 ## Token Counting
 
 Codefetch supports different token counting methods to match various AI models:
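
Reviewer note: the Output Format section added to the README describes three tagged sections, of which `<task>` and `<filetree>` appear only when a prompt or tree is requested. A minimal sketch of that assembly follows, assuming the sections are joined with blank lines. `wrapSection`, `assembleOutput`, and `OutputSections` are hypothetical names and do not come from the SDK.

```typescript
// Hypothetical sketch of the tagged layout; not the SDK's generator.
interface OutputSections {
  prompt?: string; // text for <task>, e.g. from -p
  tree?: string; // text for <filetree>, e.g. from -t
  source: string; // concatenated file listings for <source_code>
}

// Wrap one section body in an opening and closing tag.
function wrapSection(tag: string, body: string): string {
  return `<${tag}>\n${body.trim()}\n</${tag}>`;
}

function assembleOutput({ prompt, tree, source }: OutputSections): string {
  const parts: string[] = [];
  // Optional sections are emitted only when they have content.
  if (prompt) parts.push(wrapSection("task", prompt));
  if (tree) parts.push(wrapSection("filetree", tree));
  parts.push(wrapSection("source_code", source));
  // Assumption: sections are separated by a single blank line.
  return parts.join("\n\n");
}

console.log(
  assembleOutput({
    prompt: "Review this code",
    tree: "Project Structure:\n└── src",
    source: "src/index.ts\n// Your code here",
  })
);
```

Every opening and closing tag costs tokens, which is consistent with the raised limits in `markdown.test.ts` later in this commit.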

packages/cli/CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,5 +1,17 @@
 # Changelog
 
+## 2.2.0
+
+### Added
+- **XML-structured output format** - Output now uses semantic XML tags for better AI parsing:
+  - `<task>...</task>` - Wraps the prompt/instructions
+  - `<filetree>...</filetree>` - Wraps the project tree structure
+  - `<source_code>...</source_code>` - Wraps all source code files
+- **External prompt file support** - Prompt files with paths (e.g., `-p docs/arch/prompt.md`) are now correctly resolved from the project root
+
+### Fixed
+- Fixed `getPromptFile` not resolving external file paths correctly when they contain directory separators
+
 ## 2.1.2
 
 ### Fixed

packages/cli/package.json

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@
   "dependencies": {
     "@clack/prompts": "^0.11.0",
     "c12": "^2.0.1",
-    "codefetch-sdk": "^2.1.0",
+    "codefetch-sdk": "workspace:*",
     "consola": "^3.3.3",
     "ignore": "^7.0.0",
     "mri": "^1.2.0",

packages/cli/src/commands/default.ts

Lines changed: 9 additions & 0 deletions
@@ -29,6 +29,15 @@ function getPromptFile(
   if (VALID_PROMPTS.has(config.defaultPromptFile)) {
     return config.defaultPromptFile;
   }
+  // Check if it's an external file path (contains path separator or is absolute)
+  // External paths should be used as-is, not nested under codefetch/prompts/
+  if (
+    config.defaultPromptFile.includes("/") ||
+    config.defaultPromptFile.includes("\\") ||
+    config.defaultPromptFile.startsWith(".")
+  ) {
+    return resolve(config.defaultPromptFile);
+  }
   return resolve(config.outputPath, "prompts", config.defaultPromptFile);
 }
 
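
Reviewer note: the same nine-line check is added to both `default.ts` and `open.ts`. One way to read the rule is as a small classifier followed by a resolve step, sketched below under stated assumptions. `looksLikeExternalPath` and `resolvePromptFile` are hypothetical names, and the explicit `isAbsolute` call is an addition of this sketch that the committed code does not make (absolute paths already contain a separator on both POSIX and Windows). The built-in `VALID_PROMPTS` lookup is left out.

```typescript
import { isAbsolute, resolve } from "node:path";

// Hypothetical classifier mirroring the three checks in the diff,
// plus isAbsolute (an assumption of this sketch, not in the commit).
function looksLikeExternalPath(promptFile: string): boolean {
  return (
    promptFile.includes("/") ||
    promptFile.includes("\\") ||
    promptFile.startsWith(".") ||
    isAbsolute(promptFile)
  );
}

// External paths resolve against the current working directory;
// bare names are looked up under <outputPath>/prompts/.
function resolvePromptFile(promptFile: string, outputPath: string): string {
  return looksLikeExternalPath(promptFile)
    ? resolve(promptFile)
    : resolve(outputPath, "prompts", promptFile);
}

console.log(resolvePromptFile("docs/arch/review-prompt.md", "codefetch"));
console.log(resolvePromptFile("my-prompt.md", "codefetch"));
```

Two consequences of the rule are worth keeping in mind: `resolve` with a single relative argument resolves against the process working directory, and the `startsWith(".")` branch also treats a bare dotfile name such as `.prompt.md` as external.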
packages/cli/src/commands/open.ts

Lines changed: 9 additions & 0 deletions
@@ -35,6 +35,15 @@ function getPromptFile(
   if (VALID_PROMPTS.has(config.defaultPromptFile)) {
     return config.defaultPromptFile;
   }
+  // Check if it's an external file path (contains path separator or is absolute)
+  // External paths should be used as-is, not nested under codefetch/prompts/
+  if (
+    config.defaultPromptFile.includes("/") ||
+    config.defaultPromptFile.includes("\\") ||
+    config.defaultPromptFile.startsWith(".")
+  ) {
+    return resolve(config.defaultPromptFile);
+  }
   return resolve(config.outputPath, "prompts", config.defaultPromptFile);
 }
 
packages/cli/test/integration/codebase-fixture.test.ts

Lines changed: 83 additions & 0 deletions
@@ -267,4 +267,87 @@ describe("Integration: codebase-test fixture", () => {
     expect(content).toContain("button.js");
     expect(content).toContain("utils");
   });
+
+  it("combines --include-dir and --include-files additively", () => {
+    const result = spawnSync(
+      "node",
+      [
+        cliPath,
+        "-o",
+        "combined-include.md",
+        "--include-dir",
+        "src/utils",
+        "--include-files",
+        "src/components/button.js",
+        "-t",
+        "3",
+      ],
+      {
+        cwd: FIXTURE_DIR,
+        encoding: "utf8",
+        stdio: ["inherit", "pipe", "pipe"],
+      }
+    );
+
+    expect(result.stderr).toBe("");
+    expect(result.stdout).toContain("Output written to");
+
+    const outPath = join(CODEFETCH_DIR, "combined-include.md");
+    expect(fs.existsSync(outPath)).toBe(true);
+
+    const content = fs.readFileSync(outPath, "utf8");
+    // Should include files from utils directory (via --include-dir)
+    expect(content).toContain("test1.ts");
+    expect(content).toContain("test2.js");
+    // Should include specific file (via --include-files)
+    expect(content).toContain("button.js");
+    // Should NOT include other files not matching the patterns
+    expect(content).not.toContain("app.js");
+    expect(content).not.toContain("header.js");
+    expect(content).not.toContain("container.js");
+    // Project tree should be present
+    expect(content).toMatch(/Project Structure:/);
+  });
+
+  it("combines multiple --include-dir directories with --include-files", () => {
+    const result = spawnSync(
+      "node",
+      [
+        cliPath,
+        "-o",
+        "multi-dir-include.md",
+        "--include-dir",
+        "src/utils,src/components/base",
+        "--include-files",
+        "src/app.js",
+        "-t",
+        "3",
+      ],
+      {
+        cwd: FIXTURE_DIR,
+        encoding: "utf8",
+        stdio: ["inherit", "pipe", "pipe"],
+      }
+    );
+
+    expect(result.stderr).toBe("");
+    expect(result.stdout).toContain("Output written to");
+
+    const outPath = join(CODEFETCH_DIR, "multi-dir-include.md");
+    expect(fs.existsSync(outPath)).toBe(true);
+
+    const content = fs.readFileSync(outPath, "utf8");
+    // Should include files from utils directory
+    expect(content).toContain("test1.ts");
+    expect(content).toContain("test2.js");
+    // Should include files from components/base directory
+    expect(content).toContain("container.js");
+    // Should include the specific file app.js
+    expect(content).toContain("app.js");
+    // Should NOT include files from other directories
+    expect(content).not.toContain("button.js");
+    expect(content).not.toContain("header.js");
+    // Project tree should be present
+    expect(content).toMatch(/Project Structure:/);
+  });
 });

packages/cli/test/unit/markdown.test.ts

Lines changed: 11 additions & 3 deletions
@@ -8,7 +8,8 @@ const UTILS_DIR = join(FIXTURE_DIR, "src/utils");
 
 describe("generateMarkdown with chunk-based token limit", () => {
   it("enforces maxTokens by chunk-based reading", async () => {
-    const MAX_TOKENS = 50;
+    // Note: XML tags (<source_code>, </source_code>) add ~4 tokens overhead
+    const MAX_TOKENS = 55;
     const files = [join(UTILS_DIR, "test1.ts"), join(UTILS_DIR, "test2.js")];
 
     const result = await generateMarkdown(files, {
@@ -72,6 +73,12 @@ describe("generateMarkdown with chunk-based token limit", () => {
       disableLineNumbers: false,
     });
 
+    // Check for XML tags
+    expect(markdown).toContain("<filetree>");
+    expect(markdown).toContain("</filetree>");
+    expect(markdown).toContain("<source_code>");
+    expect(markdown).toContain("</source_code>");
+    // Check content
     expect(markdown).toContain("Project Structure:");
     expect(markdown).toMatch(/ /);
     expect(markdown).toContain("test1.ts");
@@ -81,16 +88,17 @@ describe("generateMarkdown with chunk-based token limit", () => {
   it("respects token limits with project tree", async () => {
     const files = [join(UTILS_DIR, "test1.ts")];
 
+    // Note: XML tags (<filetree>, <source_code>) add overhead
     const markdown = await generateMarkdown(files, {
-      maxTokens: 20,
+      maxTokens: 40,
       verbose: 0,
       projectTree: 2,
       tokenEncoder: "simple",
       disableLineNumbers: false,
     });
 
     const tokens = await countTokens(markdown, "simple");
-    expect(tokens).toBeLessThanOrEqual(20);
+    expect(tokens).toBeLessThanOrEqual(40);
   });
 });
 
packages/sdk/CHANGELOG.md

Lines changed: 14 additions & 0 deletions
@@ -1,5 +1,19 @@
 # Changelog
 
+## 2.2.0
+
+### Added
+- **XML-structured output format** - Output now uses semantic XML tags for better AI parsing:
+  - `<task>...</task>` - Wraps the prompt/instructions
+  - `<filetree>...</filetree>` - Wraps the project tree structure
+  - `<source_code>...</source_code>` - Wraps all source code files
+- **Additive `includeDirs` and `includeFiles`** - These options now work together additively instead of being mutually exclusive
+- Added `hasCodebasePlaceholder` helper function to `template-parser.ts`
+
+### Fixed
+- Fixed `collectFiles` treating `includeDirs` and `includeFiles` as mutually exclusive - now they combine additively
+- Fixed `processPromptTemplate` to prepend prompts without `{{CURRENT_CODEBASE}}` placeholder to the codebase content instead of replacing it
+
 ## 2.0.4
 
 ### Added
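
Reviewer note: the `template-parser.ts` changes themselves are not visible in this diff, so the following is only a sketch of the behavior the changelog describes: substitute the codebase where the placeholder appears, otherwise prepend the prompt. The name `hasCodebasePlaceholder` comes from the changelog, but its signature here is an assumption, and `applyPrompt` is a hypothetical stand-in for whatever `processPromptTemplate` does internally.

```typescript
const CODEBASE_PLACEHOLDER = "{{CURRENT_CODEBASE}}";

// Assumed signature; only the function name appears in the changelog.
function hasCodebasePlaceholder(template: string): boolean {
  return template.includes(CODEBASE_PLACEHOLDER);
}

// Hypothetical stand-in for the prompt-processing step.
function applyPrompt(template: string, codebase: string): string {
  if (hasCodebasePlaceholder(template)) {
    // split/join replaces every occurrence and avoids the special
    // "$" replacement patterns that String.prototype.replace interprets.
    return template.split(CODEBASE_PLACEHOLDER).join(codebase);
  }
  // No placeholder: keep the prompt and put the codebase after it,
  // so the prompt no longer replaces the codebase content.
  return `${template.trimEnd()}\n\n${codebase}`;
}

console.log(applyPrompt("Review:\n{{CURRENT_CODEBASE}}", "// code"));
console.log(applyPrompt("Review this project", "// code"));
```

The blank line between prompt and codebase in the prepend branch is a choice of this sketch; the separator the SDK actually uses is not stated in the commit.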
