Merged
22 changes: 22 additions & 0 deletions .changeset/shared-cache.md
@@ -0,0 +1,22 @@
---
"@maastrich/hashup": minor
---

Share the hash cache across entries — shared utilities are now read and
hashed exactly once per `hashup` invocation instead of once per entry.

- `HashupOptions` gains `cache?: HashupCache` and `resolver?: Resolver`.
Pass the same values across multiple `hashup()` calls to reuse work.
The CLI's config mode does this automatically: every named entry in a
single `hashup` invocation now shares one cache + one resolver.
- New exports: `createHashupCache()`, `collectReachable()`, and the
`HashupCache` type.
- Hash output is byte-identical to 0.5.0 on every existing input — the
inline snapshot in `tests/examples.test.ts` still matches. Shared cache
is a pure dedupe, not a semantic change.

**Targeted break for direct `hashFile` callers:** the cache parameter is
now `HashupCache` (an object with `hashes` and `deps` maps) instead of
`Map<string, string[]>`. `hashup()` itself is untouched — the new
`cache` option is additive. Callers of `hashFile` directly should swap
`new Map<string, string[]>()` for `createHashupCache()`.
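A minimal sketch of that swap. The `HashupCache` shape is inlined from this changeset's description (an object holding `hashes` and `deps` maps) so the snippet runs standalone, without the package:

```typescript
// Sketch of the hashFile cache migration described above.
// `createHashupCache` here mirrors the shape this changeset describes.
interface HashupCache {
  hashes: Map<string, string[]>;
  deps: Map<string, string[]>;
}

function createHashupCache(): HashupCache {
  return { hashes: new Map(), deps: new Map() };
}

// Before (0.5.0): hashFile(file, new Map<string, string[]>(), resolver)
// After:          hashFile(file, createHashupCache(), resolver)
const cache = createHashupCache();
console.log(cache.hashes.size, cache.deps.size); // both start empty
```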
30 changes: 30 additions & 0 deletions docs/api/hashup.md
@@ -37,6 +37,20 @@ interface HashupOptions {
* @default "silent"
*/
logLevel?: "silent" | "warn" | "info" | "debug";

/**
* Optional shared cache. Pass the same cache across multiple calls
* to dedupe work — a file visited by entry A is reused by entry B.
* Create with `createHashupCache()`.
*/
cache?: HashupCache;

/**
* Optional shared `enhanced-resolve` resolver. Pass a shared
* instance to reuse its internal filesystem cache across calls.
* Create with `createResolver()`.
*/
resolver?: Resolver;
}
```

@@ -64,6 +78,22 @@ console.log(result.hash);
console.log(result.files);
```

### Sharing a cache across entries

```ts
import { createHashupCache, createResolver, hashup } from "@maastrich/hashup";

const cache = createHashupCache();
const resolver = createResolver();

const app = await hashup("./src/app.ts", { cache, resolver });
const worker = await hashup("./src/worker.ts", { cache, resolver });
// Files imported by both app and worker are read + hashed exactly once.
```

`result.files` still returns only the files reachable from each call's
own roots (entry + extras), not the full cache contents.

## Notes

- The `files` array contains every file that was hashed, including files reached
7 changes: 7 additions & 0 deletions docs/api/index.md
@@ -25,6 +25,10 @@ lower-level utilities for advanced use cases.
internally. Exports `type LogLevel` and `type Logger`.
- [`isInNodeModules(file)`](./utilities#isinnodemodules) — predicate used by
the hasher to decide whether to walk into a resolved path.
- [`createHashupCache()`](./utilities#createhashupcache) — build a
`HashupCache` to share across multiple `hashup()` or `hashFile()` calls.
- [`collectReachable(roots, cache)`](./utilities#createhashupcache) — rebuild
a per-call file list from the cache's dependency edges.

## Config (subpath export)

@@ -53,6 +57,9 @@ import {
createLogger,
isLogLevel,
isInNodeModules,
createHashupCache,
collectReachable,
type HashupCache,
type Logger,
type LogLevel,
} from "@maastrich/hashup";
35 changes: 28 additions & 7 deletions docs/api/utilities.md
@@ -42,22 +42,43 @@ Parses a file's source with `es-module-lexer` and returns its static import
specifiers. Type-only imports and dynamic imports with non-literal specifiers
are excluded.

## createHashupCache

```ts
interface HashupCache {
hashes: Map<string, string[]>;
deps: Map<string, string[]>;
}

function createHashupCache(): HashupCache;
function collectReachable(roots: readonly string[], cache: HashupCache): string[];
```

An in-memory cache scoped to one consumer's lifetime — not persisted,
not shared across processes. `hashes` stores each file's flattened
hash list; `deps` stores each file's direct resolved dependency paths.
Pass the same `HashupCache` to multiple `hashup()` or `hashFile()` calls
to dedupe work. `collectReachable` walks `deps` iteratively to rebuild
a per-call file list (used internally by `hashup()` to produce
`result.files`).
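A standalone illustration of that traversal. The cache shape matches the interface above, and `collectReachable` is inlined with the documented signature so the snippet runs without the package installed; the file paths are made up:

```typescript
// A HashupCache seeded by hand, then queried per-root. Only the
// closure of the given roots is returned, not the whole cache.
interface HashupCache {
  hashes: Map<string, string[]>;
  deps: Map<string, string[]>;
}

function collectReachable(roots: readonly string[], cache: HashupCache): string[] {
  const visited = new Set<string>();
  const stack = [...roots];
  while (stack.length > 0) {
    const file = stack.pop()!;
    if (visited.has(file)) continue;
    visited.add(file);
    for (const dep of cache.deps.get(file) ?? []) {
      if (!visited.has(dep)) stack.push(dep);
    }
  }
  return Array.from(visited);
}

const cache: HashupCache = { hashes: new Map(), deps: new Map() };
cache.deps.set("/src/app.ts", ["/src/util.ts"]);
cache.deps.set("/src/worker.ts", ["/src/util.ts"]);
cache.deps.set("/src/util.ts", []);

console.log(collectReachable(["/src/app.ts"], cache).sort());
// → ["/src/app.ts", "/src/util.ts"] — worker.ts is cached but not reachable
```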

## hashFile

```ts
function hashFile(
file: string,
cache: Map<string, string[]>,
cache: HashupCache,
resolver: Resolver,
logger?: Logger,
): Promise<string[]>;
```

Hashes a file and all its transitive static imports. Results are memoized in
`cache` — pass the same `Map` across multiple calls to dedupe work. On error
(file read or parse failure) the failure is sent through `logger.warn` and an
empty array is returned. `logger` defaults to a silent logger; build one with
[`createLogger`](#createlogger) when you want diagnostics on stderr.
`cache` — pass the same `HashupCache` across multiple calls to dedupe work.
On error (file read or parse failure) the failure is sent through
`logger.warn` and an empty array is returned. `logger` defaults to a silent
logger; build one with [`createLogger`](#createlogger) when you want
diagnostics on stderr.

Imports that resolve into `node_modules` are treated as opaque: the resolved
path is skipped, its files are never read, and the dependency's own imports
@@ -111,10 +132,10 @@ and hashing the result. Order-sensitive — pass hashes in a stable order.
## Composing Your Own Pipeline

```ts
import { createResolver, hashFile, combineHashes } from "@maastrich/hashup";
import { combineHashes, createHashupCache, createResolver, hashFile } from "@maastrich/hashup";

const resolver = createResolver();
const cache = new Map<string, string[]>();
const cache = createHashupCache();

const entries = ["./src/a.ts", "./src/b.ts"];
const allHashes: string[] = [];
16 changes: 14 additions & 2 deletions src/cli/run-config-mode.ts
@@ -1,5 +1,7 @@
import { dirname, resolve } from "node:path";
import { createHashupCache, type HashupCache } from "../lib/cache.js";
import { combineHashes } from "../lib/combine-hashes.js";
import { createResolver } from "../lib/create-resolver.js";
import { hashup, type HashupResult } from "../lib/hashup.js";
import type { LogLevel } from "../lib/logger.js";
import { expandPaths } from "./expand-paths.js";
@@ -33,6 +35,12 @@ export async function runConfigMode(input: RunConfigModeInput): Promise<RunConfi
fromFile: loaded.data.baseDir,
});

// One shared cache + resolver for every entry in this invocation:
// files imported by multiple named entries (shared utilities, common
// types) are read and hashed once instead of once per entry.
const cache = createHashupCache();
const resolver = createResolver();

const results: Record<string, HashupResult> = {};
for (const [name, entry] of Object.entries(loaded.data.entries)) {
const baseDir = entry.baseDir !== undefined ? resolveFrom(configDir, entry.baseDir) : rootBase;
@@ -45,7 +53,7 @@ }
}
const extras = entry.extras ? await expandPaths(entry.extras, baseDir) : [];
const logLevel = input.logLevel ?? loaded.data.logLevel;
results[name] = await hashEntrySet(entryFiles, extras, baseDir, logLevel);
results[name] = await hashEntrySet(entryFiles, extras, baseDir, logLevel, cache, resolver);
}

return {
@@ -83,12 +91,16 @@ async function hashEntrySet(
extras: string[],
baseDir: string,
logLevel: LogLevel | undefined,
cache: HashupCache,
resolver: ReturnType<typeof createResolver>,
): Promise<HashupResult> {
const perFile: HashupResult[] = [];
for (let i = 0; i < entryFiles.length; i++) {
const entry = entryFiles[i]!;
const options =
i === 0 && extras.length > 0 ? { extras, baseDir, logLevel } : { baseDir, logLevel };
i === 0 && extras.length > 0
? { extras, baseDir, logLevel, cache, resolver }
: { baseDir, logLevel, cache, resolver };
perFile.push(await hashup(entry, options));
}

1 change: 1 addition & 0 deletions src/index.ts
@@ -1,6 +1,7 @@
export { hashup, type HashupOptions, type HashupResult } from "./lib/hashup.js";
export { createLogger, isLogLevel, type Logger, type LogLevel } from "./lib/logger.js";
export { isInNodeModules } from "./lib/is-in-node-modules.js";
export { createHashupCache, collectReachable, type HashupCache } from "./lib/cache.js";
export { combineHashes } from "./lib/combine-hashes.js";
export { createContentHash } from "./lib/create-content-hash.js";
export { createResolver } from "./lib/create-resolver.js";
48 changes: 48 additions & 0 deletions src/lib/cache.ts
@@ -0,0 +1,48 @@
/**
* In-memory memoization for the hasher. Scoped to one consumer's
* lifetime — not persisted, not shared across processes.
*
* Passing the same cache to multiple `hashup()` calls dedupes both
* work (a file visited by entry A is reused by entry B) and
* computation (the file's content hash is recomputed at most once).
*
* Two parallel maps keyed by absolute file path:
* - `hashes`: the flattened hash list (self + transitive deps).
* Returned directly to callers and combined into the final digest.
* - `deps`: the file's direct resolved dependency paths. Used by
* `collectReachable` to rebuild the per-call file list without
* re-walking the graph.
*/
export interface HashupCache {
hashes: Map<string, string[]>;
deps: Map<string, string[]>;
}

export function createHashupCache(): HashupCache {
return { hashes: new Map(), deps: new Map() };
}

/**
* Compute the transitive closure of `roots` against the cache's direct
* dependency edges. Iterative — never recurses — so deep graphs cannot
* blow the stack.
*/
export function collectReachable(roots: readonly string[], cache: HashupCache): string[] {
const visited = new Set<string>();
const stack: string[] = [];
for (let i = 0; i < roots.length; i++) {
stack.push(roots[i] as string);
}
while (stack.length > 0) {
const file = stack.pop() as string;
if (visited.has(file)) continue;
visited.add(file);
const depList = cache.deps.get(file);
if (!depList) continue;
for (let i = 0; i < depList.length; i++) {
const d = depList[i] as string;
if (!visited.has(d)) stack.push(d);
}
}
return Array.from(visited);
}
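The doc comment stresses that the traversal is iterative. A standalone check (mirroring the function above, inlined so it runs on its own) that a very deep chain, with a cycle back to the root, completes without recursion:

```typescript
// Exercise the iterative walk on a 100_000-deep import chain whose
// tail cycles back to the root: no recursion, so no stack overflow,
// and the cycle terminates via the visited set.
interface HashupCache {
  hashes: Map<string, string[]>;
  deps: Map<string, string[]>;
}

function collectReachable(roots: readonly string[], cache: HashupCache): string[] {
  const visited = new Set<string>();
  const stack = [...roots];
  while (stack.length > 0) {
    const file = stack.pop()!;
    if (visited.has(file)) continue;
    visited.add(file);
    for (const dep of cache.deps.get(file) ?? []) {
      if (!visited.has(dep)) stack.push(dep);
    }
  }
  return Array.from(visited);
}

const cache: HashupCache = { hashes: new Map(), deps: new Map() };
const depth = 100_000;
for (let i = 0; i < depth; i++) {
  cache.deps.set(`f${i}`, [i + 1 < depth ? `f${i + 1}` : "f0"]);
}

console.log(collectReachable(["f0"], cache).length); // → 100000
```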
24 changes: 15 additions & 9 deletions src/lib/hash-file.ts
@@ -1,4 +1,5 @@
import type { Resolver } from "enhanced-resolve";
import { type HashupCache } from "./cache.js";
import { createContentHash } from "./create-content-hash.js";
import { extractImports } from "./extract-imports.js";
import { isInNodeModules } from "./is-in-node-modules.js";
@@ -9,41 +10,45 @@ import { resolveImport } from "./resolve-import.js";

export async function hashFile(
file: string,
cache: Map<string, string[]>,
cache: HashupCache,
resolver: Resolver,
logger: Logger = createLogger("silent"),
): Promise<string[]> {
const cached = cache.get(file);
const cached = cache.hashes.get(file);
if (cached) {
return cached;
}

try {
const content = await readFileContent(file);
const hashes = [createContentHash(content)];
// Seed the cache before recursing so that circular imports terminate:
// on a cycle A → B → A, the revisit of A returns this placeholder
// instead of walking forever until the stack blows.
cache.set(file, hashes);
const deps: string[] = [];
// Seed both caches before recursing so circular imports terminate:
// on a cycle A → B → A, the revisit of A hits `cache.hashes` and
// returns the placeholder instead of walking forever.
cache.hashes.set(file, hashes);
cache.deps.set(file, deps);

const imports = await extractImports(file, content);
const dependencyHashes = await hashDependencies(imports, file, cache, resolver, logger);
const dependencyHashes = await hashDependencies(imports, file, cache, resolver, logger, deps);
pushAll(hashes, dependencyHashes);

return hashes;
} catch (error) {
logger.warn(`Failed to hash file ${file}:`, error);
cache.delete(file);
cache.hashes.delete(file);
cache.deps.delete(file);
return [];
}
}

async function hashDependencies(
imports: string[],
sourceFile: string,
cache: Map<string, string[]>,
cache: HashupCache,
resolver: Resolver,
logger: Logger,
deps: string[],
): Promise<string[]> {
const hashes: string[] = [];

Expand All @@ -59,6 +64,7 @@ async function hashDependencies(
logger.debug(`Skipping node_modules dependency: ${resolved}`);
continue;
}
deps.push(resolved);
const resolvedHashes = await hashFile(resolved, cache, resolver, logger);
pushAll(hashes, resolvedHashes);
}