fix(redis): timeout hung fetchers in cachedFetchJson/cachedFetchJsonWithMeta#3672
fix(redis): timeout hung fetchers in cachedFetchJson/cachedFetchJsonWithMeta#3672fuleinist wants to merge 1 commit into
Conversation
…ithMeta What was wrong: - cachedFetchJson and cachedFetchJsonWithMeta store in-flight promises in a Map keyed by cache key. If a fetcher never settles (Transport.fetch with no internal AbortController / no signal timeout), the entry stays in the Map for the entire Vercel Edge isolate lifetime. - Every concurrent and subsequent caller for that key gets the same unresolved promise and hangs forever. The inflight entry is only cleaned up in .finally(), which never runs for a never-settling promise. What this fixes: - Wraps the fetcher in Promise.race with a 30-second timeout. If the fetcher doesn't settle within FETCHER_TIMEOUT_MS, the race rejects and the .finally() block runs to clean up the inflight entry. - Applied to both cachedFetchJson and cachedFetchJsonWithMeta. - FETCHER_TIMEOUT_MS is exported so callers can verify the constant or override it in tests. ## Changes - server/_shared/redis.ts: add FETCHER_TIMEOUT_MS (30s) and wrap fetcher in Promise.race in both functions - tests/redis-caching.test.mjs: add 'inflight timeout guard' describe block with 3 tests (hung fetcher rejection for both functions + fast fetcher sanity check) ## Testing - [x] Unit test added (inflight timeout guard suite, 3 tests) - [x] All 21 tests pass - [x] Existing tests unchanged Fixes koala73#3539
|
@fuleinist is attempting to deploy a commit to the World Monitor Team on Vercel. A member of the Team first needs to authorize it. |
Greptile SummaryThis PR fixes the inflight-map poisoning bug by wrapping fetchers in
Confidence Score: 3/5The production fix is logically correct but ships two defects: uncancelled timers on every successful cache miss and two tests that will fail in CI due to a mismatched elapsed-time assertion. The core Both Important Files Changed
Sequence DiagramsequenceDiagram
participant Caller
participant cachedFetchJson
participant inflightMap as inflight Map
participant Redis
participant Fetcher
participant Timer as setTimeout (30s)
Caller->>cachedFetchJson: cachedFetchJson(key, ttl, fetcher)
cachedFetchJson->>Redis: getCachedJson(key)
Redis-->>cachedFetchJson: null (cache miss)
cachedFetchJson->>inflightMap: "get(key) -> undefined"
cachedFetchJson->>Timer: setTimeout(reject, 30000) [handle NOT saved]
cachedFetchJson->>Fetcher: fetcher()
cachedFetchJson->>inflightMap: set(key, promise)
alt Fetcher resolves before timeout
Fetcher-->>cachedFetchJson: result
cachedFetchJson->>Redis: setCachedJson(key, result, ttl)
cachedFetchJson->>inflightMap: delete(key) [finally]
Note over Timer: Timer still live for ~30s (handle not cleared)
cachedFetchJson-->>Caller: result
else Fetcher hangs - timeout fires
Timer-->>cachedFetchJson: reject(TimeoutError)
cachedFetchJson->>cachedFetchJson: .catch() logs warning
cachedFetchJson->>inflightMap: delete(key) [finally]
cachedFetchJson-->>Caller: throw TimeoutError
end
|
| // guard is not working. | ||
| assert.ok(elapsed < 5000, `timeout should fire well under 5s, got ${elapsed}ms`); | ||
| } finally { | ||
| globalThis.fetch = originalFetch; | ||
| restoreEnv(); | ||
| } | ||
| }); |
There was a problem hiding this comment.
Elapsed-time assertions will always fail at the real timeout
Both hung-fetcher tests assert elapsed < 5000 (5 s), but FETCHER_TIMEOUT_MS is 30_000 ms. importRedisFresh() performs a real dynamic import of the unmodified module — no fake timers, no constant override. The test runner will wait ~30 s for the timeout to fire, then the assert.ok(elapsed < 5000, ...) assertion fails with "timeout should fire well under 5s, got ~30000ms". To fix this the tests need either (a) fake/mock timers so the 30-second setTimeout fires immediately, or (b) a test-specific override of FETCHER_TIMEOUT_MS via importPatchedTsModule.
Summary
Fixes a bug where a fetcher that never settles (no internal AbortController / no signal timeout) would poison the inflight Map for the entire Vercel Edge isolate lifetime. Both
cachedFetchJsonandcachedFetchJsonWithMetanow race the fetcher against a 30-second timeout, ensuring the.finally()cleanup always runs.Problem
cachedFetchJsonandcachedFetchJsonWithMetacoalesce concurrent cache-miss requests by storing in-flight promises in aMap<string, Promise<unknown>>. If a fetcher never resolves or rejects (e.g., a Transport.fetch with no timeout signal), the entry is only cleaned up in.finally()— which never runs. Every concurrent and subsequent caller for that cache key gets the same unresolved promise and hangs indefinitely.Solution
Wrap the fetcher in
Promise.race([fetcher(), timeoutPromise]). If the fetcher doesn't settle withinFETCHER_TIMEOUT_MS(30 seconds), the race rejects, the.catch()block logs the warning, and the.finally()block cleans up the inflight entry. Applied to both functions.FETCHER_TIMEOUT_MSis exported for testability.Testing
Files Changed
server/_shared/redis.ts— added FETCHER_TIMEOUT_MS constant and Promise.race wrappers in both functionstests/redis-caching.test.mjs— added 'inflight timeout guard' describe block (3 tests)Fixes #3539