fix(producer): pass resolved config to encodeFramesFromDir so FFMPEG_ENCODE_TIMEOUT_MS is honored#1352
Conversation
…ENCODE_TIMEOUT_MS is honored encodeFramesFromDir takes a trailing config param carrying ffmpegEncodeTimeout, but the encode stage called it with only 5 of its 6 arguments, so the timeout always fell back to the hardcoded 600000ms default and FFMPEG_ENCODE_TIMEOUT_MS was silently ignored. Any encode exceeding 600s wall time (e.g. vp9 alpha on CPU-only machines) was deterministically killed with signal 15. Pass `job.config.producerConfig ?? resolveConfig()` at the call site, matching the existing pattern in renderOrchestrator. The GIF encode path had the same flaw — falling back to DEFAULT_CONFIG instead of the env-aware resolveConfig() — and is fixed the same way. Fixes heygen-com#1348
jrusso1020
left a comment
There was a problem hiding this comment.
Bug is real and the fix is correct. Verified against the engine source.
Root cause confirmed
encodeFramesFromDir signature at packages/engine/src/services/chunkEncoder.ts:377-383 takes config?: Partial<Pick<EngineConfig, "ffmpegEncodeTimeout">> as the 6th arg, and reads it at :428 as config?.ffmpegEncodeTimeout ?? DEFAULT_CONFIG.ffmpegEncodeTimeout. With config undefined, this is unconditionally DEFAULT_CONFIG.ffmpegEncodeTimeout = 600,000ms, so FFMPEG_ENCODE_TIMEOUT_MS env override was dead-on-arrival on this code path. PR description is accurate.
Fix is correct
job.config.producerConfig ?? resolveConfig() matches the existing pattern in renderOrchestrator.ts:1521. producerConfig is only populated on distributed-render inputs (parent already resolved env), so the resolveConfig() fallback restores env-aware behavior for in-process renders. Same pattern applied to the GIF path's timeout — that one was previously hardcoding DEFAULT_CONFIG.ffmpegEncodeTimeout directly, also bypassing env. Good catch including it in scope.
Follow-up worth flagging (not blocking)
encodeFramesChunkedConcat (the chunked-concat encode path at packages/engine/src/services/chunkEncoder.ts:484) has no config parameter AND no per-chunk timeout at all — the inner ffmpeg.spawn just waits on close. So when enableChunkedEncode === true (closed-GOP path for distributed renders), FFMPEG_ENCODE_TIMEOUT_MS still has no effect. Same shape of bug, different surface. Worth a separate PR adding a timeout (and the config param) to that function; matches what this PR did for the non-chunked path.
Test coverage
PR body says "encodeStage.test.ts — 2/2 pass" — existing tests pass but no new assertion that the config is now threaded through. Worth a unit test that mocks encodeFramesFromDir, sets FFMPEG_ENCODE_TIMEOUT_MS=12345, and asserts the 6th argument has ffmpegEncodeTimeout === 12345. Locks in the contract so a future refactor doesn't silently re-break it. Low priority since the fix is mechanically obvious; nit-level.
CI
Only the WIP check ran (success). Other required checks haven't fired on this PR yet — worth a re-trigger before merge.
Summary
Small, surgical, correct fix to a real silent bug. Approve. Suggest a follow-up to extend the same fix to encodeFramesChunkedConcat for parity.
Review by Jerrai
…meout Covers the heygen-com#1348 regression surface: asserts the encode stage passes producerConfig as the trailing config argument to encodeFramesFromDir when present, falls back to resolveConfig() for in-process renders, and threads the resolved timeout into both GIF encode passes. Verified the suite fails (3/3) against the pre-fix call site.
|
Added the unit test suggested in review (964abe5): asserts the trailing config arg reaches encodeFramesFromDir with the producerConfig timeout when present, falls back to resolveConfig() for in-process renders, and covers both GIF passes. Verified it fails 3/3 against the pre-fix call site. Will follow up with a separate PR for the missing timeout in encodeFramesChunkedConcat. |
miguel-heygen
left a comment
There was a problem hiding this comment.
Review
Verdict: superseded by #1361
The fix itself is correct — job.config.producerConfig ?? resolveConfig() as the 6th arg threads the env-resolved timeout through. Tests properly capture the config argument and verify both the producerConfig-present and absent paths, plus GIF two-pass.
However, PR #1361 from the same author includes this exact fix plus extends the config param to encodeFramesChunkedConcat (which had its own problem: raw spawn("ffmpeg", ...) with no timeout at all). If we merge this one, #1361's encodeStage.ts changes will conflict.
Recommendation: Close this in favor of #1361, which is the complete fix.
|
Closing in favor of #1361, which contains this fix plus the chunked-path timeout — as recommended in review. Thanks for the thorough reviews on both. |
Summary
Fixes #1348.
encodeFramesFromDirtakes a trailingconfigparameter carryingffmpegEncodeTimeout, but the encode stage called it with only 5 of its 6 arguments. Inside, the timeout resolved asconfig?.ffmpegEncodeTimeout ?? DEFAULT_CONFIG.ffmpegEncodeTimeout→ always 600000 ms, soFFMPEG_ENCODE_TIMEOUT_MSwas silently ignored on this path and any encode exceeding 600s wall time (e.g. vp9 alpha on CPU-only machines) was deterministically killed with signal 15.Changes
encodeStage.ts: passjob.config.producerConfig ?? resolveConfig()as the 6th argument toencodeFramesFromDir, matching the existing pattern inrenderOrchestrator.ts.producerConfigis only populated on distributed-render inputs, so for in-process renders the env-awareresolveConfig()is the fallback that restores the documentedFFMPEG_ENCODE_TIMEOUT_MSoverride behavior.DEFAULT_CONFIGinstead of the env-awareresolveConfig()— and is fixed the same way.Verification
bun run --filter '@hyperframes/producer' typecheck— passesbun test src/services/render/stages/encodeStage.test.ts— 2/2 passchunkEncoder.test.ts— 63/63 passoxlint/oxfmt --checkclean on the changed file