
Commit 90c7c8b

Update vllm/v1/worker/gpu_model_runner.py
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
1 parent 3fe5d0e commit 90c7c8b

File tree

1 file changed: +1 -1 lines changed

vllm/v1/worker/gpu_model_runner.py

Lines changed: 1 addition & 1 deletion
@@ -2618,7 +2618,7 @@ def _determine_batch_execution_and_padding(
         ubatch_slices, num_tokens_across_dp = None, None
         if self.vllm_config.parallel_config.data_parallel_size > 1:
             # Disable DP padding when running eager to avoid excessive padding when
-            # running prefills. This lets us set enforce_eager on the prefiller in
+            # running prefills. This lets us set cudagraph_mode="NONE" on the prefiller in
             # a P/D setup and still use CUDA graphs (enabled by this padding) on the
             # decoder.
             allow_dp_padding = (
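
For readers outside the vLLM tree, the updated comment describes a prefill/decode (P/D) disaggregated deployment: the prefill instance turns CUDA graphs off, so the DP-padding branch guarded by allow_dp_padding is skipped and long prefills are not padded up, while the decode instance keeps CUDA graphs and therefore keeps the padding. Below is a minimal sketch of that split, not part of this commit; it assumes cudagraph_mode is accepted through compilation_config and that data_parallel_size is a valid keyword argument to vllm.LLM (both may differ across vLLM versions), and the model name is a placeholder.

# Hypothetical P/D sketch (not from this commit). Assumes vLLM accepts
# cudagraph_mode via compilation_config and data_parallel_size as an
# engine argument; the model name is a placeholder.
from vllm import LLM

# Prefill instance: CUDA graphs disabled, so with this change the
# DP-padding branch is skipped and prefill batches are not padded.
prefiller = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder
    data_parallel_size=2,
    compilation_config={"cudagraph_mode": "NONE"},
)

# Decode instance (in practice a separate process or server): leave
# cudagraph_mode at its default so CUDA graphs stay enabled and DP
# padding keeps batch shapes graph-capturable across ranks.
decoder = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder
    data_parallel_size=2,
)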
