
Conversation

@ehsk ehsk (Collaborator) commented Jan 19, 2026

This PR upgrades vLLM to v0.10.0, the most recent version in which the V0 engine has not yet been removed (an intermediate step toward fully migrating to V1 in #121).

Notable upgrades:
python: 3.11 to 3.12
vllm: 0.8.5.post1 to 0.10.0
torch: 2.6.0 to 2.7.1
transformers: 4.51.1 to 4.57.6
flash-attention: 2.7.4.post1 to 2.8.3
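
After rebuilding the environment, a quick version check can catch a stale install. A minimal sketch (not part of this PR; the flash-attn distribution name is an assumption):

# Hypothetical sanity check, not part of this PR: verify the rebuilt
# environment matches the versions listed above.
import sys
from importlib.metadata import version

expected = {
    "vllm": "0.10.0",
    "torch": "2.7.1",        # wheels may carry a local suffix such as "+cu126"
    "transformers": "4.57.6",
    "flash-attn": "2.8.3",   # distribution name assumed
}

assert sys.version_info[:2] == (3, 12), f"expected Python 3.12, got {sys.version}"
for pkg, want in expected.items():
    got = version(pkg)
    assert got.startswith(want), f"{pkg}: expected {want}, got {got}"
print("environment matches the upgraded versions")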

GSPO (blue = v0.8.5.post1, pink/purple = v0.10.0): logprobs and reward plots.

GRPO (orange = v0.8.5.post1, purple = v0.10.0): logprobs and reward plots.

The latest version we could potentially upgrade to is v0.10.2, but the flash attention bundled with vLLM raises this error:

[rank0]: torch.AcceleratorError: CUDA error: the provided PTX was compiled with an unsupported toolchain.
[rank0]: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[rank0]: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[rank0]: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
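
As the traceback hints, pinning down the failing kernel needs synchronous launches; a minimal sketch of that debugging step (the variable must be set before CUDA is initialized, i.e. before importing torch or vllm):

# Hypothetical debugging snippet: force synchronous CUDA launches so the
# stack trace points at the kernel that actually failed.
import os

os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after setting the env var on purpose

x = torch.randn(4, 4, device="cuda")
print(x.sum())  # errors now surface at the failing call site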

Also, some V0 features we use in the code, such as the multi-step scheduler, were removed in versions after v0.10.0 (this is fine in our case, as we don't normally use it).

@ehsk ehsk requested a review from rafapi January 20, 2026 01:36
@ehsk ehsk mentioned this pull request Jan 20, 2026
@ehsk ehsk self-assigned this Jan 20, 2026

# Run HTTP server
sock_addr = (args.host or "", args.port)
sock = create_server_socket(sock_addr)
@rafapi rafapi (Collaborator) commented Jan 21, 2026

Not sure when this happened, but we are running the HTTP server twice without dropping the previous one (see line 159). We need to remove this line and the one above.

@ehsk ehsk (Collaborator, Author) replied:

Looks like it's intentional; see the comment in the first run:

# workaround to make sure that we bind the port before the engine is set up.
# This avoids race conditions with ray.
# see https://github.com/vllm-project/vllm/issues/8204
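
For context, the pattern is roughly: claim the port up front, do the slow Ray-coordinated engine setup, then serve on the socket that is already bound. A minimal sketch of that idea, assuming a uvicorn-based server (names and ports here are illustrative, not the project's actual code):

import asyncio
import socket

import uvicorn
from fastapi import FastAPI

app = FastAPI()


def create_server_socket(addr: tuple[str, int]) -> socket.socket:
    # Bind the port right away so concurrent Ray workers cannot race for it
    # while the engine is still being set up.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(addr)
    return sock


# 1) Bind the port immediately, before any heavy initialization.
sock = create_server_socket(("0.0.0.0", 8000))

# 2) ... expensive engine setup would happen here (Ray workers, model load) ...

# 3) Serve on the socket we already own instead of binding a second time.
config = uvicorn.Config(app, host="0.0.0.0", port=8000)
asyncio.run(uvicorn.Server(config).serve(sockets=[sock]))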


if args.load_as_bf16:
-    loading_args["torch_dtype"] = torch.bfloat16
+    loading_args["dtype"] = torch.bfloat16
@rafapi rafapi (Collaborator) commented:

Why this change? Has the transformers API changed here?

@ehsk ehsk (Collaborator, Author) replied:

torch_dtype became deprecated and prints a warning; dtype is the replacement.
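
For illustration, the caller-side change looks roughly like this (a sketch; the model name is a placeholder, not the project's):

import torch
from transformers import AutoModelForCausalLM

# Before (deprecated, warns on recent transformers releases):
#   AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B", torch_dtype=torch.bfloat16)

# After: the argument is now called `dtype`.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B", dtype=torch.bfloat16)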


logger.info(f"Merge lora checkpoint {lora_model_path}")
model = lora_load_and_merge(lora_model_path, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True)
model = lora_load_and_merge(lora_model_path, dtype=torch.bfloat16, low_cpu_mem_usage=True)
@rafapi rafapi (Collaborator) commented:

same as above

@ehsk ehsk (Collaborator, Author) replied:

torch_dtype was renamed to dtype.

@rafapi rafapi (Collaborator) left a comment:

LGTM!!

@ehsk ehsk merged commit 64073e3 into main Jan 21, 2026