With vLLM TP > 1 and nvidia.com/gpucores set, vLLM is unable to run. #142

@chaunceyjiang

Description

With vLLM TP > 1 and nvidia.com/gpucores set, vLLM is unable to run.

For example:

export CUDA_DEVICE_SM_LIMIT=40
vllm serve /home/chauncey/qwen3-8b -tp 2 --enforce-eager
vllm bench serve --model /home/chauncey/qwen3-8b --endpoint /v1/completions --dataset-name random --random-input 5 --random-output 5  --num-prompts 1000

With CUDA_DEVICE_SM_LIMIT set, vLLM is unable to run. After unsetting CUDA_DEVICE_SM_LIMIT, vLLM works normally.
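To make the comparison above easier to repeat, here is a minimal sketch of an A/B launcher that starts the same `vllm serve` command with and without the SM limit. The `MODEL`, `TP`, and `DRY_RUN` variables and the `serve_variant` helper are assumptions introduced for illustration, not part of vLLM or HAMi; by default the script only prints the commands so they can be checked without a GPU.

```shell
#!/usr/bin/env bash
# A/B reproduction sketch. MODEL and TP default to the values from the report.
# DRY_RUN is a hypothetical switch: 1 prints the command, 0 actually runs it.
MODEL="${MODEL:-/home/chauncey/qwen3-8b}"
TP="${TP:-2}"
DRY_RUN="${DRY_RUN:-1}"

# Launch (or print) `vllm serve` with an optional SM limit.
# $1: value for CUDA_DEVICE_SM_LIMIT, or an empty string to leave it unset.
serve_variant() {
  local limit="$1"
  local cmd=(vllm serve "$MODEL" -tp "$TP" --enforce-eager)
  if [ -n "$limit" ]; then
    if [ "$DRY_RUN" = "1" ]; then
      echo "CUDA_DEVICE_SM_LIMIT=$limit ${cmd[*]}"
    else
      # Set the limit for this one process only.
      CUDA_DEVICE_SM_LIMIT="$limit" "${cmd[@]}"
    fi
  else
    if [ "$DRY_RUN" = "1" ]; then
      echo "${cmd[*]}"
    else
      # Make sure an inherited limit does not leak into the baseline run.
      env -u CUDA_DEVICE_SM_LIMIT "${cmd[@]}"
    fi
  fi
}

serve_variant 40   # case reported as failing
serve_variant ""   # case reported as working (limit unset)
```

With `DRY_RUN=0`, run one variant at a time, since `vllm serve` stays in the foreground; the `vllm bench serve` command from the report can then be pointed at whichever variant is up.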

Possibly related code in HAMi-core:
https://github.com/Project-HAMi/HAMi-core/blob/main/src/multiprocess/multiprocess_utilization_watcher.c#L205
