With vLLM TP > 1 and `nvidia.com/gpucores` set, vLLM is unable to run. #142

When vLLM is started with tensor parallelism greater than 1 (TP > 1) and `nvidia.com/gpucores` is set, it fails to run.

To reproduce, start the server with an SM limit set and then run the benchmark against it:
```
export CUDA_DEVICE_SM_LIMIT=40
vllm serve /home/chauncey/qwen3-8b -tp 2 --enforce-eager
```

```
vllm bench serve --model /home/chauncey/qwen3-8b --endpoint /v1/completions --dataset-name random --random-input 5 --random-output 5  --num-prompts 1000
```

With `CUDA_DEVICE_SM_LIMIT` set, vLLM is unable to run. After unsetting `CUDA_DEVICE_SM_LIMIT`, vLLM works normally.
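To compare the two cases more systematically, a small helper like the one below could wait for the server to report healthy before the benchmark is started. This is only a sketch: `wait_for_url` is a hypothetical helper (not part of vLLM or HAMi), and the usage comment assumes vLLM's default port 8000 and its `/health` route.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a URL once per second until it answers
# successfully, or give up after the given number of seconds.
wait_for_url() {
  local url="$1" timeout_s="${2:-300}" waited=0
  until curl -sf -o /dev/null "$url"; do
    sleep 1
    waited=$((waited + 1))
    if [ "$waited" -ge "$timeout_s" ]; then
      return 1   # server never became reachable within the timeout
    fi
  done
  return 0
}

# Example usage (assumed default port 8000; not executed here):
#   CUDA_DEVICE_SM_LIMIT=40 vllm serve /home/chauncey/qwen3-8b -tp 2 --enforce-eager &
#   wait_for_url http://localhost:8000/health 600 \
#     || echo "server did not become healthy with the SM limit set"
```

Running the same check with `CUDA_DEVICE_SM_LIMIT` unset would show whether the failure happens during server startup or only once the benchmark sends requests.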


https://github.com/Project-HAMi/HAMi-core/blob/main/src/multiprocess/multiprocess_utilization_watcher.c#L205