Conversation

@TimPietruskyRunPod (Contributor)

FlashInfer was installed for cu121/torch2.3, but vLLM 0.11.0 brings torch 2.8.0, causing binary incompatibility and import errors.

vLLM will automatically use FlashAttention or other backends.

Fixes unhealthy workers caused by FlashInfer import errors.

@samuelexferri

Also update to vLLM 0.12.0
