Add dynamic KVCache and metadata resolution to support Qwen3 serving. #208

Open
copybara-service[bot] wants to merge 1 commit into main from test_938946803

Add dynamic KVCache and metadata resolution to support Qwen3 serving.

This change enables Raiden-LM serving support for the Qwen3 model family:

  • Scans entry parameter list dynamically to resolve request_distribution index.
  • Dynamically derives KVCacheManager layer count and block slice_byte_size from HLO argument shapes.
  • Connects the max_tokens benchmark flag to SessionConfig.
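
As an illustration of the first two bullets, here is a minimal Python sketch of resolving a parameter index by name and deriving a layer count and per-block byte size from argument shapes. It is not the code in this change: the `Param` type, the `kv_cache` name prefix, the one-cache-argument-per-layer layout, and the `[num_blocks, ...]` shape convention are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Param:
    """Hypothetical stand-in for one HLO entry parameter (not a real XLA type)."""
    name: str
    dtype: str
    shape: tuple


# Element sizes in bytes for the dtypes used in this sketch.
DTYPE_BYTES = {"bf16": 2, "f16": 2, "f32": 4, "s32": 4}


def find_param_index(params, name):
    """Scan the entry parameter list and return the index of `name`."""
    for index, param in enumerate(params):
        if param.name == name:
            return index
    raise ValueError(f"parameter {name!r} not found in entry parameter list")


def derive_kv_cache_config(params, prefix="kv_cache"):
    """Return (layer_count, slice_byte_size) from the cache argument shapes.

    Assumptions: one cache argument per layer, identified by a name prefix,
    and the leading dimension is the number of blocks, so the remaining
    dimensions describe a single block.
    """
    caches = [p for p in params if p.name.startswith(prefix)]
    if not caches:
        raise ValueError("no KV cache arguments found")
    first = caches[0]
    if any(p.shape != first.shape or p.dtype != first.dtype for p in caches):
        raise ValueError("KV cache arguments disagree on shape or dtype")
    slice_byte_size = prod(first.shape[1:]) * DTYPE_BYTES[first.dtype]
    return len(caches), slice_byte_size


# Example entry parameter list for a hypothetical two-layer model.
# Cache shape: (num_blocks, block_size, k_and_v, num_kv_heads, head_dim).
example_params = [
    Param("tokens", "s32", (8,)),
    Param("request_distribution", "s32", (3,)),
    Param("kv_cache_0", "bf16", (64, 16, 2, 8, 128)),
    Param("kv_cache_1", "bf16", (64, 16, 2, 8, 128)),
]

request_distribution_index = find_param_index(example_params, "request_distribution")
layer_count, slice_byte_size = derive_kv_cache_config(example_params)
```

Reading these values from the compiled program's arguments, rather than hard-coding them per model, is what lets a different model family such as Qwen3 be served without changing the cache setup by hand.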

PiperOrigin-RevId: 938946803