4 changes: 3 additions & 1 deletion vllm/engine/arg_utils.py
@@ -1394,7 +1394,9 @@ def create_engine_config(
         )

         model_config = self.create_model_config()
-        self.model = model_config.model
+        # if streaming from cloud storage, do not overwrite self.model with local dir
+        if not (is_cloud_storage(self.model) and self.load_format=="runai_streamer"):
Contributor (review comment, severity: high):

The condition to prevent overwriting self.model only checks for load_format=="runai_streamer". This is incomplete as it misses other runai_streamer variants like runai_streamer_sharded. If a user specifies load_format="runai_streamer_sharded" with a cloud storage model path, self.model will be incorrectly overwritten with a local directory, leading to the same failure this PR aims to fix. Using startswith("runai_streamer") would be more robust and cover all related formats.

Suggested change
if not (is_cloud_storage(self.model) and self.load_format=="runai_streamer"):
if not (is_cloud_storage(self.model) and self.load_format.startswith("runai_streamer")):
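
As a standalone sketch of the reviewer's point (the `is_cloud_storage` helper below is a hypothetical stand-in, not vLLM's actual implementation), an equality check matches only the plain `runai_streamer` format, while a prefix check also covers variants such as `runai_streamer_sharded`:

```python
# Sketch only: illustrates why a prefix check on load_format is more robust
# than strict equality. is_cloud_storage here is a hypothetical stand-in.

def is_cloud_storage(path: str) -> bool:
    # Treat common object-storage URI schemes as "cloud storage".
    return path.startswith(("s3://", "gs://"))

def should_keep_remote_path(model: str, load_format: str) -> bool:
    # Keep the remote URI (i.e. skip overwriting self.model with a local
    # directory) when streaming from cloud storage with any runai_streamer
    # variant, including runai_streamer_sharded.
    return is_cloud_storage(model) and load_format.startswith("runai_streamer")

# Equality would reject the sharded variant; the prefix check accepts both.
assert should_keep_remote_path("s3://bucket/model", "runai_streamer")
assert should_keep_remote_path("s3://bucket/model", "runai_streamer_sharded")
assert not should_keep_remote_path("/local/model-dir", "runai_streamer")
```

With strict equality, the `runai_streamer_sharded` case would fall through and `self.model` would be overwritten with the local directory, reproducing the failure this PR fixes.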

Author:

Are there other modes we need to consider as well, like the tokenizer? I'm not sure which load_formats actually skip downloading tensors and use streaming.

+            self.model = model_config.model
         self.tokenizer = model_config.tokenizer

         self._check_feature_supported(model_config)