19 changes: 18 additions & 1 deletion vllm/model_executor/layers/quantization/modelopt.py
@@ -187,7 +187,24 @@ def get_quant_method(

     def apply_vllm_mapper(self, hf_to_vllm_mapper: "WeightsMapper"):
         if len(self.exclude_modules) > 0:
-            self.exclude_modules = hf_to_vllm_mapper.apply_list(self.exclude_modules)
+            # This is a workaround for the weights remapping issue:
+            # https://github.com/vllm-project/vllm/issues/28072
+            # Right now, the Nvidia ModelOpt library uses just one wildcard pattern:
+            #     module_path*
+            # It gets applied if the whole tree of modules rooted at module_path
+            # is not quantized. Here we replace such a pattern with 2 patterns
+            # that are collectively equivalent to the original pattern:
+            #     module_path
+            #     module_path.*
+            new_exclude_modules = []
+            for exclude in self.exclude_modules:
+                if len(exclude) >= 2 and exclude[-1] == "*" and exclude[-2] != ".":
+                    new_exclude_modules.append(exclude[:-1])
+                    new_exclude_modules.append(exclude[:-1] + ".*")
Comment on lines +199 to +203

P1: Workaround expands exclusions beyond intended prefix

The new loop replaces each module_path* entry with plain module_path/module_path.* before mapping. In is_layer_excluded the legacy substring branch checks exclude_module in prefix, so these plain entries now match any module name containing the substring rather than only those starting with the ModelOpt wildcard. With configs that blacklist branches such as vision_model*, prefixes like encoder.vision_model_adapter (which would not have matched the original wildcard) will now be treated as excluded, leaving unrelated layers unquantized and reducing performance. This over‑exclusion did not occur before the trailing * was stripped.
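
A minimal standalone sketch of the over-match described above. The check below is a simplified stand-in for the legacy substring branch (not the actual is_layer_excluded implementation), and the module names are hypothetical:

```python
# Simplified stand-in for the legacy substring branch: an exclusion entry
# matches whenever it occurs anywhere inside the module prefix.
def is_excluded_substring(prefix: str, exclude_modules: list[str]) -> bool:
    return any(exclude in prefix for exclude in exclude_modules)

# After the workaround strips the trailing "*" from "vision_model*".
expanded = ["vision_model", "vision_model.*"]

# Hypothetical prefix that merely contains "vision_model" as a substring.
prefix = "encoder.vision_model_adapter"

print(is_excluded_substring(prefix, expanded))  # True: treated as excluded
print(prefix.startswith("vision_model"))        # False: outside the prefix the
                                                # original wildcard was meant to cover
```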


+                else:
+                    new_exclude_modules.append(exclude)
Comment on lines +201 to +205
Severity: high

The logic to identify and transform the wildcard patterns is correct. However, using direct string indexing (exclude[-1], exclude[-2], exclude[:-1]) can be less readable and potentially more error-prone for future modifications compared to using built-in string methods. Refactoring this to use endswith() and removesuffix() would make the code's intent clearer and improve its maintainability.

Suggested change
-                if len(exclude) >= 2 and exclude[-1] == "*" and exclude[-2] != ".":
-                    new_exclude_modules.append(exclude[:-1])
-                    new_exclude_modules.append(exclude[:-1] + ".*")
-                else:
-                    new_exclude_modules.append(exclude)
+                if len(exclude) > 1 and exclude.endswith("*") and not exclude.endswith(".*"):
+                    base = exclude.removesuffix("*")
+                    new_exclude_modules.append(base)
+                    new_exclude_modules.append(f"{base}.*")
+                else:
+                    new_exclude_modules.append(exclude)
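
As a quick sanity check on the suggestion, a standalone sketch (with a hypothetical pattern list) showing that the two formulations produce identical output; removesuffix requires Python 3.9+:

```python
def expand_indexing(excludes: list[str]) -> list[str]:
    # Original formulation: direct string indexing and slicing.
    out: list[str] = []
    for exclude in excludes:
        if len(exclude) >= 2 and exclude[-1] == "*" and exclude[-2] != ".":
            out.append(exclude[:-1])
            out.append(exclude[:-1] + ".*")
        else:
            out.append(exclude)
    return out

def expand_string_methods(excludes: list[str]) -> list[str]:
    # Suggested formulation: endswith() / removesuffix().
    out: list[str] = []
    for exclude in excludes:
        if len(exclude) > 1 and exclude.endswith("*") and not exclude.endswith(".*"):
            base = exclude.removesuffix("*")
            out.append(base)
            out.append(f"{base}.*")
        else:
            out.append(exclude)
    return out

# Hypothetical patterns covering the interesting cases:
# bare wildcard, prefix wildcard, plain name, and an existing ".*" wildcard.
samples = ["*", "vision_model*", "lm_head", "mlp.gate.*"]
assert expand_indexing(samples) == expand_string_methods(samples)
print(expand_indexing(samples))
# ['*', 'vision_model', 'vision_model.*', 'lm_head', 'mlp.gate.*']
```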


+            self.exclude_modules = hf_to_vllm_mapper.apply_list(new_exclude_modules)

     @staticmethod
     def get_config_filenames() -> list[str]:
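
For context on the pattern expansion described in the code comment above, a minimal standalone sketch. It uses fnmatch purely as a stand-in for whatever matching vLLM/ModelOpt actually apply, and the module names are hypothetical; it shows that the expanded pair covers exactly the subtree the comment describes (the root module plus everything under it), while the raw trailing-asterisk wildcard also matches similarly named modules outside that subtree:

```python
from fnmatch import fnmatch

# Hypothetical module prefixes; only the first two belong to the tree
# rooted at "vision_model".
modules = [
    "vision_model",
    "vision_model.encoder.layers.0",
    "vision_model_adapter",  # similar name, but outside the subtree
    "lm_head",
]

original = "vision_model*"
expanded = ["vision_model", "vision_model.*"]

for name in modules:
    by_original = fnmatch(name, original)
    by_expanded = any(fnmatch(name, p) for p in expanded)
    print(name, by_original, by_expanded)

# Expected output:
#   vision_model True True
#   vision_model.encoder.layers.0 True True
#   vision_model_adapter True False
#   lm_head False False
```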