
support WOQ model input, such as kimi2.5 #1642

Draft
xin3he wants to merge 11 commits into main from xinhe/3-31a

Contributor

@xin3he xin3he commented Mar 31, 2026

Description

Please briefly describe your main changes, the motivation.

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Signed-off-by: Xin He <xin3.he@intel.com>
Copilot AI review requested due to automatic review settings March 31, 2026 03:39
Contributor

Copilot AI left a comment

Pull request overview

Adds initial support for WOQ (weight-only quantization) model inputs by extending the weight-type detection/conversion framework and adding a CPU test that exercises loading and converting a WOQ model to high precision.

Changes:

  • Introduces ModuleWeightType.WOQ and registers a new WOQHandler for detection/conversion in auto_round/utils/weight_handler.py.
  • Adds a new CPU test (test_w4a16) that loads a WOQ model and asserts WOQ detection + conversion behavior.
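The detection/conversion pattern summarized above can be illustrated with a minimal, framework-free sketch. Only the names `ModuleWeightType.WOQ` and `WOQHandler` come from this PR; the registry decorator, the `detect`/`convert` method names, the `HIGH_PRECISION` member, the `FakeQuantLinear` stand-in, and the attribute names `qweight`/`scales`/`zeros` are assumptions made for illustration and may not match the actual code in `auto_round/utils/weight_handler.py`.

```python
from enum import Enum, auto


class ModuleWeightType(Enum):
    HIGH_PRECISION = auto()  # hypothetical fallback member
    WOQ = auto()             # weight-only quantization, as named in the PR


# Hypothetical registry mapping a weight type to a handler instance.
_HANDLERS = {}


def register_handler(weight_type):
    """Class decorator that registers one handler instance per weight type."""
    def decorator(cls):
        _HANDLERS[weight_type] = cls()
        return cls
    return decorator


class FakeQuantLinear:
    """Stand-in for a WOQ layer: integer codes plus per-group scale and zero point."""

    def __init__(self, qweight, scales, zeros, group_size):
        self.qweight = qweight
        self.scales = scales
        self.zeros = zeros
        self.group_size = group_size


@register_handler(ModuleWeightType.WOQ)
class WOQHandler:
    def detect(self, module):
        # Assumed heuristic: a WOQ layer exposes these three attributes.
        return all(hasattr(module, name) for name in ("qweight", "scales", "zeros"))

    def convert(self, module):
        # Dequantize each code with its group's parameters: w = (q - zero) * scale.
        g = module.group_size
        return [
            (q - module.zeros[i // g]) * module.scales[i // g]
            for i, q in enumerate(module.qweight)
        ]


def detect_weight_type(module):
    """Return the first registered weight type whose handler recognizes the module."""
    for weight_type, handler in _HANDLERS.items():
        if handler.detect(module):
            return weight_type
    return ModuleWeightType.HIGH_PRECISION


layer = FakeQuantLinear(qweight=[0, 15, 8, 8], scales=[0.5, 2.0], zeros=[8, 8], group_size=2)
weight_type = detect_weight_type(layer)
high_precision = _HANDLERS[weight_type].convert(layer)
```

Keeping detection and conversion on the same handler means a new input format only needs one registered class, which is roughly the shape the review summary describes.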

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| auto_round/utils/weight_handler.py | Adds WOQ enum value and a new WOQHandler intended to detect/convert WOQ quantized layers. |
| test/test_cpu/advanced/test_low_precision_input_model.py | Adds a new test case for a WOQ (w4a16) model path and validates detection/conversion. |
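For context on the "w4a16" label (4-bit weights, 16-bit activations), the sketch below shows one common way eight 4-bit codes can be stored in a single 32-bit word and recovered again. The helper names `pack_int4` and `unpack_int4` and the lowest-nibble-first ordering are assumptions for illustration; the packing layout used by the checkpoint in this PR may differ.

```python
def pack_int4(values):
    """Pack 4-bit codes (0..15) into 32-bit words, lowest nibble first (assumed order)."""
    if len(values) % 8 != 0:
        raise ValueError("number of values must be a multiple of 8")
    words = []
    for start in range(0, len(values), 8):
        word = 0
        for offset, value in enumerate(values[start:start + 8]):
            word |= (value & 0xF) << (4 * offset)
        words.append(word)
    return words


def unpack_int4(words):
    """Inverse of pack_int4: split each 32-bit word into eight 4-bit codes."""
    values = []
    for word in words:
        for shift in range(0, 32, 4):
            values.append((word >> shift) & 0xF)
    return values


codes = [1, 2, 3, 4, 5, 6, 7, 8]
packed = pack_int4(codes)
restored = unpack_int4(packed)
```

A conversion to high precision would unpack codes like this first and then apply the per-group scale and zero point.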

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@xin3he xin3he marked this pull request as draft April 2, 2026 03:00
xin3he and others added 9 commits April 2, 2026 05:12
Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
…h failure (#1621)

Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…hemes in utils and tests (#1643)

Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
… during module conversion

Signed-off-by: Xin He <xin3.he@intel.com>
3 participants