support compressed-tensors refactor#1595

Merged
XuehaoSun merged 10 commits into main from xinhe/3-20a
Mar 24, 2026

Conversation

@xin3he
Contributor

@xin3he xin3he commented Mar 23, 2026

Description

compressed_tensors PR vllm-project/compressed-tensors#610 (commit 927f6d5, Mar 17 2026) removed CompressedLinear as a functional class. The class is now a stub that raises ValueError from from_linear().

Models loaded with the new compressed_tensors produce regular torch.nn.Linear modules with quantization_scheme/quantization_status attributes instead of CompressedLinear wrappers.
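
The new behavior can be sketched with a minimal, duck-typed detection helper (hypothetical code, not AutoRound's actual implementation; in practice the tagged modules are torch.nn.Linear instances):

```python
# Minimal sketch: with the new compressed_tensors, a quantized layer is an
# ordinary Linear module carrying `quantization_scheme` / `quantization_status`
# attributes instead of being a CompressedLinear subclass.
def is_new_style_quantized(module) -> bool:
    """Duck-typed check: any module tagged with a quantization_scheme."""
    return getattr(module, "quantization_scheme", None) is not None

def quantized_layers(model):
    """Collect quantized submodules of a loaded model, keyed by name."""
    return {
        name: mod
        for name, mod in model.named_modules()
        if is_new_style_quantized(mod)
    }
```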

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #1578

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Signed-off-by: Xin He <xin3.he@intel.com>
Copilot AI review requested due to automatic review settings March 23, 2026 03:34
@xin3he xin3he requested review from Copilot and yiliu30 and removed request for Copilot March 23, 2026 03:34
Signed-off-by: Xin He <xin3.he@intel.com>
Contributor

Copilot AI left a comment


Pull request overview

Updates AutoRound’s compressed_tensors integration to handle the upstream removal of functional CompressedLinear wrappers by detecting/processing quantized torch.nn.Linear modules that carry quantization_scheme / quantization_status metadata.

Changes:

  • Updated weight-type handlers to detect quantized modules via quantization_scheme (new compressed_tensors behavior) in addition to legacy CompressedLinear/compressor logic.
  • Added decompress_module(...) handling for new-style quantized Linear modules in FP8 and NVFP4 conversions.
  • Re-enabled/adjusted CPU tests to validate FP8/MXFP4 models without relying on CompressedLinear type checks (NVFP4 still skipped due to upstream issue).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

Files changed:

  • auto_round/utils/weight_handler.py: Extends detection/conversion to support new compressed_tensors quantized Linear modules.
  • test/test_cpu/advanced/test_low_precision_input_model.py: Updates assertions to match new model module types/attributes and unskips FP8/MXFP4 coverage.

@chensuyue chensuyue added this to the 0.12.0 milestone Mar 23, 2026
Signed-off-by: Xin He <xin3.he@intel.com>
@yiliu30
Contributor

yiliu30 commented Mar 24, 2026

Code review

Found 1 issue:

  1. FP8Handler.detect_layer uses .data_type (lines 497, 499) to access quantization scheme fields, but compressed_tensors QuantizationArgs only has a .type field. The other three handlers (MXFP4, MXFP8, NVFP4) in this same PR correctly use .type. This will raise AttributeError at runtime when detecting FP8-BLOCK models via the new quantization_scheme path.

    q_scheme.weights.num_bits == 8
    and "float" in q_scheme.weights.data_type
    and q_scheme.input_activations.num_bits == 8
    and "float" in q_scheme.input_activations.data_type
):
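
The corrected check per this comment would read `.type` instead of `.data_type`. A sketch with a dummy stand-in for the scheme object (SimpleNamespace substitutes for the real compressed_tensors QuantizationArgs class):

```python
from types import SimpleNamespace

# Sketch of the fixed FP8 detection: QuantizationArgs exposes `.type`,
# not `.data_type`. str() hedges against the field being an enum.
def is_fp8_scheme(q_scheme) -> bool:
    return (
        q_scheme.weights.num_bits == 8
        and "float" in str(q_scheme.weights.type)
        and q_scheme.input_activations.num_bits == 8
        and "float" in str(q_scheme.input_activations.type)
    )

fp8 = SimpleNamespace(num_bits=8, type="float")
int4 = SimpleNamespace(num_bits=4, type="int")
print(is_fp8_scheme(SimpleNamespace(weights=fp8, input_activations=fp8)))   # True
print(is_fp8_scheme(SimpleNamespace(weights=int4, input_activations=fp8)))  # False
```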

Also, is the new quantization_scheme-based detection compatible with the older compressed_tensors versions that still use CompressedLinear? Want to make sure we don't break backward compatibility.

Otherwise the approach looks good -- approve once the above is addressed.
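
On the backward-compatibility question, one possible shape for a version-tolerant check (a hypothetical helper, assuming the legacy import path compressed_tensors.linear.compressed_linear; not code from this PR):

```python
def is_compressed_tensors_quantized(module) -> bool:
    # Legacy path: older compressed_tensors wraps layers in CompressedLinear.
    try:
        from compressed_tensors.linear.compressed_linear import CompressedLinear
        if isinstance(module, CompressedLinear):
            return True
    except ImportError:
        pass
    # New path: a plain Linear tagged with a quantization_scheme attribute.
    return getattr(module, "quantization_scheme", None) is not None
```

The guarded import keeps the helper working whether or not CompressedLinear is importable, so both old and new compressed_tensors releases take the same code path.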

Contributor

@yiliu30 yiliu30 left a comment


Approach looks good. One bug to fix (.data_type -> .type in FP8Handler) and a question about backward compat with older compressed_tensors versions -- see comment.

xin3he added 3 commits March 24, 2026 10:04
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
@chensuyue
Contributor

Remove the version limit here to test your code:

compressed-tensors==0.14.1a20260313 # temporary pin for llmcompressor

@xin3he
Contributor Author

xin3he commented Mar 24, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xin3he
Contributor Author

xin3he commented Mar 24, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xin3he
Contributor Author

xin3he commented Mar 24, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@XuehaoSun XuehaoSun merged commit 19b5237 into main Mar 24, 2026
38 of 40 checks passed
@XuehaoSun XuehaoSun deleted the xinhe/3-20a branch March 24, 2026 12:13