fix autoscheme accuracy drop bug w/o low_gpu, add CI test #1658

Open
WeiweiZhang1 wants to merge 3 commits into main from fix_autoscheme_acc_drop

@WeiweiZhang1
Contributor

Description

Bug fix for #1654.

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Signed-off-by: WeiweiZhang1 <weiwei1.zhang@intel.com>
Copilot AI review requested due to automatic review settings April 3, 2026 10:06
Signed-off-by: WeiweiZhang1 <weiwei1.zhang@intel.com>
Contributor

Copilot AI left a comment

Pull request overview

Fixes an AutoScheme scoring regression that degrades accuracy when running with low_gpu_mem_usage=False, and adds a CUDA test intended to keep the regression from recurring.

Changes:

  • Enable gradient-based scoring (grad_mode=True) for AutoScheme wrapper layers in the non-low_gpu_mem_usage scoring path.
  • Add a CUDA regression test comparing accuracy between low_gpu_mem_usage=True and False for mixed-bit AutoScheme.
  • Add a CLI warning about potential issues using --enable_torch_compile with AutoScheme.
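To illustrate why gradient-based scoring matters for the first change, here is a minimal sketch of a first-order delta-loss score. It is not the code in auto_round/auto_scheme/delta_loss.py: the function names (`quantize_rtn`, `delta_loss_score`), the round-to-nearest quantizer, and the toy weights and gradients are all assumptions chosen for illustration. The point is only that when gradients are never collected, every candidate bit-width gets the same score and the scheme search has nothing to rank.

```python
def quantize_rtn(weights, bits):
    """Symmetric round-to-nearest quantization of a list of floats (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Clamp each integer code to the signed range [-qmax - 1, qmax], then dequantize.
    return [max(-qmax - 1, min(qmax, round(w / scale))) * scale for w in weights]


def delta_loss_score(weights, grads, bits):
    """First-order estimate of the loss change caused by quantizing `weights`.

    Uses sum(|g * (w_q - w)|). If no gradients were collected (grads is None),
    the score collapses to 0.0 for every bit-width.
    """
    if grads is None:
        return 0.0
    w_q = quantize_rtn(weights, bits)
    return sum(abs(g * (q - w)) for g, q, w in zip(grads, w_q, weights))


# Toy layer: values are made up for illustration only.
layer_weights = [0.5, -0.3, 0.9]
layer_grads = [1.0, 2.0, 0.5]

score_2bit = delta_loss_score(layer_weights, layer_grads, bits=2)
score_8bit = delta_loss_score(layer_weights, layer_grads, bits=8)
score_no_grad = delta_loss_score(layer_weights, None, bits=2)
```

With gradients available, the 2-bit candidate scores higher (more estimated damage) than the 8-bit one; with `grads=None` the two are indistinguishable.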

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| auto_round/auto_scheme/delta_loss.py | Restores gradient-driven scoring behavior when low_gpu_mem_usage=False by enabling wrapper grad_mode. |
| test/test_cuda/algorithms/test_auto_scheme.py | Adds a regression test to compare accuracy parity across low_gpu_mem_usage modes. |
| auto_round/__main__.py | Warns users about potential torch.compile issues when using AutoScheme. |
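The CLI warning described for auto_round/__main__.py could look roughly like the sketch below. This is an assumption about the shape of such a check, not the code in the PR: the helper name `warn_if_compile_with_auto_scheme` and its two boolean parameters are hypothetical, and the message text is a paraphrase.

```python
import warnings


def warn_if_compile_with_auto_scheme(enable_torch_compile: bool, uses_auto_scheme: bool) -> bool:
    """Warn when torch.compile is requested together with AutoScheme.

    Hypothetical helper: returns True if a warning was emitted, False otherwise.
    """
    if enable_torch_compile and uses_auto_scheme:
        warnings.warn(
            "--enable_torch_compile may cause issues when used with AutoScheme.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

Returning a flag keeps the check easy to unit test without parsing log output.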

Contributor

@wenhuach21 wenhuach21 left a comment

Nice catch, thanks

@wenhuach21 wenhuach21 changed the title from "fix autoscheme accuracy drop bug, add CI test" to "fix autoscheme accuracy drop bug w/o low_gpu, add CI test" on Apr 3, 2026
@wenhuach21
Contributor

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

**common_kwargs,
)
model_baseline, _ = ar_baseline.quantize()
acc_baseline = evaluate_accuracy(model_baseline, ar_baseline.tokenizer, task="piqa", limit=200)
Contributor

This unit test is not robust if we introduce bugs that affect both cases. It’s better to use a reference float value and verify both results against it, ensuring that the accuracy meets or exceeds this threshold.
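A minimal sketch of the difference between the two checks, under stated assumptions: the helper names (`assert_parity`, `assert_meets_reference`), the reference value 0.70, and the accuracy numbers are placeholders for illustration, not measured results or code from this PR.

```python
def assert_parity(acc_a: float, acc_b: float, max_gap: float) -> None:
    """Parity-only check: passes whenever the two accuracies are close to each other."""
    assert abs(acc_a - acc_b) <= max_gap, f"accuracy gap {abs(acc_a - acc_b):.4f} exceeds {max_gap}"


def assert_meets_reference(accuracies: dict, reference: float, tolerance: float = 0.0) -> None:
    """Threshold check: every configuration must reach the reference accuracy."""
    failures = {name: acc for name, acc in accuracies.items() if acc < reference - tolerance}
    assert not failures, f"below reference {reference}: {failures}"


# Placeholder numbers, not measured results.
REFERENCE_ACC = 0.70
results = {"low_gpu_mem_usage=True": 0.71, "low_gpu_mem_usage=False": 0.72}
assert_meets_reference(results, reference=REFERENCE_ACC)
```

If a future bug lowered both modes to, say, 0.50 and 0.51, `assert_parity` would still pass, while `assert_meets_reference` would fail for both entries.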

3 participants