feat: add quantization support to benchmarking script by Lothnic · Pull Request #3 · Lothnic/vllmini

Lothnic · 2026-04-27T17:10:10Z

Summary by CodeRabbit

New Features
- Added optional --quantize/-q flag to benchmark tool for enabling 4-bit model quantization.
Documentation
- Updated dequantization formula presentation in README for improved clarity.

coderabbitai · 2026-04-27T17:10:23Z

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough


This PR introduces a quantize boolean flag to the benchmark entrypoint, enabling optional 4-bit NF4 quantization during model loading. The flag is exposed via CLI argument and propagated through the benchmark function. Additionally, an unused import is removed from the Qwen3 model module, and documentation formatting is simplified.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation Update `README.md` | Dequantisation Formula presentation changed from LaTeX math block to inline code styling with spacing adjustment. |
| Benchmark Enhancement `benchmark.py` | Added `quantize: bool` parameter to `benchmark()` function signature. Updated CLI parser with `--quantize/-q` argument and forwarded the flag to `load_hf_model()` for conditional model quantization. |
| Import Cleanup `models/qwen3.py` | Removed unused `Attention as LlamaAttention` import from `models.attention` module. |
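The flag plumbing described for `benchmark.py` can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the body and signature of `load_hf_model()` are not shown here, so the stub below is a hypothetical stand-in, and the `--model` option and `main()` helper are assumptions added only to make the example self-contained.

```python
import argparse


def load_hf_model(model_name: str, quantize: bool = False) -> dict:
    """Hypothetical stand-in for the project's loader (real signature not shown in the PR)."""
    # A real loader would build the model here and, when `quantize` is set,
    # apply 4-bit NF4 quantization to its weights.
    return {"model": model_name, "quantized": quantize}


def benchmark(model_name: str, quantize: bool = False) -> dict:
    """Forward the quantize flag to the loader; timing logic is omitted in this sketch."""
    return load_hf_model(model_name, quantize=quantize)


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser exposing the optional --quantize/-q switch."""
    parser = argparse.ArgumentParser(description="Benchmark sketch")
    parser.add_argument("--model", default="example-model")  # hypothetical option
    parser.add_argument(
        "-q", "--quantize",
        action="store_true",  # defaults to False when the flag is absent
        help="enable 4-bit quantization when loading the model",
    )
    return parser


def main(argv=None) -> dict:
    """Parse arguments and run the benchmark with the requested setting."""
    args = build_parser().parse_args(argv)
    return benchmark(args.model, quantize=args.quantize)
```

Using `action="store_true"` keeps quantization opt-in, so existing invocations without `-q` behave as before.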

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

Lothnic/vllmini#2 (Feature/quantisation): Adds matching `--quantize/-q` CLI argument and `quantize: bool` parameter to the benchmark function for conditional model quantization.

Poem

🐰 A flag to quantize, a choice so neat,
Compress the weights, make models lean and sweet,
Through CLI pipes the option flows,
Four bits of wisdom, efficiency grows! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
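To see where a figure like 0.00% comes from, docstring coverage can be approximated with the standard `ast` module. The sketch below is an assumption about how such a metric might be computed, not the check's actual implementation; the real tool may count modules or skip private names, so its numbers can differ.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Return the percentage of functions and classes in `source` that have a docstring."""
    tree = ast.parse(source)
    # Collect every definition that could carry a docstring.
    nodes = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not nodes:
        return 100.0  # nothing to document
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return 100.0 * documented / len(nodes)
```

Running this over each changed file and comparing the result against the 80% threshold gives a quick local estimate before pushing.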

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add quantization support to benchmarking script' accurately describes the main change in the pull request: adding a quantize flag to the benchmark.py script with conditional 4-bit NF4 quantization support. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


feat: add quantization support to benchmarking script

5bb767f

Lothnic merged commit cf87818 into main Apr 27, 2026
1 of 2 checks passed

Lothnic merged 1 commit into main from feature/benchmark-quant
