Speed up clippy by not calling empty `check_foo` methods. by nnethercote · Pull Request #157762 · rust-lang/rust

nnethercote · 2026-06-11T11:38:35Z

Most clippy lints are put into RuntimeCombinedLateLintPass. It's defined via a macro which expands out to something like this:

impl<'tcx> LateLintPass<'tcx> for RuntimeCombinedLateLintPass<'tcx> {
    fn check_item(&mut self, context: &LateContext<'tcx>, i: &'tcx Item<'tcx>) {
        for pass in self.passes.iter_mut() {
            pass.check_item(context, i);
        }
    }

    ... // and another 31 similar methods
}

A function like the above check_item is called (via dynamic dispatch) for every HIR node. In a default Clippy run there are 230 late lints enabled, which means the loop executes 230 times, each time doing a call.

However, most lints implement only a small fraction of the 32 checking methods, and many implement just one. As a result, most of the calls in the loop are to functions that do nothing.
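To make the cost concrete, here is a minimal sketch of the single-list design described above. Everything in it is a simplified stand-in: `Item`, `Expr`, `ItemLint`, `ExprLint`, the `dyn_calls` counter, and `baseline_call_counts` are hypothetical names for illustration, and the trait has two check methods rather than 32. It is not the rustc code, only a model of the dispatch pattern.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical stand-ins for HIR nodes.
struct Item;
struct Expr;

/// Cut-down lint pass trait: every check method defaults to doing nothing.
trait LintPass {
    fn check_item(&mut self, _item: &Item) {}
    fn check_expr(&mut self, _expr: &Expr) {}
}

/// A lint that only overrides `check_item`.
struct ItemLint {
    hits: Rc<Cell<usize>>,
}
impl LintPass for ItemLint {
    fn check_item(&mut self, _item: &Item) {
        self.hits.set(self.hits.get() + 1);
    }
}

/// A lint that only overrides `check_expr`.
struct ExprLint {
    hits: Rc<Cell<usize>>,
}
impl LintPass for ExprLint {
    fn check_expr(&mut self, _expr: &Expr) {
        self.hits.set(self.hits.get() + 1);
    }
}

/// Single list of passes: every check method walks the whole list.
struct CombinedPass {
    passes: Vec<Box<dyn LintPass>>,
    dyn_calls: usize, // instrumentation for this sketch only
}

impl LintPass for CombinedPass {
    fn check_item(&mut self, item: &Item) {
        for pass in self.passes.iter_mut() {
            self.dyn_calls += 1;
            pass.check_item(item); // a no-op for every `ExprLint`
        }
    }
    fn check_expr(&mut self, expr: &Expr) {
        for pass in self.passes.iter_mut() {
            self.dyn_calls += 1;
            pass.check_expr(expr); // a no-op for every `ItemLint`
        }
    }
}

/// Visits `items` items and returns (dynamic calls made, calls that did work).
fn baseline_call_counts(item_lints: usize, expr_lints: usize, items: usize) -> (usize, usize) {
    let hits = Rc::new(Cell::new(0));
    let mut passes: Vec<Box<dyn LintPass>> = Vec::new();
    for _ in 0..item_lints {
        passes.push(Box::new(ItemLint { hits: Rc::clone(&hits) }));
    }
    for _ in 0..expr_lints {
        passes.push(Box::new(ExprLint { hits: Rc::clone(&hits) }));
    }
    let mut combined = CombinedPass { passes, dyn_calls: 0 };
    for _ in 0..items {
        combined.check_item(&Item);
    }
    (combined.dyn_calls, hits.get())
}
```

With one item lint and nine expression lints, `baseline_call_counts(1, 9, 100)` makes ten dynamic calls per item, of which only one does any work.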

(It's a similar story for early lints, but there are only ~50 of them.)

This commit changes things so that the unnecessary calls are avoided.

- First, RuntimeCombined{Early,Late}LintPass are changed so that instead of having a single list of passes, each has one list per check_foo method.
- Then, only passes that implement check_foo get added to the check_foo list. This means we avoid calling empty check_foo methods.
- This requires knowing which of the empty default check_foo methods are overridden. This commit adds a check_foo_needed method that returns a boolean to indicate this; each one defaults to false.
- Writing check_foo_needed methods by hand would be error-prone, so the commit adds a new attribute proc macro #[runtime_lint_pass] which detects when a check_foo is defined for a pass and adds the corresponding check_foo_needed method.
- Also, because each pass can end up on multiple lists within RuntimeCombined{Early,Late}LintPass, {Early,Late}LintPassObject is changed from using Box to using Rc<RefCell>. RefCell has a small overhead but this is dwarfed by the wins from avoiding empty check_foo calls.
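The steps above can be sketched roughly as follows. This is an illustrative model under stated assumptions, not the code from the PR: the trait has two check methods instead of 32, the `check_*_needed` overrides are written by hand where the PR has `#[runtime_lint_pass]` generate them, and `PassObject`, `ItemLint`, `ExprLint`, `dyn_calls`, and `filtered_call_counts` are hypothetical names.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Hypothetical stand-ins for HIR nodes.
struct Item;
struct Expr;

trait LintPass {
    fn check_item(&mut self, _item: &Item) {}
    fn check_expr(&mut self, _expr: &Expr) {}

    // One query per check method; `false` means "not overridden".
    fn check_item_needed(&self) -> bool { false }
    fn check_expr_needed(&self) -> bool { false }
}

struct ItemLint {
    hits: Rc<Cell<usize>>,
}
impl LintPass for ItemLint {
    fn check_item(&mut self, _item: &Item) {
        self.hits.set(self.hits.get() + 1);
    }
    // Hand-written here; in the PR the attribute macro adds this.
    fn check_item_needed(&self) -> bool { true }
}

struct ExprLint {
    hits: Rc<Cell<usize>>,
}
impl LintPass for ExprLint {
    fn check_expr(&mut self, _expr: &Expr) {
        self.hits.set(self.hits.get() + 1);
    }
    fn check_expr_needed(&self) -> bool { true }
}

/// Shared ownership, because one pass may sit on several lists.
type PassObject = Rc<RefCell<dyn LintPass>>;

/// One list per check method instead of a single list of passes.
#[derive(Default)]
struct CombinedPass {
    check_item_passes: Vec<PassObject>,
    check_expr_passes: Vec<PassObject>,
    dyn_calls: usize, // instrumentation for this sketch only
}

impl CombinedPass {
    fn new(passes: Vec<PassObject>) -> Self {
        let mut combined = CombinedPass::default();
        for pass in passes {
            let needs_item = pass.borrow().check_item_needed();
            let needs_expr = pass.borrow().check_expr_needed();
            if needs_item {
                combined.check_item_passes.push(Rc::clone(&pass));
            }
            if needs_expr {
                combined.check_expr_passes.push(Rc::clone(&pass));
            }
        }
        combined
    }
}

impl LintPass for CombinedPass {
    fn check_item(&mut self, item: &Item) {
        for pass in &self.check_item_passes {
            self.dyn_calls += 1;
            pass.borrow_mut().check_item(item);
        }
    }
    fn check_expr(&mut self, expr: &Expr) {
        for pass in &self.check_expr_passes {
            self.dyn_calls += 1;
            pass.borrow_mut().check_expr(expr);
        }
    }
}

/// Visits `items` items and returns (dynamic calls made, calls that did work).
fn filtered_call_counts(item_lints: usize, expr_lints: usize, items: usize) -> (usize, usize) {
    let hits = Rc::new(Cell::new(0));
    let mut passes: Vec<PassObject> = Vec::new();
    for _ in 0..item_lints {
        let pass: PassObject = Rc::new(RefCell::new(ItemLint { hits: Rc::clone(&hits) }));
        passes.push(pass);
    }
    for _ in 0..expr_lints {
        let pass: PassObject = Rc::new(RefCell::new(ExprLint { hits: Rc::clone(&hits) }));
        passes.push(pass);
    }
    let mut combined = CombinedPass::new(passes);
    for _ in 0..items {
        combined.check_item(&Item);
    }
    (combined.dyn_calls, hits.get())
}
```

In this model, `filtered_call_counts(1, 9, 100)` makes one dynamic call per item, and every call does real work. A pass that overrode both methods would be pushed onto both lists, which is why the sketch uses `Rc<RefCell<..>>` rather than `Box`.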

It's a lot of plumbing changes, but it results in some huge wins: in the best cases clippy's runtime is reduced by 30%.

Currently the use points need to handle method attributes, due to a single low-value doc comment in each macro's body. This commit moves those doc comments so the use points can be simplified. The comments are also made more accurate -- there are now multiple `_post` methods and the comments now cover all of them, not just one of them.


nnethercote · 2026-06-12T10:03:09Z

clippy doesn't run on rustc-perf on CI, but I did some local measurements. First, instruction counts are great, with the best results at -15%!

nnethercote · 2026-06-12T10:04:29Z

Next, wall-time, where the results are even better. wall-time is pretty noisy on my machine so you can't trust any single number too much, but the trend is clear, with multiple results exceeding -30%:

nnethercote · 2026-06-12T10:06:35Z

Finally, why is wall-time so much better than instruction counts? There must be something microarchitectural happening. And there is: here are the branch-misses (misprediction) results, with a reduction of 97% in the best case!

This is because the PR avoids so many dynamically dispatched method calls, which involve indirect branches for the vtable lookups.

nnethercote · 2026-06-12T10:10:09Z

cc @blyxyas @ada4a

nnethercote · 2026-06-12T10:10:39Z

Non-clippy runs shouldn't have their perf affected. Let's check:

@bors try @rust-timer queue

nnethercote · 2026-06-12T10:13:25Z

This PR currently overlaps with #157689. My plan is for that PR to merge first, doing all the preliminary cleanups, leaving just the final (and main) commit in this PR.

rust-bors · 2026-06-12T12:21:58Z

☀️ Try build successful (CI)
Build commit: fa5d275 (fa5d275fc9625c4ca4b4a39e6b52b91e932cae49, parent: 09a371361240e42b0d69438fd1179efcf212e576)

rust-timer · 2026-06-12T13:28:58Z

Finished benchmarking commit (fa5d275): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking means the PR may be perf-sensitive. Consider adding rollup=never if this change is not fit for rolling up.

@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (primary -0.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.8% | [-0.8%, -0.8%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.8% | [-0.8%, -0.8%] | 1 |

Cycles

Results (primary -2.5%, secondary -3.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.5% | [-2.5%, -2.5%] | 1 |
| Improvements ✅ (secondary) | -3.0% | [-3.0%, -3.0%] | 1 |
| All ❌✅ (primary) | -2.5% | [-2.5%, -2.5%] | 1 |

Binary size

This perf run didn't have relevant results for this metric.

Bootstrap: 517.481s -> 517.738s (0.05%)
Artifact size: 401.31 MiB -> 400.84 MiB (-0.12%)

rust-bors · 2026-06-12T16:13:59Z

☔ The latest upstream changes (presumably #157779) made this pull request unmergeable. Please resolve the merge conflicts.

nnethercote added 5 commits June 11, 2026 20:30

Remove lifetime from RuntimeCombined{Early,Late}LintPass

daf9d46

These structs can own the `Vec`.

Filter lint passes in place

344824f

By using `retain` instead of `into_iter`/`filter`/`collect`.
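As a rough illustration of the difference this commit message describes, the sketch below filters a list both ways. The pass names and the `enabled` list are hypothetical placeholders, not the real rustc data structures; the point is only that `retain` edits the existing `Vec` in place, while the iterator chain builds a new one.

```rust
/// Filtering by consuming the vector and collecting a new one.
fn filter_by_collect(passes: Vec<&'static str>, enabled: &[&str]) -> Vec<&'static str> {
    passes.into_iter().filter(|p| enabled.contains(p)).collect()
}

/// Filtering in place: elements failing the predicate are dropped,
/// the rest keep their order, and no new vector is built.
fn filter_in_place(passes: &mut Vec<&'static str>, enabled: &[&str]) {
    passes.retain(|p| enabled.contains(p));
}
```

Both functions keep the same elements in the same order; the in-place version just avoids building a second vector.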

Remove an unnecessary 'static bound

1b33df7

Don't put HardwiredLints in lint passes

8e30bd7

It doesn't define any `check_*` methods so there's no point adding it to the passes list. (It's only there for `HardwiredLints::lint_vec`.) This makes it more like `SoftLints`.

Factor out repetitive code in register_internal

c6268ae

rustbot added the S-waiting-on-author, T-clippy, and T-compiler labels Jun 11, 2026

bjorn3 reviewed Jun 11, 2026

Comment thread compiler/rustc_lint/Cargo.toml Outdated

Rename {enter,exit}_where_predicate

325e054

All the other paired methods in this trait have the form `check_foo`/`check_foo_post`.

nnethercote force-pushed the runtime_lint_pass branch from a14a893 to 4b8ebe2 Compare June 12, 2026 05:59

nnethercote added 4 commits June 12, 2026 17:27

Put a box within {Early,Late}LintPassFactory.

ce32476

That reflects how they're mostly used and avoids the need for some long signatures. And it matches `{Early,Late}LintPassObject` nicely. Also make them public so that clippy can use them.

Streamline clippy's early/late lint list construction

8f4a59c

There's a lot of repetition here that can be avoided.

nnethercote force-pushed the runtime_lint_pass branch from 4b8ebe2 to f3d9ca4 Compare June 12, 2026 09:46

nnethercote changed the title ~~runtime_lint_pass~~ Speed up clippy by not calling empty check_foo methods. Jun 12, 2026

rustbot added the S-waiting-on-perf label Jun 12, 2026

rust-bors Bot pushed a commit that referenced this pull request Jun 12, 2026

Auto merge of #157762 - nnethercote:runtime_lint_pass, r=<try>

fa5d275

Speed up clippy by not calling empty `check_foo` methods.

nnethercote mentioned this pull request Jun 12, 2026

Lint cleanups #157689

Open

rustbot removed the S-waiting-on-perf label Jun 12, 2026

nnethercote wants to merge 10 commits into rust-lang:main from nnethercote:runtime_lint_pass

4 participants
