Skip to content

[Merged by Bors] - feat(Tactic): add #defeq_abuse tactic combinator#35750

Closed
kim-em wants to merge 31 commits intomasterfrom
kim/check-defeq-abuse
Closed

[Merged by Bors] - feat(Tactic): add #defeq_abuse tactic combinator#35750
kim-em wants to merge 31 commits intomasterfrom
kim/check-defeq-abuse

Conversation

@kim-em
Copy link
Contributor

@kim-em kim-em commented Feb 25, 2026

This PR adds #defeq_abuse in, a tactic/command combinator for diagnosing backward.isDefEq.respectTransparency failures. It runs the given tactic or command with the option both true and false, and when the strict setting fails, walks the trace tree to identify the specific isDefEq checks that are root causes.

Reusable trace-tree analysis utilities are extracted into Mathlib.Lean.MessageData.Trace:

  • TraceResult / traceResultOf — structured access to the ✅/❌/💥 status encoding
  • foldTraceNodes — generic fold over MessageData trace trees
  • stripTraceResultPrefix / dedupByString — helpers for trace content comparison

Companion lean4 PRs that will simplify this further:

See https://leanprover.zulipchat.com/#narrow/channel/113488-general/topic/backward.2EisDefEq.2ErespectTransparency

🤖 Prepared with Claude Code

kim-em and others added 2 commits February 25, 2026 04:43
`check_defeq_abuse tac` runs `tac` with
`backward.isDefEq.respectTransparency` both `true` and `false`. If the
tactic succeeds with `false` but fails with `true`, it identifies the
specific `isDefEq` checks that are the root causes of the failure.

This is a debugging tool for diagnosing where Mathlib relies on
definitional equality that isn't available at instance transparency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the t-meta Tactics, attributes or user commands label Feb 25, 2026
@github-actions
Copy link

github-actions bot commented Feb 25, 2026

PR summary d300915bf1

Import changes for modified files

No significant changes to the import graph

Import changes for all files
Files Import difference
There are 6696 files with changed transitive imports taking up over 297229 characters: this is too many to display!
You can run scripts/import_trans_difference.sh all locally to see the whole output.

Declarations diff

+ A
+ B
+ Gr'
+ MyAction
+ MyPred
+ MySub₂
+ MySub₂.toAddSubgroup
+ Num'
+ Sm'
+ TraceResult
+ VisitStep
+ Zo'
+ analyzeTraces
+ b
+ dedupByString
+ extractInstName
+ instance : Gr' Int
+ instance : Membership ℕ (MyPred ℕ) where mem s a := s a
+ instance : Singleton ℕ (MyPred ℕ) where singleton a x := x = a
+ instance : Zo' Int
+ instance {G : Type} [AddCommGroup G] : CoeSort (MySub₂ G) Type
+ instance {n} : Num' Int n
+ instance {α} [Zo' α] : Num' α 0
+ instance {α} [h : Gr' α] : Sm' α := { h with }
+ myOp
+ myPredBoundedOrder
+ myPredCompleteLattice
+ myPredDistribLattice
+ myPred_disjoint_singleton_right
+ mySub₂AddCommGroup
+ mySub₂AddCommMonoid
+ mySub₂MyAction
+ myTestFun
+ onlyOnDefEqNodes
+ reportDefEqAbuse
+ stripTraceResultPrefix
+ testCoersion
+ testVirtualParent
+ testVirtualParentFixed
+ test_zo_cmd_abuse
+ traceResultOf
+ visitWithAndAscendM
+ visitWithM
+ zo_eq_iff
++ instance {V : Type} [AddCommGroup V] [Module ℝ V] {l : Submodule ℝ V} :

You can run this locally as follows
## summary with just the declaration names:
./scripts/declarations_diff.sh <optional_commit>

## more verbose report:
./scripts/declarations_diff.sh long <optional_commit>

The doc-module for scripts/declarations_diff.sh contains some details about this script.


Increase in tech debt: (relative, absolute) = (1.00, 0.00)
Current number Change Type
11957 1 backward.isDefEq

Current commit 93584a9453
Reference commit d300915bf1

You can run this locally as

./scripts/technical-debt-metrics.sh pr_summary
  • The relative value is the weighted sum of the differences with weight given by the inverse of the current value of the statistic.
  • The absolute value is the relative value divided by the total sum of the inverses of the current values (i.e. the weighted average of the differences).

kim-em and others added 14 commits February 25, 2026 05:00
- Filter trace analysis to Meta.isDefEq trace class (Name.isPrefixOf)
- Replace foldlM with for loops for better array mutation optimization
- Use Std.HashSet for dedup instead of linear Array.contains scan
- Rethrow internal exceptions (interrupts, heartbeat) instead of swallowing them
- Restore trace state before rethrowing in the tracing catch block

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a command-mode `#check_defeq_abuse` combinator that diagnoses
`backward.isDefEq.respectTransparency` failures in commands like
`instance` declarations, where type class synthesis failures occur
during elaboration rather than within a tactic. Groups isDefEq root
causes by synthesis application for structured output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…'s test

- Command mode syntax is now `#check_defeq_abuse in cmd` (with `in`)
- Tactic mode stays as `check_defeq_abuse tac` (no `#`, since `#` conflicts
  with Lean's `#check` parsing in tactic position)
- Add Weiyi's `Module.Free ℝ l` example as a command-mode test that
  actually detects synthesis failures

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…command)

Shorter names. The tactic form can't use `#` because Lean's parser
treats `#` as a command prefix even in tactic position.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use `elab` instead of `syntax` + `elab_rules` for the tactic form,
following the `#adaptation_note` pattern, so that `#` works in tactic
position.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…lures

The command mode was reporting all instance applications that fail with
strict transparency, including ones that would also fail with permissive
(e.g., `Submodule =?= AddSubgroup`, `↥l =?= ℝ`). Now we also trace the
permissive run and only report applications that succeed there but fail
with strict transparency — the actual transparency-related failures.

For Weiyi's `Module.Free ℝ l` example, this reduces the output from
8 instance applications to just 1 (`@Submodule.module`).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace custom `hasSubstr` with `String.contains`/`startsWith`
- Extract `foldTraces` to handle MessageData structural recursion
- Extract `extractInstName` (was duplicated in two places)
- Make all trace-walking helpers `private`
- Add comment explaining why string matching is necessary
  (TraceData has no structured success/failure field)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Combine the "check" and "trace" passes by enabling tracing from the
start. The old runAndCheck becomes runAndCapture, returning both the
result and the new messages generated during the run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add two self-contained synthetic test cases that exercise `#defeq_abuse`
without depending on library defeq abuses (which will be fixed upstream):

1. **Tactic mode (MyPred/Pi diamond):** A type alias `MyPred α := α → Prop`
   with a `DistribLattice` instance inheriting from Pi, a lemma elaborated
   with that instance, and a later `CompleteLattice` instance creating an
   instance diamond. The `rw` tactic fails under strict transparency because
   the lemma and goal use different paths to `PartialOrder`.

2. **Command mode (MySub₂/virtual parent):** A structure extending
   `AddSubmonoid` with a plain `def toAddSubgroup` (virtual parent pattern),
   creating a diamond between `AddCommMonoid` from `toAddSubmonoid`
   (projection) and from `toAddSubgroup` (opaque def).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
kim-em and others added 2 commits February 25, 2026 13:26
Find "transition points" (isDefEq checks that fail strict but succeed
permissive) instead of all leaf failures. This filters out false
positives from simp exploration. Also rename CheckDefEqAbuse → DefEqAbuse
and import from Tactic.Common.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…in #defeq_abuse

Reduce tactic-mode elaboration from 5 passes to 3 (matching command mode):
strict+trace, permissive+trace, final permissive run.

Improve transition-point detection to filter out false positives by also
collecting permissive failures. A check is only reported as a transition
point if it succeeds in the permissive trace AND does not also fail in the
permissive trace (which would indicate context-dependent, not
transparency-dependent, behavior).

Add test for the "fails regardless of transparency" branch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@SnirBroshi
Copy link
Collaborator

SnirBroshi commented Feb 25, 2026

Since this is specific to respectTransparency and not all defeq abuse, is it possible to rename the new tactic, in order to reserve #defeq_abuse for checking for defeq abuses everywhere?

Perhaps #defeq_abuse_implicits, since that setting is specifically about defeqs in implicit arguments?
Or #defeq_abuse +onlyImplicits in.

kim-em and others added 3 commits February 26, 2026 00:29
- Remove unnecessary `tk` from `getRef`/`%$tk`, use `logInfo`/`logWarning`
  instead of `logInfoAt tk`/`logWarningAt tk`
- Use `s.restore (restoreInfo := true)` to discard stale info trees from
  diagnostic trial runs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the three test cases that depend on current Mathlib defeq abuses
(Set.disjoint_singleton_right, Module.Free Submodule, sorriedResult/simp)
since these will break once the upstream fixes land.

Replace with module-docs showing the original Mathlib code that triggered
each issue, followed by self-contained test cases that reproduce the same
patterns. Add Matthew Jasper's Mathlib-free minimization of the sub_nonneg
zero instance mismatch. Add a test showing that `@[implicit_reducible]`
on the virtual parent def fixes the synthesis failure.

Imports reduced from 7 to 3.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Show that `@[implicit_reducible]` on the virtual parent def resolves the
synthesis failure, and that a `unif_hint` equating `Num'.fromNat` across
different instances resolves the zero mismatch during `rw`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kim-em kim-em changed the title feat(Tactic): add check_defeq_abuse tactic combinator feat(Tactic): add #defeq_abuse tactic combinator Feb 26, 2026
kim-em and others added 3 commits February 26, 2026 02:06
Align naming with lean4 PR #12698 (TraceResult in TraceData):
- TraceStatus → TraceResult
- traceStatusOf → traceResultOf (now returns Option TraceResult)
- stripTraceStatusPrefix → stripTraceResultPrefix
- Add .error case for bomb emoji (exceptions ≠ failures)
- Add source comments noting future simplifications from lean4#12698/#12699

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Without this, `theorem` proofs with `by` blocks are elaborated
asynchronously, so `elabCommand` returns before errors appear in
`messages`, causing false negatives.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kim-em kim-em marked this pull request as ready for review February 26, 2026 07:53
kim-em and others added 5 commits February 26, 2026 07:59
Add missing `import Mathlib.Init` to `Mathlib.Lean.MessageData.Trace`
and replace bare `sorry` with `test_sorry` axiom in test file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…35833)

* feat: refactor for readability

* chore: fix up

* chore: fix up more

* choreL remove stray comment
@thorimur
Copy link
Collaborator

For maintainers without context: this PR contributes a diagnostic tool from Kim & their Claude to assist with managing the breaking changes associated with backwards.isDefeq.respectTransparency.

I recommend reading the recent messages in the thread on Zulip for context, but in short:

This is not intended as a feature so much as an adaptation tool for immediate use. So, it’s being held to different standards—but with clear disclaimers to users about this difference-in-standards in the module docstrings and hovers (see the review PR #35843 if it has not been merged into this branch yet), so that it does get confused for a stable user-facing tool.

That difference makes this maintainer delegate quite unusual for a number of reasons:

  • the code has not been fully reviewed to work as expected
  • the code is not yet sufficiently maintainable
  • I myself have contributed a significant refactor to try to take one step towards making it more readable, but this has not been reviewed by someone else

Though, to balance this out for the circumstances, I have:

  • made sure there are disclaimers visible which make clear that this is subject to change and should not be relied on in any fashion
  • read through it and checked that it is safe to run at a basic level (it does not wipe your hard drive or obviously crash lean)
  • reviewed the documentation in the module with public declarations (Lean.MessageData.Trace) and ensured that most declarations are private

Given that it fulfills different obligations than an ordinary meta PR, has time-sensitive value (i.e. it is most useful immediately, while this breakage is still affecting Mathlib), and is flagged to users as experimental/subject to change (and therefore does not make the same promises about behavior or maintainability as ordinary meta code), I think—pending merging #35843 into this branch—that it's reasonable to:



maintainer delegate?

@github-actions
Copy link

🚀 Pull request has been placed on the maintainer queue by thorimur.

@mathlib-triage mathlib-triage bot added the maintainer-merge A reviewer has approved the changed; awaiting maintainer approval. label Feb 27, 2026
@thorimur
Copy link
Collaborator

Also, please remember to update the PR description and title before merging :)

Copy link
Contributor

@Vierkantor Vierkantor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cursory glance and this looks good, once the changes of #35843 are merged.

bors d+

@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Feb 27, 2026

✌️ kim-em can now approve this pull request. To approve and merge a pull request, simply reply with bors r+. More detailed instructions are available here.

@mathlib-triage mathlib-triage bot added delegated This pull request has been delegated to the PR author (or occasionally another non-maintainer). and removed maintainer-merge A reviewer has approved the changed; awaiting maintainer approval. labels Feb 27, 2026
* chore: warnings

* chore: docs, tidying, move declarations

* chore: make decls private by default

* chore: todo, remove unnecessary non-async
@kim-em
Copy link
Contributor Author

kim-em commented Mar 1, 2026

bors merge

@mathlib-triage mathlib-triage bot added the ready-to-merge This PR has been sent to bors. label Mar 1, 2026
mathlib-bors bot pushed a commit that referenced this pull request Mar 1, 2026
This PR adds `#defeq_abuse in`, a tactic/command combinator for diagnosing `backward.isDefEq.respectTransparency` failures. It runs the given tactic or command with the option both `true` and `false`, and when the strict setting fails, walks the trace tree to identify the specific `isDefEq` checks that are root causes.

Reusable trace-tree analysis utilities are extracted into `Mathlib.Lean.MessageData.Trace`:
- `TraceResult` / `traceResultOf` — structured access to the ✅/❌/💥 status encoding
- `foldTraceNodes` — generic fold over `MessageData` trace trees
- `stripTraceResultPrefix` / `dedupByString` — helpers for trace content comparison

Companion lean4 PRs that will simplify this further:
- leanprover/lean4#12698 — structured `TraceResult` field on `TraceData`
- leanprover/lean4#12699 — `Meta.synthInstance.apply` trace class

See https://leanprover.zulipchat.com/#narrow/channel/113488-general/topic/backward.2EisDefEq.2ErespectTransparency

🤖 Prepared with Claude Code

Co-authored-by: Thomas R. Murrills <68410468+thorimur@users.noreply.github.com>
github-merge-queue bot pushed a commit to leanprover/lean4 that referenced this pull request Mar 1, 2026
This PR gives the `generate` function's "apply @foo to Goal" trace nodes
their own trace sub-class `Meta.synthInstance.apply` instead of sharing
the parent `Meta.synthInstance` class.

This allows metaprograms that walk synthesis traces to distinguish
instance application attempts from other synthesis nodes by checking
`td.cls` rather than string-matching on the header text.

The new class is registered with `inherited := true`, so `set_option
trace.Meta.synthInstance true` continues to show these nodes.

Motivated by mathlib's `#defeq_abuse` diagnostic tactic
(leanprover-community/mathlib4#35750) which
currently checks `headerStr.contains "apply"` to identify these nodes.
See
https://leanprover.zulipchat.com/#narrow/channel/113488-general/topic/backward.2EisDefEq.2ErespectTransparency

🤖 Prepared with Claude Code

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Mar 1, 2026

Pull request successfully merged into master.

Build succeeded:

@mathlib-bors mathlib-bors bot changed the title feat(Tactic): add #defeq_abuse tactic combinator [Merged by Bors] - feat(Tactic): add #defeq_abuse tactic combinator Mar 1, 2026
@mathlib-bors mathlib-bors bot closed this Mar 1, 2026
@mathlib-bors mathlib-bors bot deleted the kim/check-defeq-abuse branch March 1, 2026 07:31
mathlib-bors bot pushed a commit that referenced this pull request Mar 2, 2026
This PR marks `Submodule.toAddSubgroup` as `@[reducible]`, matching the reducibility that a real structure parent projection would have (Lean marks structure parent projections `[reducible]`). This resolves `backward.isDefEq.respectTransparency` failures where Lean needs to see that `s.toAddSubgroup.toAddSubmonoid = s.toAddSubmonoid`.

The motivating example is:
```lean
instance {V : Type} [AddCommGroup V] [Module ℝ V] {l : Submodule ℝ V} :
    Module.Free ℝ l := Module.Free.of_divisionRing ℝ l
```
which fails with `backward.isDefEq.respectTransparency` because synthesis of `Module ℝ ↥l` needs `Submodule.module`, whose `AddCommMonoid` comes through `toAddSubgroup` (a virtual parent), while the `Module` prerequisite expects `AddCommMonoid` from `toAddSubmonoid` (a real projection). Lean must unfold `toAddSubgroup` to see these agree, but as a plain `def` it was `[semireducible]` and invisible at instance transparency.

See [#general > backward.isDefEq.respectTransparency](https://leanprover.zulipchat.com/#narrow/channel/113488-general/topic/backward.2EisDefEq.2ErespectTransparency/near/575748939) and the synthetic test case in #35750.
pfaffelh pushed a commit to pfaffelh/mathlib4 that referenced this pull request Mar 2, 2026
…ity#35750)

This PR adds `#defeq_abuse in`, a tactic/command combinator for diagnosing `backward.isDefEq.respectTransparency` failures. It runs the given tactic or command with the option both `true` and `false`, and when the strict setting fails, walks the trace tree to identify the specific `isDefEq` checks that are root causes.

Reusable trace-tree analysis utilities are extracted into `Mathlib.Lean.MessageData.Trace`:
- `TraceResult` / `traceResultOf` — structured access to the ✅/❌/💥 status encoding
- `foldTraceNodes` — generic fold over `MessageData` trace trees
- `stripTraceResultPrefix` / `dedupByString` — helpers for trace content comparison

Companion lean4 PRs that will simplify this further:
- leanprover/lean4#12698 — structured `TraceResult` field on `TraceData`
- leanprover/lean4#12699 — `Meta.synthInstance.apply` trace class

See https://leanprover.zulipchat.com/#narrow/channel/113488-general/topic/backward.2EisDefEq.2ErespectTransparency

🤖 Prepared with Claude Code

Co-authored-by: Thomas R. Murrills <68410468+thorimur@users.noreply.github.com>
pfaffelh pushed a commit to pfaffelh/mathlib4 that referenced this pull request Mar 2, 2026
…y#35761)

This PR marks `Submodule.toAddSubgroup` as `@[reducible]`, matching the reducibility that a real structure parent projection would have (Lean marks structure parent projections `[reducible]`). This resolves `backward.isDefEq.respectTransparency` failures where Lean needs to see that `s.toAddSubgroup.toAddSubmonoid = s.toAddSubmonoid`.

The motivating example is:
```lean
instance {V : Type} [AddCommGroup V] [Module ℝ V] {l : Submodule ℝ V} :
    Module.Free ℝ l := Module.Free.of_divisionRing ℝ l
```
which fails with `backward.isDefEq.respectTransparency` because synthesis of `Module ℝ ↥l` needs `Submodule.module`, whose `AddCommMonoid` comes through `toAddSubgroup` (a virtual parent), while the `Module` prerequisite expects `AddCommMonoid` from `toAddSubmonoid` (a real projection). Lean must unfold `toAddSubgroup` to see these agree, but as a plain `def` it was `[semireducible]` and invisible at instance transparency.

See [#general > backward.isDefEq.respectTransparency](https://leanprover.zulipchat.com/#narrow/channel/113488-general/topic/backward.2EisDefEq.2ErespectTransparency/near/575748939) and the synthetic test case in leanprover-community#35750.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

delegated This pull request has been delegated to the PR author (or occasionally another non-maintainer). ready-to-merge This PR has been sent to bors. t-meta Tactics, attributes or user commands

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants