Skip to content

Conversation

@afonso360
Copy link
Contributor

@afonso360 afonso360 commented Aug 18, 2023

👋 Hey,

This transform is wrong for the inputs in the attached testcase. Both of these instructions differ with their NaN behaviours.

It was introduced in #5546 and has been sort of unnoticed until #6843 caused some wasmtime testcode to fire this optimization rule and produce this counterexample.

@jameysharp has also a nice example of how the testcode now fires this rule (with #6843) where it hadn't before.

I have no idea how fuzzgen hasn't found this yet! It's a really simple input so it should have been found pretty much instantly.

I've been trying to look at why over the past week but haven't come to any conclusions yet. Right now I'm trying to look into the nan canonicalization pass and trying to check if it would mask this behavior.

CC: @alexcrichton

This transform is wrong for the inputs in the attached testcase.

It was introduced in bytecodealliance#5546 and has been sort of unnoticed until bytecodealliance#6843 caused
some wasmtime testcode to fire this optimization rule and produce this
counterexample.
@afonso360 afonso360 requested a review from a team as a code owner August 18, 2023 18:53
@afonso360 afonso360 requested review from abrown and removed request for a team August 18, 2023 18:53
@jameysharp
Copy link
Contributor

Comparing this existing test in cranelift/filetests/filetests/runtests/fmin-pseudo.clif:

; run: %fmin_p_f32(0x0.0, +NaN) == 0x0.0

against the new test in this PR:

; run: %a(0.0, NaN) == NaN

I wouldn't think that NaN canonicalization could lead to us mixing up these results.

Oh hey, I think the rule in question was only barely wrong. I think with the arguments swapped it might work:

(rule (simplify
       (select ty (fcmp _ (FloatCC.LessThan) x y) x y))
      (fmin_pseudo ty y x))

According to the Cranelift fmin_pseudo docs, the definition is fmin_pseudo(a, b) = (b < a) ? b : a. But we were rewriting (a < b) ? a : b to fmin_pseudo(a, b).

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah as some extra background on this @afonso360 emailed security@bytecodealliance.org with this issue since it ran a risk of affecting wasm which we'd typically consider a security vulnerability (e.g. executing guests incorrectly). I discussed this this morning with @fitzgen and @sunfishcode and we talked about this issue in depth and concluded that this doesn't actually affect wasm (as @afonso360 initially concluded but sought confirmation of too).

In investigating this though we reached the same conclusion as you @jameysharp which is that we could fix the rule by swapping the operands. We were, however, surprised by the definition of f{min,max}_pseudo in CLIF (which inherit their definitions from the simd proposal for wasm, which we found equally odd). Our conclusion was that we should probably just remove these rules for now and furthermore delete the f{min,max}_pseudo instructions in favor of their expanded forms using select and fcmp. The x86 backend could then pattern match the two instructions together to achieve the desired instructions on x86 (which appeared to be the original motivation for adding them to the spec anyway)

So if you agree then that seems like the least-risky way to move forward, effectively removing the confusion around operand orderings of f{min,max}_pseudo, and instead having that subtelty only in the wasm->CLIF translation as well as the x86 backend.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language labels Aug 18, 2023
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

Details This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@jameysharp
Copy link
Contributor

Okay, after thinking this through, I agree: let's remove the fmin_pseudo and fmax_pseudo instructions, and we can merge this PR as a step in that direction.

I've been thinking about what could justify keeping these instructions, given that the backends can match this pattern easily. The main case I can think of would be if they allowed us to concisely express other significant optimizations in the egraph rules. But that seems unlikely as these are rather specific. At least, I can't think of any possible examples.

@alexcrichton alexcrichton added this pull request to the merge queue Aug 21, 2023
Merged via the queue into bytecodealliance:main with commit 2728866 Aug 21, 2023
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Aug 21, 2023
This commit removes these two instructions and replaces them instead
with their equivalents using `fcmp` plus `select` or `bitselect`
depending on the type (`bitselect` for vectors, `select` for scalars).
The motivation for this commit is that incorrect optimizations for these
instructions were removed in bytecodealliance#6859 and likely stemmed from the
surprising definitions of these instructions. These originally were
intended to correspond to operations in the SIMD proposal for
WebAssembly but nowadays the functionality of these instructions is
replaced with:

* Lowering from wasm to clif uses the `fcmp` plus `select` combo instruction.
* Backends that support optimizing this pattern use ISLE patterns to
  match the instruction and emit the specialization for the pseudo
  semantics.

This means that while the instructions are removed here it should be the
case that no functionality is lost and the output of Wasmtime/Cranelift
should still be the same as it was before. Existing tests using the
pseudo instructions were preserved except the riscv64 ones (where the
lowering was deleted) and the dynamic AArch64 ones. Both s390x and x64
continue to have specialized patterns for this compare-plus-select.
github-merge-queue bot pushed a commit that referenced this pull request Aug 21, 2023
This commit removes these two instructions and replaces them instead
with their equivalents using `fcmp` plus `select` or `bitselect`
depending on the type (`bitselect` for vectors, `select` for scalars).
The motivation for this commit is that incorrect optimizations for these
instructions were removed in #6859 and likely stemmed from the
surprising definitions of these instructions. These originally were
intended to correspond to operations in the SIMD proposal for
WebAssembly but nowadays the functionality of these instructions is
replaced with:

* Lowering from wasm to clif uses the `fcmp` plus `select` combo instruction.
* Backends that support optimizing this pattern use ISLE patterns to
  match the instruction and emit the specialization for the pseudo
  semantics.

This means that while the instructions are removed here it should be the
case that no functionality is lost and the output of Wasmtime/Cranelift
should still be the same as it was before. Existing tests using the
pseudo instructions were preserved except the riscv64 ones (where the
lowering was deleted) and the dynamic AArch64 ones. Both s390x and x64
continue to have specialized patterns for this compare-plus-select.
eduardomourar pushed a commit to eduardomourar/wasmtime that referenced this pull request Aug 22, 2023
…odealliance#6859)

This transform is wrong for the inputs in the attached testcase.

It was introduced in bytecodealliance#5546 and has been sort of unnoticed until bytecodealliance#6843 caused
some wasmtime testcode to fire this optimization rule and produce this
counterexample.
eduardomourar pushed a commit to eduardomourar/wasmtime that referenced this pull request Aug 22, 2023
…6874)

This commit removes these two instructions and replaces them instead
with their equivalents using `fcmp` plus `select` or `bitselect`
depending on the type (`bitselect` for vectors, `select` for scalars).
The motivation for this commit is that incorrect optimizations for these
instructions were removed in bytecodealliance#6859 and likely stemmed from the
surprising definitions of these instructions. These originally were
intended to correspond to operations in the SIMD proposal for
WebAssembly but nowadays the functionality of these instructions is
replaced with:

* Lowering from wasm to clif uses the `fcmp` plus `select` combo instruction.
* Backends that support optimizing this pattern use ISLE patterns to
  match the instruction and emit the specialization for the pseudo
  semantics.

This means that while the instructions are removed here it should be the
case that no functionality is lost and the output of Wasmtime/Cranelift
should still be the same as it was before. Existing tests using the
pseudo instructions were preserved except the riscv64 ones (where the
lowering was deleted) and the dynamic AArch64 ones. Both s390x and x64
continue to have specialized patterns for this compare-plus-select.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants