-
Notifications
You must be signed in to change notification settings - Fork 1.6k
egraphs: Delete select+fcmp to f{min,max}_pseudo transform
#6859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
egraphs: Delete select+fcmp to f{min,max}_pseudo transform
#6859
Conversation
This transform is wrong for the inputs in the attached testcase. It was introduced in bytecodealliance#5546 and has been sort of unnoticed until bytecodealliance#6843 caused some wasmtime testcode to fire this optimization rule and produce this counterexample.
|
Comparing this existing test in against the new test in this PR: I wouldn't think that NaN canonicalization could lead to us mixing up these results. Oh hey, I think the rule in question was only barely wrong. I think with the arguments swapped it might work: (rule (simplify
(select ty (fcmp _ (FloatCC.LessThan) x y) x y))
(fmin_pseudo ty y x))According to the Cranelift |
alexcrichton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah as some extra background on this @afonso360 emailed security@bytecodealliance.org with this issue since it ran a risk of affecting wasm which we'd typically consider a security vulnerability (e.g. executing guests incorrectly). I discussed this this morning with @fitzgen and @sunfishcode and we talked about this issue in depth and concluded that this doesn't actually affect wasm (as @afonso360 initially concluded but sought confirmation of too).
In investigating this though we reached the same conclusion as you @jameysharp which is that we could fix the rule by swapping the operands. We were, however, surprised by the definition of f{min,max}_pseudo in CLIF (which inherit their definitions from the simd proposal for wasm, which we found equally odd). Our conclusion was that we should probably just remove these rules for now and furthermore delete the f{min,max}_pseudo instructions in favor of their expanded forms using select and fcmp. The x86 backend could then pattern match the two instructions together to achieve the desired instructions on x86 (which appeared to be the original motivation for adding them to the spec anyway)
So if you agree then that seems like the least-risky way to move forward, effectively removing the confusion around operand orderings of f{min,max}_pseudo, and instead having that subtelty only in the wasm->CLIF translation as well as the x86 backend.
Subscribe to Label ActionDetailsThis issue or pull request has been labeled: "cranelift", "isle"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
|
Okay, after thinking this through, I agree: let's remove the I've been thinking about what could justify keeping these instructions, given that the backends can match this pattern easily. The main case I can think of would be if they allowed us to concisely express other significant optimizations in the egraph rules. But that seems unlikely as these are rather specific. At least, I can't think of any possible examples. |
This commit removes these two instructions and replaces them instead with their equivalents using `fcmp` plus `select` or `bitselect` depending on the type (`bitselect` for vectors, `select` for scalars). The motivation for this commit is that incorrect optimizations for these instructions were removed in bytecodealliance#6859 and likely stemmed from the surprising definitions of these instructions. These originally were intended to correspond to operations in the SIMD proposal for WebAssembly but nowadays the functionality of these instructions is replaced with: * Lowering from wasm to clif uses the `fcmp` plus `select` combo instruction. * Backends that support optimizing this pattern use ISLE patterns to match the instruction and emit the specialization for the pseudo semantics. This means that while the instructions are removed here it should be the case that no functionality is lost and the output of Wasmtime/Cranelift should still be the same as it was before. Existing tests using the pseudo instructions were preserved except the riscv64 ones (where the lowering was deleted) and the dynamic AArch64 ones. Both s390x and x64 continue to have specialized patterns for this compare-plus-select.
This commit removes these two instructions and replaces them instead with their equivalents using `fcmp` plus `select` or `bitselect` depending on the type (`bitselect` for vectors, `select` for scalars). The motivation for this commit is that incorrect optimizations for these instructions were removed in #6859 and likely stemmed from the surprising definitions of these instructions. These originally were intended to correspond to operations in the SIMD proposal for WebAssembly but nowadays the functionality of these instructions is replaced with: * Lowering from wasm to clif uses the `fcmp` plus `select` combo instruction. * Backends that support optimizing this pattern use ISLE patterns to match the instruction and emit the specialization for the pseudo semantics. This means that while the instructions are removed here it should be the case that no functionality is lost and the output of Wasmtime/Cranelift should still be the same as it was before. Existing tests using the pseudo instructions were preserved except the riscv64 ones (where the lowering was deleted) and the dynamic AArch64 ones. Both s390x and x64 continue to have specialized patterns for this compare-plus-select.
…odealliance#6859) This transform is wrong for the inputs in the attached testcase. It was introduced in bytecodealliance#5546 and has been sort of unnoticed until bytecodealliance#6843 caused some wasmtime testcode to fire this optimization rule and produce this counterexample.
…6874) This commit removes these two instructions and replaces them instead with their equivalents using `fcmp` plus `select` or `bitselect` depending on the type (`bitselect` for vectors, `select` for scalars). The motivation for this commit is that incorrect optimizations for these instructions were removed in bytecodealliance#6859 and likely stemmed from the surprising definitions of these instructions. These originally were intended to correspond to operations in the SIMD proposal for WebAssembly but nowadays the functionality of these instructions is replaced with: * Lowering from wasm to clif uses the `fcmp` plus `select` combo instruction. * Backends that support optimizing this pattern use ISLE patterns to match the instruction and emit the specialization for the pseudo semantics. This means that while the instructions are removed here it should be the case that no functionality is lost and the output of Wasmtime/Cranelift should still be the same as it was before. Existing tests using the pseudo instructions were preserved except the riscv64 ones (where the lowering was deleted) and the dynamic AArch64 ones. Both s390x and x64 continue to have specialized patterns for this compare-plus-select.
👋 Hey,
This transform is wrong for the inputs in the attached testcase. Both of these instructions differ with their NaN behaviours.
It was introduced in #5546 and has been sort of unnoticed until #6843 caused some wasmtime testcode to fire this optimization rule and produce this counterexample.
@jameysharp has also a nice example of how the testcode now fires this rule (with #6843) where it hadn't before.
I have no idea how fuzzgen hasn't found this yet! It's a really simple input so it should have been found pretty much instantly.
I've been trying to look at why over the past week but haven't come to any conclusions yet. Right now I'm trying to look into the nan canonicalization pass and trying to check if it would mask this behavior.
CC: @alexcrichton