streamline/speed up addc/subc by using in/out carry parameter #267
greg7mdp wants to merge 15 commits into chfast:main from
|
I believe this version is significantly faster! |
|
I benchmarked with GCC 11. In general, we would need to split these into smaller changes (I can do this myself) for better analysis.
We should also check Clang. It can optimize add/sub perfectly, so as long as it stays this way we can go ahead with changes that make things easier for GCC.
|
Thanks for looking into it, @chfast! I'm not sure how to read the numbers you posted, but I am surprised that you found that add/sub got slower. In my tests it was faster. I did notice that running the same code twice sometimes gave different numbers, though! Benchmarks are hard.
include/intx/intx.hpp
Outdated
}
#endif

if (((x | y) & (uint64_t(1) << 63)) == 0) {
This is ineffective for compilers we care about. GCC and Clang should be using builtins.
So if we use the builtin how do you explain that my version is slower in the benchmark?
Actually I had removed this change. Are you looking at my latest version?
see https://github.com/greg7mdp/intx/blob/master/include/intx/intx.hpp
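For readers following this thread, here is a minimal sketch of what an add-with-carry using an in/out carry parameter and a compiler builtin could look like. It is not the code from this PR or from intx: the names `addc_sketch` and `add_words_sketch` are hypothetical, and the choice of `__builtin_add_overflow` (available in both GCC and Clang) is an assumption about which builtin is meant.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not intx's API): returns the low 64 bits of
// x + y + carry and writes the outgoing carry back into `carry`.
inline uint64_t addc_sketch(uint64_t x, uint64_t y, bool& carry) noexcept
{
#if defined(__GNUC__) || defined(__clang__)
    // __builtin_add_overflow returns true when the addition wraps around.
    uint64_t s = 0;
    const bool c1 = __builtin_add_overflow(x, y, &s);
    const bool c2 = __builtin_add_overflow(s, static_cast<uint64_t>(carry), &s);
    carry = c1 || c2;  // at most one of the two additions can wrap
    return s;
#else
    // Portable fallback: an unsigned sum wrapped if it is smaller than an operand.
    const uint64_t s = x + y;
    const bool c1 = s < x;
    const uint64_t t = s + static_cast<uint64_t>(carry);
    const bool c2 = t < s;
    carry = c1 || c2;
    return t;
#endif
}

// Hypothetical multi-word addition, least significant word first.
// Returns the final carry out of the most significant word.
template <size_t N>
inline bool add_words_sketch(
    uint64_t (&r)[N], const uint64_t (&x)[N], const uint64_t (&y)[N]) noexcept
{
    bool carry = false;
    for (size_t i = 0; i < N; ++i)
        r[i] = addc_sketch(x[i], y[i], carry);
    return carry;
}
```

Whether this form is actually faster than returning a result/carry pair depends on the compiler and on the benchmark setup, which is exactly what is being debated here.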
|
Also, why is there a difference between add and inline_add?
This is a comparison between master and your changes. Numbers are:
This comes from a tool comparing outputs of |
Because the benchmark runs in a loop, the "inline_add" case can be vectorized. This is not relevant for the EVM use case. I could probably review all benchmark cases and remove half of them.
|
Thanks!
Vectorized or inlined? Why wouldn't it be relevant for EVM?
|
The main difference is that the |
|
@chfast, if you are using GCC 11, and therefore the builtin, I am puzzled that my version would be slower.
return subc(x, y).carry;
for (size_t i = uint<N>::num_words; i-- > 1; ) {
    if (x[i] != y[i])
        return x[i] < y[i];
This single change looks interesting. Can you submit it as a separate PR for easier verification?
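To make the idea in this hunk easier to verify in isolation, here is a self-contained sketch of a word-by-word less-than comparison that scans from the most significant word and returns at the first difference, instead of doing a full subtraction and reading the borrow. The type `words_sketch` and the function `less_sketch` are hypothetical stand-ins, not intx's `uint<N>`, and unlike the loop in the diff (which stops at index 1, with the remainder not visible here) this sketch also handles word 0 inside the loop.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a multi-word unsigned integer;
// words[0] is assumed to be the least significant word.
template <size_t NumWords>
struct words_sketch
{
    uint64_t words[NumWords];
};

// Returns true if x < y. Scans from the most significant word down
// and decides at the first word where the operands differ.
template <size_t NumWords>
inline bool less_sketch(
    const words_sketch<NumWords>& x, const words_sketch<NumWords>& y) noexcept
{
    for (size_t i = NumWords; i-- > 0;)
    {
        if (x.words[i] != y.words[i])
            return x.words[i] < y.words[i];
    }
    return false;  // all words equal, so x is not less than y
}
```

The early exit makes the running time depend on the data, so it would likely help most when the high words usually differ. That is an expectation to confirm with benchmarks, not a measured result.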
|
Kudos, SonarCloud Quality Gate passed! |
No description provided.