@jackkleeman
Contributor

With some help from the quickdiv library, I have found the multiply and shift that is equivalent to a u128 division by 62^10. We currently do that division twice, and it's very slow (it goes through `__udivti3`). For u64 the compiler already turns divisions by a constant into a multiply and shift, which is why the use of mostly-u64 divisions in this library is so effective, but apparently this is not the case for u128.
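
For readers unfamiliar with the trick, here is a minimal sketch of a multiply and shift that replaces a division by 62^10. It is not the code from this PR: it only handles u64 inputs so that the intermediate product fits in a u128, whereas the real u128 case needs a wider intermediate and different constants. The function and constant names are made up for illustration.

```rust
/// 62^10, the divisor from the description above.
pub const DIVISOR: u64 = 62u64.pow(10);

/// 62^10 = 2^10 * 31^10, so the power-of-two part is a plain right shift.
const PRE_SHIFT: u32 = 10;
const ODD_PART: u128 = 31u128.pow(10);

/// After the pre-shift the input has at most 54 bits, and 31^10 < 2^50,
/// so a shift of 54 + 50 = 104 bits makes the result exact.
const POST_SHIFT: u32 = 104;

/// Magic multiplier ceil(2^104 / 31^10), computed at compile time.
const MAGIC: u128 = ((1u128 << POST_SHIFT) + ODD_PART - 1) / ODD_PART;

/// Hypothetical helper: computes n / 62^10 without a division instruction.
pub fn div_62_pow_10(n: u64) -> u64 {
    // Divide by 2^10 first; the value is now below 2^54.
    let shifted = (n >> PRE_SHIFT) as u128;
    // MAGIC is below 2^55, so the product stays below 2^109 and cannot overflow.
    ((shifted * MAGIC) >> POST_SHIFT) as u64
}
```

Comparing `div_62_pow_10(n)` against `n / DIVISOR` for a few edge values (0, `DIVISOR - 1`, `DIVISOR`, `u64::MAX`) is a quick way to sanity-check the constants.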

```
encode/standard_new_fixed
                        time:   [32.628 ns 32.801 ns 33.085 ns]
                        change: [-41.146% -39.914% -39.003%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
encode/standard_new_random
                        time:   [30.778 ns 31.021 ns 31.250 ns]
                        change: [-46.061% -45.407% -44.802%] (p = 0.00 < 0.05)
                        Performance has improved.
encode/standard_bytes_fixed
                        time:   [21.184 ns 21.255 ns 21.352 ns]
                        change: [-50.594% -49.762% -48.248%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
encode/standard_bytes_random
                        time:   [23.588 ns 23.664 ns 23.744 ns]
                        change: [-49.891% -49.451% -48.838%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
encode/standard_buf_fixed
                        time:   [36.390 ns 36.488 ns 36.590 ns]
                        change: [-35.510% -35.272% -34.976%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
encode/standard_buf_random
                        time:   [23.543 ns 23.669 ns 23.813 ns]
                        change: [-48.243% -47.933% -47.579%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
encode/alternative_new_fixed
                        time:   [32.984 ns 33.088 ns 33.180 ns]
                        change: [-39.498% -39.270% -39.035%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
encode/alternative_new_random
                        time:   [30.405 ns 30.618 ns 30.859 ns]
                        change: [-45.348% -44.161% -42.669%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
encode/alternative_bytes_fixed
                        time:   [21.306 ns 21.358 ns 21.419 ns]
                        change: [-50.234% -50.088% -49.940%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
encode/alternative_bytes_random
                        time:   [24.073 ns 24.297 ns 24.546 ns]
                        change: [-49.165% -48.685% -48.177%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
encode/alternative_buf_fixed
                        time:   [36.533 ns 36.621 ns 36.718 ns]
                        change: [-35.979% -35.560% -35.046%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
encode/alternative_buf_random
                        time:   [23.507 ns 23.579 ns 23.658 ns]
                        change: [-49.875% -49.461% -49.169%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
```
@fbernier fbernier merged commit b06529a into fbernier:master Aug 26, 2025
8 checks passed
@jackkleeman jackkleeman deleted the strength-reduction branch August 26, 2025 15:14