Skip to content

Investigate further performance optimizations in Repartition #5401

@1RyanK

Description

@1RyanK

In #5090 I got pretty good performance out of the concatenate_uniquely function for Strings, but it still doesn't beat the existing pathway:

------------------------------------------------------------------------- benchmark 'AK_strings_concat_unique': 8 tests --------------------------------------------------------------------------
Name (time in s)                           Min               Max              Mean            StdDev            Median               IQR            Outliers     OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_strings_concat_unique[1-0.75]     2.1740 (1.0)      2.2838 (1.0)      2.2146 (1.0)      0.0473 (2.29)     2.1960 (1.0)      0.0762 (3.53)          1;0  0.4516 (1.0)           5           1
bench_strings_concat_unique[1-0.5]      2.3490 (1.08)     2.4030 (1.05)     2.3686 (1.07)     0.0207 (1.0)      2.3661 (1.08)     0.0216 (1.0)           1;0  0.4222 (0.93)          5           1
bench_strings_concat_unique[2-0.5]      2.4689 (1.14)     3.1955 (1.40)     2.8140 (1.27)     0.2964 (14.33)    2.7309 (1.24)     0.4795 (22.21)         2;0  0.3554 (0.79)          5           1
bench_strings_concat_unique[1-0.25]     2.5054 (1.15)     2.5831 (1.13)     2.5495 (1.15)     0.0307 (1.48)     2.5545 (1.16)     0.0464 (2.15)          2;0  0.3922 (0.87)          5           1
bench_strings_concat_unique[2-0.75]     2.5337 (1.17)     3.3380 (1.46)     2.9655 (1.34)     0.3200 (15.47)    3.0040 (1.37)     0.5063 (23.45)         2;0  0.3372 (0.75)          5           1
bench_strings_concat_unique[2-0.25]     2.5538 (1.17)     2.9718 (1.30)     2.7417 (1.24)     0.1828 (8.84)     2.6550 (1.21)     0.3081 (14.27)         2;0  0.3647 (0.81)          5           1
bench_strings_concat_unique[1-0.0]      2.5973 (1.19)     4.2031 (1.84)     3.0300 (1.37)     0.6656 (32.18)    2.7663 (1.26)     0.5717 (26.48)         1;1  0.3300 (0.73)          5           1
bench_strings_concat_unique[2-0.0]      2.8344 (1.30)     3.2725 (1.43)     2.9917 (1.35)     0.1685 (8.15)     2.9632 (1.35)     0.1819 (8.42)          1;0  0.3343 (0.74)          5           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

This is on 100m strings (per locale, I think). But performance flips on lower amounts of strings - for 1m and 10m, concatenate_uniquely is better. Some things worth investigating - why is unique better for larger number of strings? Is there anything that can be done in Repartition to improve performance? Some ideas to get you started:
Repartition may do better on larger strings (meaning more than 16 bytes)
Maybe try something with cptrs in repartitionByLocaleStringArray to eliminate the for over the locales.
Try doing a local dedupe before repartitionByLocaleStringArray (possibly the same way as the final deduplication) or allowing the user to opt into that.
Get rid of myBytes[currByte..#numBytes] = valsA[firstStart..#numBytes]; (this is bulk transfer of bytes from the SegString into a local innerArray). Even if the string sizes are uniform and this copy is entirely local, it's still possibly a large copy. Instead, try to hash and repartition based on segStrings[i].values.a that are on the current locale already. Of course some strings will overlap the boundaries, but balancedSegStringRetrieval may give you some ideas on how to achieve this. This may introduce some overhead but at a large scale it could save a lot of time.
Maybe try a different sort - in dedupeLocalByHashSort, I just used the standard sort. unique uses a different sort. Try playing around with this and seeing what you get.

Once you're getting good performance here, take the strategies in Repartition and apply them to other functions!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions