[Feature Request] Implement optimizations from OneSweep #5

@natevm

Not sure this is the right place for feature requests, but I’m curious if anyone here has considered moving to the “OneSweep” sort used in CUB.

Link to arxiv paper below:
https://arxiv.org/abs/2206.01784

In theory, this would be much faster than the 4-bit binned histogram approach used by FFX. OneSweep sorts 8 bits at a time, so 32-bit keys need only four binning passes at roughly 2n global memory operations each (one read, one write), whereas the 4-bit radix sort needs eight passes at roughly 3n operations each (read, read, write).
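To put rough numbers on that comparison, here is a back-of-the-envelope sketch that counts global memory operations in units of n (the number of keys). The function name and parameters are made up for illustration. It counts key traffic only and ignores payloads and the small histogram buffers. The one extra n for OneSweep is my assumption, based on the paper computing all digit histograms in a single upfront pass over the keys.

```python
def radix_sort_traffic(key_bits, bits_per_pass, ops_per_pass, upfront_ops=0):
    """Return (passes, total global memory ops in units of n).

    Hypothetical helper for illustration only: it counts key reads and
    writes and ignores payloads, histogram buffers, and cache effects.
    """
    passes = -(-key_bits // bits_per_pass)  # ceiling division
    return passes, passes * ops_per_pass + upfront_ops


# OneSweep: 8 bits per pass, one read + one write per pass,
# plus one assumed upfront read of all keys to build the histograms.
onesweep_passes, onesweep_ops = radix_sort_traffic(32, 8, 2, upfront_ops=1)

# 4-bit radix sort: each pass reads keys for the histogram,
# then reads them again and writes them during the scatter.
ffx_passes, ffx_ops = radix_sort_traffic(32, 4, 3)

print(f"OneSweep: {onesweep_passes} passes, ~{onesweep_ops}n memory ops")
print(f"4-bit radix: {ffx_passes} passes, ~{ffx_ops}n memory ops")
```

Under these assumptions the estimate is about 9n versus 24n operations for 32-bit keys. Real speedups would depend on how well each kernel saturates memory bandwidth, so treat this as an upper-level estimate rather than a benchmark.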

The method requires a forward progress guarantee, but iiuc RDNA supports this now (at least going by my reading of the RDNA whitepaper).
