
Stop exposing "shift by vector" via operator overloading? #163

@valadaptive

Description

The "shift left by vector" and "shift right by vector" operations don't lower to hardware operations on many platforms. On WebAssembly and SSE4.2, they all fall back to scalar operations. On AVX2, they're only vectorized for 32-bit and 64-bit operands.

It seems like a bit of a footgun to expose such operations in a way that looks identical to the much faster "shift by scalar". For example, clatter uses a shift-by-vector as part of an RNG, and switching to a different algorithm that avoids it (using only shift-by-scalar) is faster in practice.

It also makes it easier to accidentally use a shift-by-vector where a shift-by-scalar would do. You'd probably expect the two code snippets below to produce identical code:

let x: u32x4<S> = y >> 5;
let x: u32x4<S> = y >> u32x4::splat(simd, 5);

All the other operations take an impl SimdInto<Self, S> as the right-hand side and just call splat internally, so it's reasonable to assume that's what happens for the shifts as well. In this case, however, the latter snippet is slower.
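
One possible shape (hypothetical names and a toy scalar type, just to illustrate the direction suggested in the title, not fearless_simd's actual API) would be to keep the operator impls for scalar counts only and move the per-lane form behind an explicitly named method, so the potentially scalarized cost is visible at the call site:

// Hypothetical sketch; the type, trait impls, and method name are made up.
use core::ops::Shr;

#[derive(Clone, Copy)]
struct U32x4([u32; 4]);

// Operator overloading stays for the cheap case: one shared shift count.
impl Shr<u32> for U32x4 {
    type Output = U32x4;
    fn shr(self, rhs: u32) -> U32x4 {
        U32x4(self.0.map(|x| x >> rhs))
    }
}

impl U32x4 {
    // The per-lane form is still available, but only under a name that
    // signals it may not map to a single instruction on every target.
    fn shr_per_lane(self, counts: U32x4) -> U32x4 {
        let mut out = self.0;
        for (x, c) in out.iter_mut().zip(counts.0) {
            *x >>= c;
        }
        U32x4(out)
    }
}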
