Use compression.zstd’s multithreading #180

@marcelm

[compression.zstd supports multithreading](https://docs.python.org/3/library/compression.zstd.html#compression.zstd.CompressionParameter.nb_workers) for compression and we should use it.
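
For reference, a minimal sketch of what requesting worker threads through `CompressionParameter.nb_workers` could look like. The helper names `clamp` and `compress_with_workers` are hypothetical and not part of this library or of `compression.zstd`. The sketch assumes Python 3.14 or later, and how the library's `threads` argument should map to `nb_workers` is left open here.

```python
try:
    # compression.zstd is part of the standard library from Python 3.14.
    from compression import zstd
except ImportError:
    zstd = None


def clamp(value: int, lower: int, upper: int) -> int:
    """Limit value to the inclusive range [lower, upper]."""
    return max(lower, min(value, upper))


def compress_with_workers(data: bytes, workers: int) -> bytes:
    """Compress data in one shot, requesting `workers` zstd worker threads.

    Hypothetical helper for illustration only.
    """
    if zstd is None:
        raise RuntimeError("compression.zstd requires Python 3.14 or later")
    param = zstd.CompressionParameter.nb_workers
    # bounds() reports the range the linked libzstd accepts for this
    # parameter; clamping keeps the request within that range.
    lower, upper = param.bounds()
    options = {param: clamp(workers, lower, upper)}
    return zstd.compress(data, options=options)


if zstd is not None:
    payload = b"ACGT" * 1_000_000
    compressed = compress_with_workers(payload, workers=4)
    # The output is an ordinary zstd frame, so plain decompress() reads it.
    assert zstd.decompress(compressed) == payload
```

Clamping against `bounds()` is a defensive choice in this sketch, so that a request larger than the linked libzstd accepts does not raise an error.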

You would have to write threads=0 to get it. This is how we designed it, and I am just wondering if this is still the best behavior. I’m not sure whether I would expect this when using the library. It is consistent with the other functions, though.

Second, compression.zstd seems to support multi-threaded decompression, and this isn’t implemented in this PR. But that can be done later.

Oh yes, in that case let's implement the multithreading later, like we did for python-isal and python-zlib-ng. I have always felt that the intention of this library was to overcome compression bottlenecks (simply because these are unfortunately big in bioinformatics). External processes can't share memory, and pipes have some efficiency cost, so when threads and shared memory are available, it is always best to go for that option. It also eliminates a conda dependency, which is a nice bonus.

Originally posted by @rhpvorderman in #178 (comment)
