What is the role of workers with locally_exclusive_torch or locally_exclusive_cl? #4273

@ryan-budde

I'm getting back into motion correction and testing out some of the newer options.

I ran a test on 10 minutes of NPX 1.0 data. With locally_exclusive peak detection, the run completed in 212 seconds (timed with Python's perf_counter) using 12 workers on a 28-logical-core, ~3.7 GHz CPU (2024 i7). With locally_exclusive_torch and the same settings, the run completed in 216 seconds on a 4000 Ada GPU. Task Manager shows essentially 100% utilization of the CPU or GPU, respectively, during these runs.
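
For reference, this is roughly the call pattern I used to select the method and the worker count (the recording path, filtering parameters, and chunking settings below are placeholders, and my actual script also runs the rest of the motion correction pipeline):

```python
# Sketch of the comparison; paths and parameters are placeholders.
import time

import spikeinterface.full as si
from spikeinterface.sortingcomponents.peak_detection import detect_peaks

# Load and preprocess 10 minutes of NPX 1.0 data (path is hypothetical).
rec = si.read_spikeglx("/path/to/npx1_run", stream_id="imec0.ap")
rec = si.bandpass_filter(rec, freq_min=300.0, freq_max=6000.0)

# Same parallelization settings for both runs.
job_kwargs = dict(n_jobs=12, chunk_duration="1s", progress_bar=True)

for method in ("locally_exclusive", "locally_exclusive_torch"):
    t0 = time.perf_counter()
    peaks = detect_peaks(rec, method=method, **job_kwargs)
    print(f"{method}: {time.perf_counter() - t0:.0f} s, {len(peaks)} peaks")
```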

I noticed on the progress bar that the torch version also said it was using 12 workers.

What do workers mean when we run something on the GPU? My understanding is that this is not the typical language for GPU parallelism, and that normally we just let CUDA work its magic "under the hood".

Would you recommend changing the number of workers when running these steps on the GPU, or does that not have a meaningful impact? If a process is parallelizable, I would expect a larger performance improvement moving from CPU to GPU. Do you have any idea whether the performance I observed is expected?
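
Concretely, is a sweep like the one below worth running for the torch backend, or is n_jobs essentially irrelevant once the detection kernel runs on the GPU? (Sketch only, reusing the preprocessed recording from the snippet above.)

```python
# Hypothetical n_jobs sweep for the GPU backend, reusing `rec` from above.
import time

from spikeinterface.sortingcomponents.peak_detection import detect_peaks

for n_jobs in (1, 4, 12):
    t0 = time.perf_counter()
    detect_peaks(
        rec,
        method="locally_exclusive_torch",
        n_jobs=n_jobs,
        chunk_duration="1s",
        progress_bar=True,
    )
    print(f"n_jobs={n_jobs}: {time.perf_counter() - t0:.1f} s")
```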

Labels: concurrency (Related to parallel processing), question (General question regarding SI)