I'm getting back into motion correction and am testing out some new options.
I ran a test on 10 minutes of NPX 1.0 data. With the locally_exclusive peak detection, the run completed in 212 seconds (timed with Python's perf_counter) using 12 workers on a 28-logical-processor ~3.7 GHz CPU (2024 i7). I then tested locally_exclusive_torch with the same settings on a 4000 Ada GPU, and the run completed in 216 seconds. My task manager shows essentially 100% utilization of the CPU or GPU, respectively, during these runs.
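
For reference, this is roughly the shape of my timing harness (a minimal sketch, not my exact script; the SpikeGLX path is a placeholder, and I'm assuming detect_peaks from spikeinterface.sortingcomponents.peak_detection with standard job_kwargs):

```python
from time import perf_counter

import spikeinterface.extractors as se
from spikeinterface.sortingcomponents.peak_detection import detect_peaks

# Load the 10-minute NPX 1.0 test recording (path is a placeholder).
recording = se.read_spikeglx("path/to/npx1_recording")

# Same parallelization settings for both runs.
job_kwargs = dict(n_jobs=12, chunk_duration="1s", progress_bar=True)

for method in ("locally_exclusive", "locally_exclusive_torch"):
    t0 = perf_counter()
    peaks = detect_peaks(recording, method=method, **job_kwargs)
    print(f"{method}: {perf_counter() - t0:.0f} s ({len(peaks)} peaks)")
```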
I noticed on the progress bar that the torch version also said it was using 12 workers.
What is the meaning of "workers" when we run something on the GPU? My understanding is that this is not typical terminology for parallel processing on the GPU, and that we usually just let CUDA work its magic under the hood.
Would you recommend changing the number of workers when running these steps on the GPU, or does that not have a meaningful impact? If a process is parallelizable, I would expect a larger performance improvement when moving from CPU to GPU. Do you have any ideas whether the performance I observed is expected?
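
If it helps frame the question, a sweep like this (same sketch-level assumptions as above, reusing the recording loaded there) is what I could run to check whether n_jobs matters at all for the torch path:

```python
from time import perf_counter

from spikeinterface.sortingcomponents.peak_detection import detect_peaks

# `recording` as loaded in the sketch above.
for n_jobs in (1, 4, 8, 12):
    t0 = perf_counter()
    detect_peaks(
        recording,
        method="locally_exclusive_torch",
        n_jobs=n_jobs,
        chunk_duration="1s",
        progress_bar=False,
    )
    print(f"n_jobs={n_jobs}: {perf_counter() - t0:.0f} s")
```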