Background worker no longer alive #138

@sebquetin

Hello there,

I have been using nnU-Net for a while and I think it is a great piece of software, so thank you very much for your work.
I have been having trouble running trainings on clusters because background workers kept dying with nnU-Net v2.5.2 and batchgenerators 0.25.1. I have fixed it on my side, but I would like to discuss the fix and see whether it should be adopted.
I was running on an AMD EPYC 7413 (Zen 3) CPU and an NVIDIA A100 SXM4 GPU on a Linux CentOS 7 machine. I would constantly get the error "One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message" from the NonDetMultiThreadedAugmenter class. This was caused by the two successive "with threadpool_limits(limits=1, user_api=None):" statements. I removed the "with threadpool_limits" statement from the _start() method and that solved my problem. So my question is: is it necessary to have a "with threadpool_limits" statement in the _start() method, given that we already have one in the producer() function?
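
To make the question concrete, here is a minimal sketch of the pattern I ended up with. It is not the batchgenerators code: `start_workers`, `collect`, and this simplified `producer` are hypothetical stand-ins, and the sketch assumes the workers are separate processes started with the "fork" method on Linux. The only point is that the thread limit is applied once, inside the worker, and the launcher does not wrap the start-up in a second limit.

```python
import contextlib
import multiprocessing as mp

try:
    from threadpoolctl import threadpool_limits
except ImportError:
    # Stand-in so the sketch still runs without threadpoolctl; it limits nothing.
    def threadpool_limits(limits=None, user_api=None):
        return contextlib.nullcontext()

# Assumption: a Unix-like system where the "fork" start method is available.
_ctx = mp.get_context("fork")


def producer(out_queue, worker_id, num_items):
    """Hypothetical worker body: the limit is applied here, in the worker."""
    with threadpool_limits(limits=1, user_api=None):
        for i in range(num_items):
            out_queue.put((worker_id, i))
    out_queue.put((worker_id, None))  # sentinel: this worker is finished


def start_workers(num_workers, num_items):
    """Hypothetical launcher: no outer threadpool_limits around the start-up."""
    out_queue = _ctx.Queue()
    workers = [
        _ctx.Process(target=producer, args=(out_queue, w, num_items), daemon=True)
        for w in range(num_workers)
    ]
    for proc in workers:
        proc.start()
    return out_queue, workers


def collect(out_queue, num_workers):
    """Drain the queue until every worker has sent its sentinel."""
    finished, items = 0, []
    while finished < num_workers:
        worker_id, item = out_queue.get()
        if item is None:
            finished += 1
        else:
            items.append((worker_id, item))
    return sorted(items)
```

With this layout each worker still runs its native thread pools with a single thread, so I would expect the limit in the launcher to be redundant, but I may be missing a reason for it.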
