Work Queue: Hardware-level isolation between tasks? #4370
Replies: 10 comments
-
Thanks for getting in touch to talk it over. The general assumptions of WQ are these:

So I think we can break this down into two distinct problems:

TaskVine does have a number of advantages over WQ, but in this aspect of resource assignment it works the same. So let's address the problem here in WQ first, and then we can port the same solution over to TaskVine. Based on your discussion so far, it seems to me the problem is #1: WQ is telling each task how many cores to use, but not which ones. If we could figure out some improvement using (for example)
-
Hi! I think your assessment is correct, and whether tasks respect their 'resource allocation' should be up to the user, not WQ. Ideally, a task could find out which resources WQ reserved for it (in its environment), and from there restrict itself to those resources in some way (taskset, cgroups, what have you).

I do not know how the WQ worker manages the tasks it schedules. Does it internally assign e.g. cores to a task, or does it simply keep track of the total 'cores in use' across all running tasks? In the first case, it might be possible to pipe this reservation to the task quite straightforwardly.

As mentioned in my first post, we usually run in a SLURM environment.

Small disclaimer: this really is not familiar territory for me, so I might be complicating things unnecessarily. Feel free to think outside my poorly informed box.
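To make the "restrict itself in some way" step concrete, here is a minimal task-side sketch. It assumes WQ exposed the reservation under a hypothetical environment variable `WQ_ALLOCATED_CORES` (no such variable exists today), and uses `os.sched_setaffinity`, which is Linux-only:

```python
import os

def restrict_to_assigned_cores(env_var="WQ_ALLOCATED_CORES"):
    """Pin this process (and its future children) to the cores listed
    in env_var, e.g. "0,1,4,5". Returns the core set, or None if the
    variable is absent. NOTE: WQ_ALLOCATED_CORES is hypothetical."""
    raw = os.environ.get(env_var)
    if raw is None:
        return None
    cores = {int(c) for c in raw.split(",") if c.strip()}
    # Linux-only; pid 0 means "the calling process".
    os.sched_setaffinity(0, cores)
    return cores
```

A task could call this once at startup, before spawning any worker threads, so the affinity mask is inherited everywhere.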
-
(FYI, I'm moving this over to GitHub Discussions; I will start an issue once we have a firm idea of what's needed.)
-
At the moment, each worker is assigned a fungible number of cores.

But as you have pointed out, the mechanisms for this seem to vary a lot across operating systems, batch systems, sites, etc. But for your case:

1- Modify the application to insert the preamble in the place where it defines tasks. (Although this might be quite difficult given the whole software stack.)

I'm actually leaning towards 3, because the worker (and factory) may already know that they are running in SLURM, whereas the application may be agnostic. What do you think?
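As an illustration of what such a preamble could do, here is a sketch that pins a task by prefixing its command line with `taskset` (from util-linux). It only builds the command; the core list would have to come from the worker's own bookkeeping:

```python
def taskset_preamble(cores, task_command):
    """Prefix task_command with a taskset invocation so the kernel
    enforces the core mask; the task itself needs no modification."""
    cpu_list = ",".join(str(c) for c in sorted(cores))
    return ["taskset", "--cpu-list", cpu_list] + list(task_command)
```

For example, `taskset_preamble({4, 5, 6, 7}, ["python", "sim.py"])` yields a command line confined to cores 4-7, without touching the application code.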
-
Generally speaking, our tasks (solving Schrödinger, running molecular dynamics) are long-running processes, so overhead should never be a concern (for us). If it is not a complex change to insert which resources (cores, GPUs) WQ assigns to a task into the task environment, then that already seems like a solid solution. From there, users could still choose to enforce or ignore this assignment in their task definition. I think this approach corresponds to your suggestion 1 and should be possible within our framework of Parsl.

A possible caveat: the optimal subset of resources for a task (e.g., which 8 out of 32 cores maximise cache locality, minimise memory bandwidth congestion, etc.) is difficult to specify without detailed knowledge of the CPU infrastructure. The WQ worker should probably not concern itself with such low-level optimisations; that feels like scheduler territory.

Overall, I think this should be opt-in behaviour.
-
(Gentle bump) @dthain, what is your opinion on this? Perhaps we should ask the Parsl devs for their views too?
-
My apologies, I mistakenly thought you had reached a conclusion. Given that this is opt-in, site-specific behavior, we should rely on general command insertion rather than building up a whole new capability.

In the meantime, I was poking at the capability to insert a command from the worker side of things, to handle cases where the person deploying is not necessarily the same as the application author. This PR implements that capability.
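To sketch what such worker-side command insertion looks like from the wrapper's perspective: the wrapper receives the original task command as its arguments, adjusts the environment, and then replaces itself with the task. The wrapper itself is hypothetical; `SLURM_CPUS_PER_TASK` is a standard SLURM variable:

```python
import os
import sys

def task_environment(base_env):
    """Copy base_env, capping OpenMP threads to the SLURM allocation
    when SLURM_CPUS_PER_TASK is present and the user has not already
    set OMP_NUM_THREADS."""
    env = dict(base_env)
    n = env.get("SLURM_CPUS_PER_TASK")
    if n and "OMP_NUM_THREADS" not in env:
        env["OMP_NUM_THREADS"] = n
    return env

def run_wrapped(task):
    """Replace this process with the task (exec, not spawn), so
    signals and exit codes pass straight through to the worker."""
    if not task:
        sys.exit("usage: wrapper <task command> [args...]")
    os.environ.update(task_environment(os.environ))
    os.execvp(task[0], task)
```

The exec pattern matters here: the worker still sees the task's own exit status, so accounting and retries behave as if no wrapper were present.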
-
I would prefer for this functionality not to hinge upon SLURM being available (because it will not always be). If WQ could expose environment variables, similar to how it works for GPUs, then I imagine a simple wrapper already goes a long way towards fixing the problem.

Tagging @benclifford for his view on how we could insert this functionality into the
-
We can definitely add the capability to expose specific core assignments to tasks, and I will add that to the short-term queue here. I think there are so many potential configurations that I am reluctant to hard-code how those assignments are enforced, at least yet. (I'm thinking about different batch systems, containers, topologies, nested workers, OMP, MKL, etc.)

If you are willing to deploy the
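Once a concrete core assignment is exposed, the task side can at least cap the common threading runtimes without the worker hard-coding any enforcement. A small sketch; the variable names below are the real knobs for OpenMP, Intel MKL and OpenBLAS, while deriving them from an exposed core list is the proposal being discussed:

```python
def thread_limit_env(cores):
    """Map a concrete core assignment onto the environment variables
    honoured by the common threading runtimes."""
    n = str(len(cores))
    return {
        "OMP_NUM_THREADS": n,       # OpenMP runtimes
        "MKL_NUM_THREADS": n,       # Intel MKL
        "OPENBLAS_NUM_THREADS": n,  # OpenBLAS
    }
```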
-
Great! That would really help us out. I agree that a one-size-fits-all solution is unlikely given the variety of execution environments. To me, it's preferable for users to explicitly specify how they enforce resource assignments (i.e., through some transparent plugin/wrapper setup), rather than it being hidden away in complex code wizardry. I will definitely try it out for our setup and report back to you.
-
TL;DR
Can we isolate computational resources to individual tasks managed by a single worker in WorkQueue?
We are using Parsl with WQ to build molecular modelling workflows (psiflow). The WorkQueueExecutor is convenient because it can schedule differently sized tasks together in a single resource block (i.e., an HPC job allocation). However, it does not force those tasks to use different resources (#3886), leading to thread contention issues or potentially more severe oversubscription problems. WQ does set several environment variables (OMP_NUM_THREADS, ...), but those do not specify 'allocated cores' or similar.

Some possible ideas:

- srun wrappers. Obviously, that will only work in SLURM environments and becomes more complex when running in containers.

Any insight would be greatly appreciated.
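As a sketch of the srun-wrapper idea: a per-task prefix could confine each task to a slice of the allocation. The flags used (`--ntasks`, `--cpus-per-task`, `--exact`) are standard srun options, though whether `--exact` yields the desired isolation depends on the site's SLURM configuration:

```python
def srun_wrapper(task_command, n_cores):
    """Run task_command as a single SLURM step confined to n_cores
    of the current allocation."""
    return ["srun", "--ntasks=1", f"--cpus-per-task={n_cores}",
            "--exact"] + list(task_command)
```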