Conversation
|
|
||
| .. code-block:: console | ||
|
|
||
| $ flux submit --wait -n1 bash -c "sleep 30; /bin/false" |
| $ echo $? | ||
| 1 | ||
|
|
||
| The above command submits a job that simply sleeps for 30 seconds on one processor (``-n1``) and then runs ``/bin/false``. The :ref:`jobid <fluid>` is immediately output, but the command won't return until the 30 second job has completed. |
| The above command submits a job that simply sleeps for 30 seconds on one processor (``-n1``) and then runs ``/bin/false``. The :ref:`jobid <fluid>` is immediately output, but the command won't return until the 30 second job has completed. | |
| The above command submits a job that simply sleeps for 30 seconds on one processor (``-n1``) and then runs ``/bin/false``. The :ref:`jobid <fluxid>` is immediately output, but the command won't return until the 30 second job has completed. |
did you typo something here? When I grep I don't see a reference to "fluxid".
I assumed "fluid" should be fluxid, but if "fluid" is correct my mistake!
jobs/waiting-for-jobs.rst
Outdated
|
|
||
| The above command submits a job that simply sleeps for 30 seconds on one processor (``-n1``) and then runs ``/bin/false``. The :ref:`jobid <fluid>` is immediately output, but the command won't return until the 30 second job has completed. | ||
|
|
||
| After the command has finished we print the exit code from ``flux submit``. You'll notice the exit code is ``1``, which is the final exit code of the job, which in this case was ``1`` because we ran ``/bin/false``. |
| After the command has finished we print the exit code from ``flux submit``. You'll notice the exit code is ``1``, which is the final exit code of the job, which in this case was ``1`` because we ran ``/bin/false``. | |
| After the command has finished we print the exit code from ``flux submit``, which is ``1``, because we ran ``/bin/false``. |
jobs/waiting-for-jobs.rst
Outdated
| Flux Job Status | ||
| --------------- | ||
|
|
||
| In most cases, you do not want to sit and wait for the current job submission to complete. You would like to do other things, such as submit more jobs, and then wait for those specific jobs to complete. |
Indeed I don't! I have avocados to eat! Mountains to climb!
jobs/waiting-for-jobs.rst
Outdated
|
|
||
| In most cases, you do not want to sit and wait for the current job submission to complete. You would like to do other things, such as submit more jobs, and then wait for those specific jobs to complete. | ||
|
|
||
| The ``flux job status`` command is the most basic way to wait for a specific job, based on jobid, to complete. Pass it one or more jobids to wait on, and ``flux job status`` will return once all of the jobs have completed. It will exit with largest exit code from any of the jobids specified. If the job(s) have already completed, ``flux job status`` returns immediately. It can be run as many times as the user would like against the same jobid. |
Is the context here that I've submitted a bunch of jobs, and then (after that) I want to wait for a specific job?
Yes. Do you think I should mention something to that effect?
Yes exactly - you read between the lines.
| $ flux job wait | ||
| flux-job: there are no more waitable jobs | ||
|
|
||
| In this above example, we submit three jobs, sleeping for 60, 45, and 30 seconds respectively before running ``/bin/true``. We then run ``flux job wait`` without any inputs. You'll notice the jobids for the ``sleep 30`` job, then ``sleep 45`` job, then ``sleep 60`` job are returned in that order. Finally, without any jobs left running with the ``waitable`` flag, ``flux job wait`` indicates there are no more waitable jobs. |
So it doesn't wait for all of them to complete (like the multiple one on the same line?) What is the use case for this if I have to run it a gazillion times?
I believe the typical use case is a user wants to know when a job has finished and can do some type of post-processing on its results while the other jobs keep on running. They don't care which one finishes first/next, they just need to know that one has finished (and which one).
(Hopefully this use case might explain other questions you had above/below).
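A minimal sketch of that use case, for illustration only. It assumes the wait command prints the completed jobid on stdout and prints nothing on stdout once no waitable jobs remain (the "no more waitable jobs" message is assumed to go to stderr). `reap_waitable` and `post_process` are hypothetical helper names, not part of Flux.

```shell
#!/bin/bash
# Hypothetical placeholder: replace with real per-job post-processing.
# $1 = jobid, $2 = exit status reported by the wait command.
post_process() {
    echo "post-processing $1 (wait exit status $2)"
}

# Run the given wait command repeatedly, handling one completed job per
# iteration, until it stops printing a jobid.
# Assumption: an empty stdout means there is nothing left to wait on.
reap_waitable() {
    local jobid rc
    while true; do
        rc=0
        # stderr is discarded here, which also hides job failure messages.
        jobid=$("$@" 2>/dev/null) || rc=$?
        [ -n "$jobid" ] || break
        post_process "$jobid" "$rc"
    done
}
```

With jobs submitted using `--flags waitable`, this would be invoked as `reap_waitable flux job wait`. Each job is handled as soon as it finishes, while the others keep running.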
probably good to stick one sentence in there to note this common use case.
So you couldn't use flux job status for that?
flux job status requires you to input all of the jobids and doesn't exit until all of the jobs finish, thus more inconvenient.
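For contrast, a rough sketch of the `flux job status` approach, where every jobid has to be captured at submit time and passed along. It assumes `flux submit` prints only the jobid on stdout. `wait_for_all` and the `FLUX` variable are hypothetical names introduced for illustration.

```shell
#!/bin/bash
# Command used to reach Flux; overridable so the sketch can be exercised
# without a running Flux instance (an assumption of this example).
FLUX=${FLUX:-flux}

# Submit one single-task job per argument, then block until all of them
# have completed. Each argument is a command string run under bash -c.
wait_for_all() {
    local ids=() cmd
    for cmd in "$@"; do
        # Capture the jobid that the submit command prints.
        ids+=("$($FLUX submit -n1 bash -c "$cmd")")
    done
    # Returns only after every listed job has finished; the exit status
    # of this call is the exit status of the function.
    $FLUX job status "${ids[@]}"
}
```

For example, `wait_for_all "sleep 60; /bin/true" "sleep 30; /bin/false"` would not return until both jobs are done, which is why it is less convenient for handling jobs one at a time.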
| ƒ4YPufmCjq | ||
| $ flux submit --flags waitable -n1 bash -c "sleep 30; /bin/false" | ||
| ƒ4YSVQWfZq | ||
| $ flux job wait --all --verbose |
ohh this one makes sense! But what is the use case for without --all?
jobs/waiting-for-jobs.rst
Outdated
|
|
||
| This example is similar to the above, except one of the jobs runs ``/bin/false`` instead of ``/bin/true``. When ``flux job wait --all`` is executed, you'll notice a message output indicating that one job has failed (the one that ran ``/bin/false``). And similar to ``flux job status``, the exit code of ``1`` is returned due to the highest exit code of all the jobs. | ||
|
|
||
| The biggest disadvantage of ``flux job wait`` compared to ``flux job status`` is that jobs can only waited on once. |
| The biggest disadvantage of ``flux job wait`` compared to ``flux job status`` is that jobs can only waited on once. | |
| The biggest disadvantage of ``flux job wait`` compared to ``flux job status`` is that jobs can only be waited on once. |
Only being able to wait on a job once is not necessarily a disadvantage. Without it you would not be able to run flux job wait in a loop (you'd just keep getting the same jobid continually). So there is a purpose here, and each interface satisfies different use cases. Instead of calling this a disadvantage, maybe the guide should discuss the use cases for which each interface is designed?
jobs/waiting-for-jobs.rst
Outdated
|
|
||
| $ flux submit --flags waitable -n1 bash -c "sleep 30; /bin/true" | ||
| ƒBbk3qrdro | ||
| $ flux job wait ƒBbk3qrdro |
Why would I put the jobid at all? Wouldn't I just run flux job wait without any args like shown in the example above?
ahh you're correct for this specific case, they wouldn't need to. Would it be clearer to not put in the jobid in this case? (Edit: i see your comment below, probably should remove it)
what if there was a previously submitted waitable job that was not yet reaped? flux job wait doesn't necessarily only wait for the last submitted job...
| Pros: | ||
|
|
||
| - ``flux job wait`` more efficient when waiting for a set of jobs | ||
| - Jobids do not need to be specified to ``flux job wait`` |
So maybe just take that part of the tutorial out - don't show giving a job id to flux job wait if that shouldn't be learned.
I see your point. I'll mention it, but definitely stress it less.
|
pushed a fixup, tweaking a few things, adding a sentence here and there given comments above |
|
Perhaps something should be said in here to the effect of: If you need to wait for thousands of jobs efficiently, or need to wait for single jobs as they complete, then |
|
re-pushed. taking into account several of the comments above, re-worked the flow of the |
|
re-pushed, updating example script given completion of flux-framework/flux-core#5033 |
Add a new guide on how to wait for jobs to complete.
|
this should be updated to include |