Improves pipe cleanup for terminated scheduler jobs #868
Changes from all commits
77c5e5f
c0c54c2
0bb42c3
34b0f27
985bb6d
7aaa9f9
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -309,6 +309,18 @@ def test_unsupported_family_fails(self):` |
| | | `        final = read_task_state(self.tmpdir, 'bad_family')` |
| | | `        self.assertIn(final.status, ('FAILED_RETRYABLE', 'FAILED_TERMINAL'))` |
| | | |
| | | `    def test_scratch_creation_failure_marks_failed(self):` |
| | | `        """If tempfile.mkdtemp fails (e.g., I/O error), the task is properly` |
| | | `        marked FAILED_RETRYABLE instead of being left stuck in RUNNING."""` |
| | | `        spec = _make_h2o_spec('io_fail')` |
| | | `        initialize_task(self.tmpdir, spec)` |
| | | `        state, token = self._claim('io_fail')` |
| | | `        with patch('arc.scripts.pipe_worker.tempfile.mkdtemp',` |
| | | `                   side_effect=OSError(5, 'Input/output error')):` |
| | | `            run_task(self.tmpdir, 'io_fail', state, 'test-worker', token)` |
| | | `        final = read_task_state(self.tmpdir, 'io_fail')` |
| | | `        self.assertIn(final.status, ('FAILED_RETRYABLE', 'FAILED_TERMINAL'))` |
| | | |
| | | `class TestWorkerLoop(unittest.TestCase):` |

Comment on lines +312 to +315
When a deterministic ESS error is detected on the exception path, this block overwrites `result['failure_class']` to `'ess_error'`, but the later `update_task_state(..., failure_class=...)` call still persists the original Python exception type. This makes `state.json` inconsistent with `result.json` and can mislead downstream diagnostics. Use the same `failure_class` value for the state update as the one written into `result.json` (e.g., `result['failure_class']`).
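
A minimal sketch of the suggested fix, not the PR's actual code: the failure class is computed once, stored in `result`, and that same value is reused for the state update. `ESSError`, `write_json`, and `record_failure` are hypothetical names invented for this illustration, and the write to `state.json` stands in for the real `update_task_state(...)` call, whose signature is not shown in this diff.

```python
import json
import tempfile
from pathlib import Path


class ESSError(Exception):
    """Hypothetical marker for a deterministic ESS failure (assumption)."""


def write_json(path: Path, payload: dict) -> None:
    """Hypothetical helper: serialize a dict to a JSON file."""
    path.write_text(json.dumps(payload, indent=2))


def record_failure(task_dir: Path, exc: Exception) -> dict:
    """Persist one failure_class value to both result.json and state.json."""
    result = {"failure_class": type(exc).__name__, "message": str(exc)}
    if isinstance(exc, ESSError):
        # Overwrite the Python exception type with the ESS classification.
        result["failure_class"] = "ess_error"
    write_json(task_dir / "result.json", result)

    # Reuse result["failure_class"] here instead of type(exc).__name__,
    # so the two files cannot disagree. In the real code this would be
    # the update_task_state(..., failure_class=...) call.
    write_json(task_dir / "state.json",
               {"failure_class": result["failure_class"]})
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        task_dir = Path(tmp)
        record_failure(task_dir, ESSError("SCF did not converge"))
        print((task_dir / "state.json").read_text())
```

Reading the value back from `result` rather than recomputing it keeps a single source of truth, so any later change to the classification logic reaches both files.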