fix(daemon): bind socket atomically to survive a concurrent-start herd #94
Merged
The herd guard probed then unlinked the socket before listen(), leaving a race: two daemons that both saw no socket reached listen() together and the loser crashed with EEXIST/EADDRINUSE (exit 1) instead of yielding. Seven such crashes appeared in one local log; a herd of 12 concurrent starts reproduces it in roughly 2 of 3 runs.

Listen first; on EADDRINUSE/EEXIST re-probe the socket and yield (exit 0) if a live daemon owns it, reclaiming only a confirmed-stale inode. A late starter can no longer unlink the winner's live socket, which previously split the teamMembers map across two daemons and broke cross-session nesting.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
'yield' reads as the JS keyword / 'yield the event loop' in a .ts file; the loser actually calls process.exit(0). Say so plainly, and note the loser's hook event still reaches the winning daemon over the socket.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Summary
The herd guard probed then unlinked the socket before listen(), leaving a race: two daemons that both saw no socket reached listen() together and the loser crashed with EEXIST/EADDRINUSE instead of yielding (7 such crashes locally).

Listen first; on EADDRINUSE/EEXIST re-probe the socket and yield if a live daemon owns it, reclaiming only a confirmed-stale inode. A late starter can no longer unlink the winner's live socket and split the team map.
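For readers who have not seen the diff, the listen-first ordering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names StartResult, probeLive, tryListen and startDaemon are hypothetical, the 500 ms probe timeout is an arbitrary choice, and the sketch returns "yielded" where the real daemon is said to call process.exit(0). It also reclaims a stale path with a plain unlink after a failed probe, which may be weaker than the PR's "confirmed-stale inode" check.

```typescript
import net from "node:net";
import fs from "node:fs";

// Hypothetical result type: the real daemon exits with code 0 instead of returning "yielded".
type StartResult = "listening" | "yielded";

// Hypothetical probe: resolves true if something accepts a connection on the socket path.
function probeLive(socketPath: string, timeoutMs = 500): Promise<boolean> {
  return new Promise((resolve) => {
    const client = net.createConnection(socketPath);
    const done = (alive: boolean) => {
      client.destroy();
      resolve(alive);
    };
    client.setTimeout(timeoutMs, () => done(false));
    client.once("connect", () => done(true));
    client.once("error", () => done(false));
  });
}

// Attempts to bind; resolves null on success or the listen error on failure.
function tryListen(server: net.Server, socketPath: string): Promise<NodeJS.ErrnoException | null> {
  return new Promise((resolve) => {
    const onError = (err: NodeJS.ErrnoException) => {
      server.removeListener("listening", onListening);
      resolve(err);
    };
    const onListening = () => {
      server.removeListener("error", onError);
      resolve(null);
    };
    server.once("error", onError);
    server.once("listening", onListening);
    server.listen(socketPath);
  });
}

const isTaken = (err: NodeJS.ErrnoException) =>
  err.code === "EADDRINUSE" || err.code === "EEXIST";

async function startDaemon(server: net.Server, socketPath: string): Promise<StartResult> {
  // Listen first: the bind itself decides the winner, with no probe/unlink beforehand.
  const first = await tryListen(server, socketPath);
  if (first === null) return "listening";
  if (!isTaken(first)) throw first;

  // The path is taken: yield if a live daemon answers on it.
  if (await probeLive(socketPath)) return "yielded";

  // Nothing answered, so treat the path as stale, remove it and retry once.
  fs.rmSync(socketPath, { force: true });
  const second = await tryListen(server, socketPath);
  if (second === null) return "listening";
  if (isTaken(second) && (await probeLive(socketPath))) return "yielded";
  throw second;
}
```

In this sketch a second caller on the same path gets "yielded" rather than an unhandled listen error. A small window remains between the failed probe and the unlink, so a production version would want a stricter staleness check than this one.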
Test plan
node --import tsx --test tests/daemon-startup-race.test.ts (herd of 12 concurrent starts: zero crashes, exactly one listener)
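The invariant that test checks (exactly one listener, no crashes) can be illustrated with a simplified in-process herd. This is only a sketch under assumptions: the actual test file is not shown here and presumably starts real daemon processes, whereas this version just races several net.Server instances on one socket path. The names bindOnce and herd are hypothetical.

```typescript
import net from "node:net";
import fs from "node:fs";
import os from "node:os";
import path from "node:path";

// Resolves with the server if it won the bind, or null if the path was already taken.
function bindOnce(socketPath: string): Promise<net.Server | null> {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once("error", (err: NodeJS.ErrnoException) => {
      if (err.code === "EADDRINUSE" || err.code === "EEXIST") resolve(null);
      else reject(err);
    });
    server.once("listening", () => resolve(server));
    server.listen(socketPath);
  });
}

// Starts `size` binders against one fresh socket path and counts winners and losers.
async function herd(size: number): Promise<{ listeners: number; losers: number }> {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "herd-"));
  const socketPath = path.join(dir, "daemon.sock");
  const results = await Promise.all(
    Array.from({ length: size }, () => bindOnce(socketPath)),
  );
  const winners = results.filter((s): s is net.Server => s !== null);
  for (const s of winners) s.close();
  fs.rmSync(dir, { recursive: true, force: true });
  return { listeners: winners.length, losers: size - winners.length };
}
```

Calling herd(12) should report one listener and eleven losers, because no caller unlinks the path before binding. The in-process version cannot reproduce the original cross-process crash; it only shows the property the test asserts.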