fix: prevent Linux gateway mis-kill and calibration data resurrection #330
Closed
cursor[bot] wants to merge 1 commit into
Conversation
Linux cleanup_zombie_gateway_processes previously used fuser + kill -9 on every PID bound to the gateway port, with no cmdline or /health checks. This could terminate unrelated services or healthy Gateways during restart timeouts. Align Linux with the Windows strategy: verify openclaw gateway cmdline, retry /health, only kill unresponsive zombies, and adopt healthy external instances.

Calibration inherit mode preferred the richer openclaw.json.bak over the current file. Because every write copies the previous config to .bak, intentional removals (providers/channels) could be resurrected on the next calibration. Prefer current whenever it is non-empty; only fall back to backup when current is effectively empty.

Add regression tests for calibration source selection and mirror the fix in dev-api.js.

Co-authored-by: 晴天 <1186258278@users.noreply.github.com>
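As a rough illustration of the calibration source-selection rule described above, here is a minimal Rust sketch. The `ConfigSummary` type, the `richness` scoring, and the `pick_calibration_source` name are assumptions made for illustration only; the real `select_calibration_source` may parse and score configs differently.

```rust
/// Hypothetical stand-in for a parsed config: only the counts that feed the score.
#[derive(Debug, Clone, PartialEq)]
struct ConfigSummary {
    providers: usize,
    channels: usize,
}

/// Assumed "richness" score; 0 means the config is effectively empty.
fn richness(cfg: &ConfigSummary) -> usize {
    cfg.providers + cfg.channels
}

#[derive(Debug, PartialEq)]
enum CalibrationSource {
    Current,
    Backup,
    Neither,
}

/// Prefer the current config whenever it is non-empty; fall back to the
/// backup only when the current score is 0 (or the file is missing).
fn pick_calibration_source(
    current: Option<&ConfigSummary>,
    backup: Option<&ConfigSummary>,
) -> CalibrationSource {
    let current_score = current.map(richness).unwrap_or(0);
    let backup_score = backup.map(richness).unwrap_or(0);

    if current_score > 0 {
        // A richer backup is deliberately ignored here, so entries the
        // user removed from the current file are not resurrected.
        CalibrationSource::Current
    } else if backup_score > 0 {
        CalibrationSource::Backup
    } else {
        CalibrationSource::Neither
    }
}
```

With a slimmed-down current config and a richer backup, this sketch returns `Current`, so removed entries stay removed; `Backup` is chosen only when the current score is 0.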
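Contributor

Absorbed and released on main through our own commit: 0a65ea7 / v0.18.5.

Covered:
- Linux Gateway cleanup now verifies the process command line and /health first, avoiding blind kills via fuser
- Calibration inherit prefers the current non-empty config, so .bak no longer restores configuration the user deliberately deleted
- Regression tests added on both the Web and Rust sides

Both CI and the release workflow have passed. Release page: https://github.com/qingchencloud/clawpanel/releases/tag/v0.18.5

Keeping this draft PR open would create duplication, so closing it for now.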
Contributor

Absorbed and released by 0a65ea7 / v0.18.5 on main; closing the duplicate draft PR.
Summary
A daily critical-bug scan found two high-severity correctness issues; this PR fixes both with minimal, targeted changes.
Bug 1: Linux gateway zombie cleanup kills unrelated processes

Impact: On Linux, `cleanup_zombie_gateway_processes` used `fuser` + `kill -9` on every PID bound to the gateway port, with no command-line or `/health` checks. During gateway restart timeouts or stop failures, this could terminate unrelated services sharing the port, or kill a healthy Gateway that was still starting up.

Root cause: The Linux implementation was added as a blunt port-based kill, while Windows was later hardened (#244, #446) with cmdline verification, `/health` retries, and adoption of healthy external instances. Linux was never updated.

Fix: Port the Windows safety strategy to Linux: find listeners via `ss`/`lsof`, verify the `openclaw gateway` cmdline from `/proc`, retry `/health` before killing, and adopt healthy external PIDs.

Bug 2: Calibration inherit resurrects deleted config from `.bak`

Impact: Running inherit-mode calibration after intentionally removing providers/channels could restore deleted settings, silently undoing user edits.

Root cause: `select_calibration_source` preferred whichever of `openclaw.json` or `openclaw.json.bak` had a higher "richness" score. Every `write_openclaw_config` call copies the previous file to `.bak`, so after a slimming edit the backup is always richer than current.

Fix: Prefer current whenever it is non-empty; only fall back to backup when current is effectively empty (score 0). Mirrored in `scripts/dev-api.js`.

Validation
- Regression tests for `select_calibration_source` (richer-backup vs empty-current cases)
- `cargo test` not run in this environment (missing GTK system deps); logic mirrors the proven Windows gateway cleanup path

Files changed

- `src-tauri/src/commands/service.rs`
- `src-tauri/src/commands/config.rs`
- `scripts/dev-api.js`
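The decision flow of the Bug 1 fix (verify the cmdline, retry `/health`, kill only unresponsive zombies, adopt healthy instances) might be sketched roughly as follows. This is not the code in `service.rs`: the `Action` enum, all function names, and the cmdline matching rule are hypothetical, and the health probe is passed in as a closure so that no HTTP client or real process is assumed.

```rust
#[derive(Debug, PartialEq)]
enum Action {
    LeaveAlone, // not an openclaw gateway: never touch it
    Adopt,      // gateway that answers /health: keep it running
    Kill,       // gateway that stays unresponsive: treat as a zombie
}

/// `/proc/<pid>/cmdline` stores the arguments separated by NUL bytes.
fn parse_cmdline(raw: &[u8]) -> Vec<String> {
    raw.split(|b| *b == 0)
        .filter(|arg| !arg.is_empty())
        .map(|arg| String::from_utf8_lossy(arg).into_owned())
        .collect()
}

/// Assumed matching rule: some argument mentions "openclaw" and a later
/// argument is exactly "gateway". The real check may be stricter.
fn is_openclaw_gateway(args: &[String]) -> bool {
    match args.iter().position(|a| a.contains("openclaw")) {
        Some(i) => args[i + 1..].iter().any(|a| a == "gateway"),
        None => false,
    }
}

/// Calls `probe` up to `attempts` times and stops at the first success.
/// A real implementation would presumably sleep between attempts.
fn healthy_after_retries(mut probe: impl FnMut() -> bool, attempts: u32) -> bool {
    (0..attempts).any(|_| probe())
}

/// Decides what to do with one process listening on the gateway port.
fn decide(raw_cmdline: &[u8], probe: impl FnMut() -> bool, attempts: u32) -> Action {
    let args = parse_cmdline(raw_cmdline);
    if !is_openclaw_gateway(&args) {
        return Action::LeaveAlone;
    }
    if healthy_after_retries(probe, attempts) {
        Action::Adopt
    } else {
        Action::Kill
    }
}
```

Keeping the decision separate from the side effects (listing listeners, probing `/health`, sending signals) is a design choice of this sketch; it makes the kill/adopt rule testable without a GTK or network environment.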