kg-fuse: harden service management — busy-mount recovery + version skew from stale unit path #452

@aaronsb

Summary

Two related rough edges in kg-fuse service/mount management, both hit while upgrading the driver from 0.12.0 to 0.12.1 during v0.14.0 release verification. Neither affects data, but each leaves the user with a broken or silently stale mount that needs manual recovery.

1. reset doesn't self-heal a busy mount → strands a "transport endpoint not connected" orphan

Observed: Running kg-fuse reset while a shell's cwd was inside the mount (~/Knowledge/...) broke the mount:

Stopping Knowledge Graph FUSE Driver...
  /home/aaron/Knowledge: fusermount failed: fusermount: failed to unmount /home/aaron/Knowledge: Device or resource busy
Started Knowledge Graph FUSE Driver.
  /home/aaron/Knowledge has orphaned mount — run kg-fuse repair first
  All mounts already running.

Result: every access to the mount returns "Transport endpoint is not connected", and the service exits 0 as if nothing were wrong.

Recovery required (manual, multi-step):

  1. cd out of the mount (otherwise it stays busy)
  2. kg-fuse repair — and its fusermount -u only succeeds once nothing holds the mount

Proposed: When an unmount hits EBUSY, fall back to a lazy unmount (fusermount -uz) in reset/repair. Lazy unmount detaches the mountpoint immediately and lets the kernel release it once the last reference goes away, so a shell sitting in the directory can't strand the driver. Print a notice when a lazy unmount was used so the behavior isn't silent.
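The fallback proposed above could be sketched roughly as follows. This is a minimal illustration, not kg-fuse's actual code: the function name `unmount_with_lazy_fallback` and the injectable `run` parameter are hypothetical, and detecting EBUSY by matching "busy" in stderr assumes the English message shown in the log above (a real implementation might force `LC_ALL=C` or check the mount table instead).

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical runner type so the logic can be exercised without a real mount.
Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def _run(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    """Default runner: execute the command and capture its output as text."""
    return subprocess.run(list(cmd), capture_output=True, text=True)


def unmount_with_lazy_fallback(mountpoint: str, run: Runner = _run) -> str:
    """Unmount `mountpoint`; fall back to a lazy unmount if it is busy.

    Returns "clean" for a normal unmount and "lazy" when the fallback was used.
    Raises RuntimeError for any other failure.
    """
    first = run(["fusermount", "-u", mountpoint])
    if first.returncode == 0:
        return "clean"

    # Assumption: a busy mount is recognised by fusermount's stderr text.
    if "busy" not in (first.stderr or "").lower():
        raise RuntimeError(f"unmount failed: {first.stderr.strip()}")

    # -z detaches the mountpoint now; the kernel cleans up once the last
    # reference (e.g. a shell's cwd) goes away.
    lazy = run(["fusermount", "-u", "-z", mountpoint])
    if lazy.returncode != 0:
        raise RuntimeError(f"lazy unmount failed: {lazy.stderr.strip()}")

    # Make the fallback visible rather than silent.
    print(f"notice: {mountpoint} was busy; used lazy unmount (fusermount -uz)")
    return "lazy"
```

Returning a distinct "lazy" status lets the caller decide how loudly to report it, which keeps the behavior from being silent.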

2. systemd unit pointed at the dev venv → update upgrades pipx but the service keeps running old code

Observed: The user systemd unit's ExecStart referenced the development virtualenv binary:

/home/aaron/Projects/ai/knowledge-graph-system/fuse/.venv/bin/kg-fuse mount --foreground

rather than the pipx-installed binary (~/.local/bin/kg-fuse). Consequence: kg-fuse update upgraded the pipx install (0.12.0 → 0.12.1) and reported success, but the service kept running the old code path. The upgrade only took effect after kg-fuse repair rewrote the unit:

STALE UNIT: systemd unit references wrong path
Update to /home/aaron/.local/bin/kg-fuse? [Y/n]  Installed and enabled ...

Why it matters: update says "Updated: 0.12.0 -> 0.12.1" and the user reasonably believes the running driver is upgraded — but there's silent skew between what update touched and what the service executes. The fix only happened by luck because an unrelated repair ran.

Proposed:

  • kg-fuse update should detect when the active unit's ExecStart doesn't match the binary it just upgraded, and either repair the unit or warn loudly ("service still running old binary at <path>; run kg-fuse repair").
  • Consider restarting the service after a successful update so new code actually loads (today update leaves the old daemon running).
  • The unit should be installed against a stable path (the pipx/~/.local/bin binary) from the start; repair already does this when invoked — init/update should keep it that way.
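One way the ExecStart check from the first bullet might look, as a sketch under stated assumptions: the helper names `unit_exec_binary` and `unit_is_stale` are hypothetical, the unit file text is assumed to be already read into a string, and only the first `ExecStart=` line is considered. Paths are compared literally (no symlink resolution) on the assumption that the unit should reference the stable `~/.local/bin` path itself.

```python
import shlex
from pathlib import Path
from typing import Optional


def unit_exec_binary(unit_text: str) -> Optional[str]:
    """Return the executable path from the first ExecStart= line, if any."""
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line.startswith("ExecStart="):
            continue
        # systemd allows special prefixes such as "-", "@", "+", "!" and ":"
        # before the executable path; strip them before parsing.
        value = line[len("ExecStart="):].lstrip("@-:+!")
        parts = shlex.split(value)
        return parts[0] if parts else None
    return None


def unit_is_stale(unit_text: str, expected_binary: str) -> bool:
    """True when the unit does not launch `expected_binary`."""
    found = unit_exec_binary(unit_text)
    if found is None:
        # No usable ExecStart: treat as stale so the caller repairs or warns.
        return True
    return Path(found).expanduser() != Path(expected_binary).expanduser()
```

For the unit described above, `unit_is_stale(unit_text, "/home/aaron/.local/bin/kg-fuse")` would report the dev-venv path as stale, at which point update could rewrite the unit or print the warning.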

Bonus UX note

kg-fuse repair's [Y/n] prompts are interactive-only; running it non-interactively (script/agent) prints the prompts and reports "Addressed N issues" without actually applying the fixes. A --yes/-y flag (and/or detecting non-TTY and erroring clearly) would make repair scriptable.
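A possible shape for the prompt handling, sketched with hypothetical names: `confirm`, `assume_yes` (standing in for a proposed --yes/-y flag), and the injectable `stdin`/`ask` parameters are illustrative and not part of kg-fuse today. The idea is to never print a prompt that cannot be answered: either the flag applies the fix, or a non-TTY run fails clearly.

```python
import sys
from typing import Callable, Optional


def confirm(
    prompt: str,
    assume_yes: bool = False,
    stdin=None,
    ask: Callable[[str], str] = input,
) -> bool:
    """Ask a [Y/n] question, honouring --yes and refusing to guess without a TTY."""
    if assume_yes:
        # Hypothetical --yes/-y flag: apply the fix without prompting.
        return True

    stream = sys.stdin if stdin is None else stdin
    if not stream.isatty():
        # Non-interactive run without --yes: fail loudly instead of printing
        # a prompt nobody can answer and reporting the issue as addressed.
        raise SystemExit("repair needs confirmation but stdin is not a TTY; re-run with --yes")

    answer = ask(f"{prompt} [Y/n] ").strip().lower()
    # Empty input takes the capitalised default, Y.
    return answer in ("", "y", "yes")
```

With this shape, "Addressed N issues" would only count fixes for which `confirm` actually returned True.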

Environment

Acceptance criteria

  • kg-fuse reset/repair recover a busy mount cleanly (lazy-unmount fallback) instead of leaving a transport-endpoint orphan
  • kg-fuse update guarantees the running service uses the upgraded binary, or warns clearly when it can't
  • kg-fuse repair is usable non-interactively (e.g. --yes)

Labels: enhancement, ux
