investigate: source of .git/index.lock races during hook execution #19

@aaronsb

Symptom

During normal agent activity, `git commit` and other foreground git operations intermittently fail with:

```
fatal: Unable to create '/home/.../.git/index.lock': File exists.
```

The lock file appears briefly and then disappears. In practice this happens often.

What we know isn't the cause

Grepped the hook code paths that run on every interaction:

  • `check-bash-pre.sh` / `check-prompt.sh` / `check-state.sh` / `check-file-pre.sh` → all call `ways scan …`
  • `tools/ways-cli/src/cmd/scan/` has zero `Command::new("git")` invocations. scan never shells out to git.

The only runtime git calls found in ways-cli are in `cmd/show/metrics.rs` (`git_version`, `dirty_status_text`) — only reached from `ways show`, not from hooks.
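For anyone repeating the source check, a minimal sketch of the grep is below. The function name `find_git_spawns` is hypothetical, and the search only catches the literal `Command::new("git")` form; a spawn that builds the program name from a variable, or goes through a library, would not show up.

```shell
# Hypothetical helper: list Rust source lines that spawn git directly.
# $1 = directory to search (for example tools/ways-cli/src/cmd/scan/).
find_git_spawns() {
    # -F treats the pattern as a fixed string; /dev/null forces grep to
    # print the file name even when only one file is passed to it.
    find "$1" -name '*.rs' -exec grep -Fn 'Command::new("git")' /dev/null {} +
}
```

Running `find_git_spawns tools/ways-cli/src/cmd/scan/` and getting no output would match the "zero invocations" finding above.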

`sensor-git` runs git commands but only while `attend run` is active. Races were observed with no `attend` process running.

Mitigation already shipped

Commit 23a8555: `GIT_OPTIONAL_LOCKS=0` is exported from `require-ways.sh` (covers all sourced hooks) and added as `.env()` to the 3 runtime git call sites in `ways-cli` + `sensor-git`. With it set, git skips optional sub-operations that need a lock, such as the index refresh that `git status` performs as a side effect, so read-mostly operations stop taking `.git/index.lock`. The cost is an occasional stale stat cache, which is fine for every caller in this repo.

This is defense in depth, not a root-cause fix. If races persist after this ships, the real source is still out there.
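In shell terms the mitigation amounts to roughly the following. This is a sketch, not the contents of commit 23a8555; the wrapper name `git_nolock` is hypothetical and only illustrates the per-command form of the same setting.

```shell
# Session-wide: every git process started from this shell, or from
# anything that sources this file, inherits the setting.
export GIT_OPTIONAL_LOCKS=0

# Per-command equivalent using git's global option (hypothetical wrapper).
git_nolock() {
    git --no-optional-locks "$@"
}
```

The exported variable is the broader of the two, since it also reaches git processes started indirectly by child tools.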

What to investigate next

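  1. `fatrace` or `inotifywait` on `.git/index.lock`: capture every PID that creates the file, with timestamp, and compare against active tool calls. Note that `inotifywait` reports events but not the originating PID, and it has to watch the `.git` directory because the lock file does not exist between operations; `fatrace` reports the process and PID but normally needs root.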
  2. Wrap the ways binary with a tracing shim (`strace -f -e execve -p …` on a session) to catch any git subprocess that didn't show up in the source grep.
  3. Check for background daemons — `gitstatusd`, zsh's `vcs_info`, fsmonitor (`fsmonitor-watchman.sample` exists but as a sample). Shell prompts that integrate with git commonly take the lock.
  4. Check hook re-entrance — is `check-bash-pre.sh` firing while its own `ways scan` child process is still running? Overlapping hook invocations would multiply git activity.
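Steps 1 and 4 could be approached with something like the sketch below. All function names, the marker directory, and the log path are hypothetical. The process snapshot is best effort: the lock is short-lived, so the creating process may already be gone by the time `ps` runs, and the `grep` matches any command line containing "git".

```shell
# Sketch only: helper names and paths are assumptions, not repo code.

now() { date '+%Y-%m-%dT%H:%M:%S'; }

# Step 1: log a process snapshot if the lock exists right now.
# $1 = lock path, $2 = log file. Returns 1 if the lock was absent.
snapshot_if_locked() {
    [ -e "$1" ] || return 1
    {
        printf '%s lock present: %s\n' "$(now)" "$1"
        # The bracket keeps this grep itself out of the listing.
        ps -eo pid,ppid,args 2>/dev/null | grep '[g]it' || true
    } >> "$2"
}

# Step 1: react to lock creation. Requires inotify-tools.
# $1 = path to .git, $2 = log file. Runs until interrupted.
watch_lock() {
    inotifywait -m -q -e create --format '%f' "$1" |
    while read -r name; do
        if [ "$name" = "index.lock" ]; then
            printf '%s created: %s/index.lock\n' "$(now)" "$1" >> "$2"
            snapshot_if_locked "$1/index.lock" "$2" || true
        fi
    done
}

# Step 4: detect overlapping hook runs with an atomic mkdir marker.
# $1 = marker directory, $2 = log file. Returns 1 if another run holds it.
enter_hook() {
    mkdir "$1" 2>/dev/null && return 0
    printf '%s overlap: pid %s found %s\n' "$(now)" "$$" "$1" >> "$2"
    return 1
}

# Release the marker taken by a successful enter_hook.
leave_hook() { rmdir "$1" 2>/dev/null || true; }
```

A hook would call `enter_hook` at the top and `leave_hook` on exit, but only when `enter_hook` succeeded, otherwise it would release a marker owned by the other run. A hook that is killed mid-run leaves a stale marker behind, so any "overlap" lines in the log need to be read with that in mind.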

Acceptance

Either:

  • Identify the concrete caller and fix it (preferred), or
  • Confirm the mitigation is sufficient after a few days of normal use and close.
