
fix: correctly attribute AI lines in merge commit stats #913

Open
svarlamov wants to merge 1 commit into main from devin/1775066168-fix-merge-commit-stats

@svarlamov svarlamov commented Apr 1, 2026

Summary

Fixes #910. When AI resolves a merge conflict, git ai blame correctly attributes lines to the AI, but git ai stats head was showing 100% human / 0% AI. Three root causes:

  1. accepted_lines_from_attestations() early-returned (0, empty) for all merge commits, unconditionally skipping attestation matching.
  2. stats_for_commit_stats() set added_lines_by_file to an empty HashMap for merge commits, so there were no lines to match against attestations even if the early return were removed.
  3. git show --numstat uses combined-diff format for merge commits, which only shows files differing from ALL parents (conflict resolutions), missing cleanly-merged changes. Replaced with git diff commit^1 commit --numstat for merge commits.
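
To illustrate root causes 1 and 2, here is a minimal sketch of attestation matching. The real `accepted_lines_from_attestations()` is not shown in this description, so the function name `count_accepted_lines`, the map types, and the line-number representation below are all assumptions made for illustration. The point is that an empty `added_lines_by_file` produces zero AI lines no matter what the attestations say.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified stand-in for attestation matching (not the
/// project's actual code): count the added lines in each file that an
/// attestation marks as AI-authored.
fn count_accepted_lines(
    added_lines_by_file: &HashMap<String, Vec<u32>>,
    ai_lines_by_file: &HashMap<String, HashSet<u32>>,
) -> u32 {
    let mut accepted = 0;
    for (file, added) in added_lines_by_file {
        // Only files that appear in both maps can contribute AI lines.
        if let Some(ai_lines) = ai_lines_by_file.get(file) {
            accepted += added
                .iter()
                .filter(|line| ai_lines.contains(*line))
                .count() as u32;
        }
    }
    accepted
}
```

With an empty `added_lines_by_file` (the old merge-commit behavior) this returns 0 even when attestations exist. Once the map is filled from the first-parent diff, attested lines are counted.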

What changed

  • Removed the is_merge_commit parameter and early return from accepted_lines_from_attestations()
  • added_lines_by_file is now always computed by diffing against the first parent (works for both regular and merge commits)
  • New get_git_diff_stats_first_parent() helper uses git diff commit^1..commit --numstat
  • Extracted parse_numstat_output() to share parsing logic between the two numstat paths
  • Two new TestRepo integration tests (conflicting + non-conflicting merge with AI checkpoint)
  • Updated existing unit tests to reflect corrected behavior
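
For reviewers unfamiliar with `--numstat`, the following is a rough sketch of what a shared parser and a first-parent argument builder could look like. It is not the code in this PR: the `NumstatEntry` struct, the `first_parent_numstat_args` helper, and the choice to skip binary files are assumptions. The only facts relied on are that `--numstat` emits tab-separated `added`, `deleted`, `path` lines, and that binary files are reported with `-` in place of the counts.

```rust
/// Per-file counts from one line of `--numstat` output (hypothetical type).
#[derive(Debug, PartialEq)]
struct NumstatEntry {
    path: String,
    added: u32,
    deleted: u32,
}

/// Parse `added<TAB>deleted<TAB>path` lines. Lines that do not have three
/// fields, or whose counts are not numbers (binary files use "-"), are skipped.
fn parse_numstat_output(output: &str) -> Vec<NumstatEntry> {
    output
        .lines()
        .filter_map(|line| {
            let mut fields = line.splitn(3, '\t');
            let added = fields.next()?;
            let deleted = fields.next()?;
            let path = fields.next()?;
            Some(NumstatEntry {
                path: path.to_string(),
                added: added.parse().ok()?,
                deleted: deleted.parse().ok()?,
            })
        })
        .collect()
}

/// Arguments for diffing a commit against its first parent (hypothetical
/// helper; mirrors `git diff commit^1 commit --numstat`).
fn first_parent_numstat_args(commit: &str) -> Vec<String> {
    vec![
        "diff".to_string(),
        "--numstat".to_string(),
        format!("{commit}^1"),
        commit.to_string(),
    ]
}
```

Routing both numstat paths through a single parser means the regular and merge code paths cannot drift apart in how they treat edge cases such as binary files.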

Review & Testing Checklist for Human

  • Double-counting in range stats: Merge commits now report git_diff_added_lines > 0 for all changes vs first parent. If git ai stats <range> sums across both the feature branch commits AND the merge commit, the same lines could be counted twice. Verify how range stats aggregation works and whether this is an issue.
  • Octopus merges: The fix uses commit^1 (first parent). Verify this is correct for octopus merges (3+ parents) — should it always be the first parent?
  • Behavioral change for merge commits without AI notes: Previously, merge commits with no authorship log showed human_additions=0, git_diff_added_lines=0. Now they show the actual first-parent diff (e.g., human_additions=1, git_diff_added_lines=1). Confirm this is the desired behavior.
  • Manual verification: On a real repo with AI-resolved merge conflicts (like the one in issue #910), run git ai stats head and confirm the output now shows AI attribution correctly.
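
The double-counting concern in the first checklist item can be made concrete with a small sketch. It does not describe how `git ai stats <range>` actually aggregates, which is exactly what needs verifying. The `CommitSummary` struct and both functions are hypothetical. They only show that if a range total is a plain sum over every commit, a merge commit's first-parent diff repeats lines already counted on the feature branch.

```rust
/// Hypothetical per-commit summary; field names are assumptions.
struct CommitSummary {
    parent_count: usize,
    git_diff_added_lines: u32,
}

/// Naive aggregation: sum added lines over every commit in the range.
fn naive_range_total(commits: &[CommitSummary]) -> u32 {
    commits.iter().map(|c| c.git_diff_added_lines).sum()
}

/// One possible mitigation: leave merge commits (2+ parents) out of the sum.
/// This also drops lines written while resolving conflicts, so it is a
/// trade-off rather than a fix.
fn range_total_skipping_merges(commits: &[CommitSummary]) -> u32 {
    commits
        .iter()
        .filter(|c| c.parent_count < 2)
        .map(|c| c.git_diff_added_lines)
        .sum()
}
```

For example, take a feature commit that adds 5 lines, followed by a clean merge whose first-parent diff shows the same 5 lines. The naive total is 10, and the merge-skipping total is 5.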

Notes

  • The integration tests simulate the real-world flow: git merge --no-commit -X theirs → git ai checkpoint mock_ai <file> → stage_all_and_commit. This mirrors what happens when an AI tool resolves conflicts and git-ai's checkpoint mechanism tracks it.
  • All 2562 integration tests + 1249 unit tests pass. Lint and format checks are clean.

Link to Devin session: https://app.devin.ai/sessions/6bd8df4f03d94a51a75575a7769f9b4f
Requested by: @svarlamov



Fixes #910

When AI resolves a merge conflict, `git ai blame` correctly attributes
lines to the AI, but `git ai stats head` was incorrectly showing 100%
human / 0% AI additions. Three root causes:

1. `accepted_lines_from_attestations()` had an early return that always
   returned (0, empty) for merge commits, skipping attestation matching.

2. `stats_for_commit_stats()` set `added_lines_by_file` to an empty
   HashMap for merge commits, so even without the early return there
   were no lines to match against attestations.

3. `git show --numstat` uses the combined-diff format for merge commits,
   which only shows files that differ from ALL parents (conflict
   resolutions). This misses cleanly-merged changes. The fix uses
   `git diff commit^1 commit --numstat` to diff against the first
   parent instead.

Changes:
- Remove `is_merge_commit` parameter and early return from
  `accepted_lines_from_attestations()`
- Always compute `added_lines_by_file` by diffing against the first
  parent (works for both regular and merge commits)
- Add `get_git_diff_stats_first_parent()` helper for merge commit
  numstat via `git diff commit^1 commit --numstat`
- Extract `parse_numstat_output()` to share parsing logic between
  the two numstat code paths
- Add two TestRepo-based integration tests that replicate the bug
- Update unit tests to reflect the corrected behavior

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.
