feat: add cloud_default_ai_attribution feature flag for cloud env attribution #790

Open
svarlamov wants to merge 17 commits into main from devin/1774410513-cloud-default-ai-attribution

Conversation

@svarlamov (Member) commented Mar 25, 2026

Summary

Adds a new cloud_default_ai_attribution feature flag that, when enabled, attributes human checkpoint changes to the most recent AI checkpoint's agent identity instead of marking them as human. This is a more performant alternative to bash tool call tracking for cloud coding agent platforms.

Behavior when enabled for human checkpoints:

  1. The checkpoint kind is promoted from Human to AiAgent early in get_checkpoint_entries(), so the working log and all downstream processing treat it as a genuine AI checkpoint.
  2. If a previous AI checkpoint exists → use its agent_id with the current timestamp
  3. If no previous AI checkpoint → detect the cloud environment tool (cursor-agent, claude-web, devin, etc.) via env vars / filesystem
  4. If no cloud env detected → fall back to "unknown" tool with "cloud-default" id
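
The promotion and fallback order above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: `resolve_cloud_attribution`, the reduced `AgentId` fields, and the `"cloud-default"` id used in step 3 are assumptions made for the sketch, and the timestamp handling from step 2 is left out. The real logic lives in `resolve_cloud_attribution_for_human()` and `get_checkpoint_entries()`.

```rust
// Hypothetical, simplified stand-ins for the real types in checkpoint.rs.
#[derive(Debug, Clone, PartialEq)]
enum CheckpointKind {
    Human,
    AiAgent,
}

#[derive(Debug, Clone, PartialEq)]
struct AgentId {
    tool: String,
    id: String,
    model: String,
}

/// Returns the effective kind and the agent to attribute the changes to.
/// `previous_ai` is the most recent AI checkpoint's agent, if any;
/// `cloud_env_tool` is the detected cloud tool name, if any.
fn resolve_cloud_attribution(
    kind: CheckpointKind,
    flag_enabled: bool,
    previous_ai: Option<&AgentId>,
    cloud_env_tool: Option<&str>,
) -> (CheckpointKind, Option<AgentId>) {
    // Step 1: only human checkpoints are promoted, and only when the flag is on.
    if kind != CheckpointKind::Human || !flag_enabled {
        return (kind, None);
    }
    let agent = match (previous_ai, cloud_env_tool) {
        // Step 2: reuse the previous AI checkpoint's agent identity.
        (Some(prev), _) => prev.clone(),
        // Step 3: no previous AI checkpoint, so use the detected cloud tool.
        // The id value here is an assumption of this sketch.
        (None, Some(tool)) => AgentId {
            tool: tool.to_string(),
            id: "cloud-default".to_string(),
            model: "unknown".to_string(),
        },
        // Step 4: nothing detected, fall back to "unknown" / "cloud-default".
        (None, None) => AgentId {
            tool: "unknown".to_string(),
            id: "cloud-default".to_string(),
            model: "unknown".to_string(),
        },
    };
    (CheckpointKind::AiAgent, Some(agent))
}
```

With the flag off, or for a checkpoint that is already `AiAgent`, the function returns the kind unchanged and no agent, which mirrors the "unless enabled" wording above.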

Auto-detection: The flag auto-enables when is_in_background_agent() returns true (existing cloud env detection), unless explicitly overridden via GIT_AI_CLOUD_DEFAULT_AI_ATTRIBUTION env var or config file.
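
As a rough model of the auto-enable rule, an explicit setting wins and environment detection is only the default. The function names and the accepted value spellings below are assumptions of this sketch; the actual logic is in `build_feature_flags()` in `src/config.rs`.

```rust
/// Parses an explicit override such as the value of
/// GIT_AI_CLOUD_DEFAULT_AI_ATTRIBUTION. The accepted spellings are an
/// assumption of this sketch, not taken from the PR.
fn parse_explicit_override(raw: Option<&str>) -> Option<bool> {
    match raw.map(|v| v.trim().to_ascii_lowercase()).as_deref() {
        Some("1") | Some("true") => Some(true),
        Some("0") | Some("false") => Some(false),
        _ => None,
    }
}

/// An explicit override (env var or config file) takes precedence;
/// otherwise the flag follows background-agent detection.
fn cloud_attribution_enabled(explicit: Option<bool>, in_background_agent: bool) -> bool {
    explicit.unwrap_or(in_background_agent)
}
```

For example, `cloud_attribution_enabled(parse_explicit_override(Some("false")), true)` evaluates to `false`: the opt-out holds even inside a detected cloud environment.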

Files changed:

  • src/utils.rs: new centralized detect_background_agent_tool() -> Option<String> that is the single source of truth for all agent-name ↔ env-var mappings; is_in_background_agent() now delegates to it
  • src/feature_flags.rs: new flag definition (debug=false, release=false)
  • src/config.rs: auto-enable logic in build_feature_flags()
  • src/commands/checkpoint.rs: resolve_cloud_attribution_for_human(), run_with_captured_cloud_env_tool(), modified get_checkpoint_entries() to return a 4-tuple (entries, file_stats, effective_kind, Option<AgentId>), early kind promotion from Human → AiAgent, guard in resolve_live_checkpoint_execution(), captured_cloud_env_tool propagation for both captured-async and live daemon paths
  • src/commands/git_ai_handlers.rs: wrapper detects cloud env tool and passes it in LiveCheckpointRunRequest
  • src/daemon/control_api.rs: captured_cloud_env_tool field on LiveCheckpointRunRequest
  • src/daemon.rs: apply_checkpoint_side_effect calls run_with_captured_cloud_env_tool to propagate the detected tool
  • tests/integration/repos/test_repo.rs: inject_feature_flag_env() wired into both configure_command_env and configure_git_ai_env; set_feature_flags() now syncs flags to config file and restarts dedicated daemons
  • tests/integration/cloud_attribution.rs: 11 integration tests (all using dedicated daemons)
  • tests/integration/main.rs, performance.rs: module registration + struct field addition
  • tests/daemon_mode.rs: added captured_cloud_env_tool: None to existing PreparedCheckpointManifest literals

Updates since initial revision

1–7. (See git history for earlier iterations: Devin Review fixes, rebase against PR #743, per-file fast path fixes, daemon mode config sync, dedicated daemon restart, captured-async env propagation, live checkpoint path env propagation.)

  8. Kind promotion refactor (cloud-attributed checkpoints use AiAgent kind):

    • Previously, cloud-attributed checkpoints kept CheckpointKind::Human and set agent_id to the cloud agent. This created a semantic mismatch: downstream code (post-commit processing, virtual attribution, metrics) treats Human checkpoints specially (skips DB upsert, skips metrics, filters attributions differently).
    • Now, get_checkpoint_entries() resolves an effective_kind early: if cloud attribution is active, the kind is promoted to AiAgent. This effective_kind is passed to per-file processing and returned to execute_resolved_checkpoint, which uses it for Checkpoint::new() and all downstream logic.
    • Removed 2 scattered cloud_default_ai_attribution guards from the per-file fast paths in get_checkpoint_entry_for_file(). They are no longer needed because kind == CheckpointKind::Human is false when the effective kind is AiAgent.
    • 1 guard remains in resolve_live_checkpoint_execution() (the top-level pre-commit early exit). It is still needed because it runs before get_checkpoint_entries(), where the kind promotion happens.
    • Tests updated to assert AiAgent kind (not Human) for cloud-attributed checkpoints.
  9. Consolidated cloud env detection into a single source of truth:

    • Merged the duplicated detect_cloud_env_tool() (checkpoint.rs) and is_in_background_agent() (utils.rs) into a single detect_background_agent_tool() -> Option<String> in utils.rs.
    • is_in_background_agent() now delegates to detect_background_agent_tool().is_some().
    • All callers use the centralized function with .unwrap_or_else(|| "unknown".to_string()) when a fallback is needed.
    • Also folded in the GIT_AI_CLOUD_AGENT == "1" → "cloud-agent" mapping that was previously only in is_in_background_agent().
    • Detection priority: CLOUD_AGENT_TOOL (explicit) > CURSOR_AGENT > CLAUDE_CODE_REMOTE > GIT_AI_CLOUD_AGENT (explicit opt-in) > CLOUD_AGENT_* prefix > /opt/.devin path.
  10. Fixed explicit opt-out bypass:

    • Both wrapper paths (live daemon + captured-async) now only send captured_cloud_env_tool when the feature flag is on. Previously an else if branch sent the detected tool even when the flag was off, which, combined with the downstream pre_detected_cloud_env_tool.is_some() check, overrode explicit user opt-out.
  11. Fixed cloud-attributed checkpoint log message:

    • agent_tool in execute_resolved_checkpoint now falls back to cloud_agent_id.tool when agent_run_result is absent (which is the case for promoted human checkpoints). Previously, the log message fell back to the human author name, producing misleading output like "ai_agent John changed 1 file(s)..." instead of "ai_agent cursor-agent changed 1 file(s)...".
  12. Fixed daemon pre-commit early-exit ignoring captured_cloud_env_tool:

    • resolve_live_checkpoint_execution now accepts captured_cloud_env_tool: Option<&str> and includes && captured_cloud_env_tool.is_none() in the pre-commit skip condition. Previously, when a daemon's frozen Config didn't have the flag enabled (started before cloud env was established), the early-exit would silently drop the checkpoint before captured_cloud_env_tool was ever consulted in execute_resolved_checkpoint. Now the wrapper's signal prevents the skip.
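
The detection priority listed above could look roughly like this when written as a pure function. It is a sketch under stated assumptions: `detect_tool` and `env_of` are hypothetical helpers, the environment is passed in as a map so the logic is testable, presence checks (rather than value checks) are assumed for `CURSOR_AGENT` and `CLAUDE_CODE_REMOTE`, and the tool names returned for each variable (including `"cloud-agent"` for the `CLOUD_AGENT_*` prefix) are guesses based on the names mentioned in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical model of the priority order; the real function is
/// detect_background_agent_tool() in src/utils.rs and reads the process env.
fn detect_tool(env: &HashMap<String, String>, devin_path_exists: bool) -> Option<String> {
    // 1. Explicit tool name wins.
    if let Some(tool) = env.get("CLOUD_AGENT_TOOL").filter(|v| !v.is_empty()) {
        return Some(tool.clone());
    }
    // 2-3. Vendor-specific variables (name mapping assumed).
    if env.contains_key("CURSOR_AGENT") {
        return Some("cursor-agent".to_string());
    }
    if env.contains_key("CLAUDE_CODE_REMOTE") {
        return Some("claude-web".to_string());
    }
    // 4. Explicit opt-in.
    if env.get("GIT_AI_CLOUD_AGENT").map(String::as_str) == Some("1") {
        return Some("cloud-agent".to_string());
    }
    // 5. Any CLOUD_AGENT_* variable (returned name assumed).
    if env.keys().any(|k| k.starts_with("CLOUD_AGENT_")) {
        return Some("cloud-agent".to_string());
    }
    // 6. Filesystem marker, i.e. /opt/.devin exists.
    if devin_path_exists {
        return Some("devin".to_string());
    }
    None
}

/// Small helper to build an env map from string pairs.
fn env_of(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}
```

A caller that needs a name in every case would apply `.unwrap_or_else(|| "unknown".to_string())` to the result, while a boolean check would use `.is_some()`.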

Review & Testing Checklist for Human

  • Verify effective_kind promotion doesn't trigger unintended AI-only paths: With the kind now AiAgent, cloud-attributed checkpoints flow through ALL AI checkpoint code paths. Confirm: (a) the prompt DB upsert at L793 is safe (requires transcript.is_some() which should be None for promoted checkpoints), (b) agent_usage metrics emission at L826 is desired behavior, (c) virtual attribution filtering correctly handles these checkpoints.
  • Verify explicit opt-out works end-to-end in daemon mode: Set GIT_AI_CLOUD_DEFAULT_AI_ATTRIBUTION=false in a real cloud environment (where is_in_background_agent() would auto-enable the flag). Confirm human checkpoints remain Human kind with no agent_id. The wrapper should NOT send captured_cloud_env_tool when the flag is off.
  • Verify detect_background_agent_tool() OnceLock behavior: The detection is cached in a OnceLock<Option<String>>, meaning it's frozen after the first call within a process. Confirm this is acceptable — in particular, the daemon process calls it once at startup (returns None since daemon has no cloud env vars), but this is fine because daemon callers use pre_detected_cloud_env_tool from the wrapper instead.
  • Test end-to-end in a real cloud environment (e.g., Devin, Cursor background agent) to confirm the auto-detection and attribution flow works correctly. Verify the working log records ai_agent kind (not human) for cloud-attributed checkpoints, and that the log message shows the correct tool name.
  • Verify prepare_captured_checkpoint passes None correctly: The captured-async path passes None for captured_cloud_env_tool to resolve_live_checkpoint_execution. This is correct because captured checkpoints are for AI agent runs (not human), but confirm there's no edge case where a pre-commit captured human checkpoint with no AI edits would be silently dropped.
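
The OnceLock concern in the checklist can be illustrated with a small sketch. `DETECTED_TOOL` and `cached_tool` are hypothetical names, and the detection closure stands in for the real env and filesystem probing; the point is only that the first result is frozen for the lifetime of the process.

```rust
use std::sync::OnceLock;

// Process-wide cache: initialized at most once, then read-only.
static DETECTED_TOOL: OnceLock<Option<String>> = OnceLock::new();

/// Runs `detect` only on the first call; later calls return the cached
/// value and never invoke their closure.
fn cached_tool(detect: impl FnOnce() -> Option<String>) -> Option<&'static str> {
    DETECTED_TOOL.get_or_init(detect).as_deref()
}
```

If the first call happens in a process without cloud env vars (as described for the daemon), the cache holds `None`, and a later `cached_tool(|| Some("devin".to_string()))` still yields `None`. That is why the daemon path has to rely on the value passed in by the wrapper.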

Notes

  • detect_background_agent_tool() is the single source of truth for all agent name ↔ env var mappings, used by both is_in_background_agent() (bool) and cloud tool detection (tool name string).
  • The fallback AgentId uses model: "unknown" — when reusing a previous AI checkpoint's agent_id, the model field preserves whatever that checkpoint had.
  • Detection priority order: CLOUD_AGENT_TOOL (explicit) > CURSOR_AGENT > CLAUDE_CODE_REMOTE > GIT_AI_CLOUD_AGENT (explicit opt-in) > CLOUD_AGENT_* prefix > /opt/.devin path > "unknown".
  • The test_cloud_attribution_no_ai_checkpoint_uses_fallback test checks for non-empty tool name rather than asserting "unknown" specifically, because the CI/test environment may have /opt/.devin present.
  • Both captured_cloud_env_tool fields (on PreparedCheckpointManifest and LiveCheckpointRunRequest) use #[serde(default)] for backward compatibility with messages/manifests created before this change.
  • In the daemon path, pre_detected_cloud_env_tool.is_some() acts as a trust-the-wrapper signal: the wrapper only sends the tool when the flag was on in the wrapper's Config, so the daemon can bypass its own (potentially stale) flag check.
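
The trust-the-wrapper signal from the last note can be modeled in two small functions. Both names are hypothetical and the real code carries more context; the sketch only shows why sending the tool solely when the wrapper's flag is on keeps explicit opt-out intact.

```rust
/// Wrapper side: forward a tool name only for human checkpoints and only
/// when the flag is on in the wrapper's own Config.
fn tool_to_send(is_human: bool, wrapper_flag_on: bool, detected: Option<String>) -> Option<String> {
    if is_human && wrapper_flag_on {
        // Flag on: always send something, falling back to "unknown".
        Some(detected.unwrap_or_else(|| "unknown".to_string()))
    } else {
        // Flag off (including explicit opt-out) or not human: send nothing.
        None
    }
}

/// Daemon side: a received tool name implies the wrapper already decided
/// that cloud attribution applies, even if the daemon's frozen flag is off.
fn daemon_attribution_active(daemon_flag_on: bool, pre_detected: Option<&str>) -> bool {
    daemon_flag_on || pre_detected.is_some()
}
```

Because `tool_to_send` returns `None` whenever the wrapper's flag is off, the daemon's `is_some()` check cannot override a user who disabled the feature.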

Link to Devin session: https://app.devin.ai/sessions/9fad5875f02c40fba5afaaed02db55f8
Requested by: @svarlamov


Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 2 potential issues.

🔴 Pre-commit early exit skips checkpoint entirely when cloud_default_ai_attribution is enabled

The early exit at lines 174-180 returns Ok((0, 0, 0)) when is_pre_commit is true and there are no AI edits, no initial attributions, and inter_commit_move is off. This check does not account for cloud_default_ai_attribution. When a cloud/background agent (e.g., Cursor background, Claude web) makes changes but doesn't explicitly call git-ai checkpoint, and the user then runs git commit, the pre-commit hook triggers checkpoint::run() with kind=Human and is_pre_commit=true. Since there are no prior AI checkpoints, has_no_ai_edits is true, and the entire checkpoint is skipped — no working log entry is created and no attribution note will be generated by the post-commit hook. This defeats the purpose of cloud_default_ai_attribution, which is auto-enabled in these exact cloud environments via src/config.rs:785.

(Refers to lines 174-180)

Good catch — fixed in d52ab8c. Added && !Config::get().get_feature_flags().cloud_default_ai_attribution to the early exit condition so pre-commit human checkpoints are not skipped when the feature is enabled.
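
The resulting skip condition can be sketched as a single predicate. `PreCommitInputs` and `should_skip_checkpoint` are hypothetical names that flatten the real state into plain values; the `captured_cloud_env_tool` term reflects the later change described in the PR summary rather than this specific commit.

```rust
/// Inputs to the pre-commit early exit, flattened into plain values.
#[derive(Debug, Clone, Copy)]
struct PreCommitInputs<'a> {
    is_pre_commit: bool,
    has_no_ai_edits: bool,
    has_initial_attributions: bool,
    inter_commit_move: bool,
    cloud_default_ai_attribution: bool,
    captured_cloud_env_tool: Option<&'a str>,
}

/// True when the checkpoint can be skipped entirely.
fn should_skip_checkpoint(i: &PreCommitInputs<'_>) -> bool {
    i.is_pre_commit
        && i.has_no_ai_edits
        && !i.has_initial_attributions
        && !i.inter_commit_move
        // Guard added in this fix: never skip when cloud attribution is on.
        && !i.cloud_default_ai_attribution
        // Later addition: a tool sent by the wrapper also prevents the skip.
        && i.captured_cloud_env_tool.is_none()
}
```

Without the last two terms, a pre-commit human checkpoint in a cloud environment with no prior AI checkpoints would be dropped, which is the failure this review comment describes.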

// When cloud_default_ai_attribution is enabled and this is a human checkpoint,
// attribute changes to the most recent AI checkpoint's agent instead.
let cloud_attribution = if kind == CheckpointKind::Human
&& Config::get().feature_flags().cloud_default_ai_attribution

🟡 Uses feature_flags() instead of get_feature_flags(), bypassing test override mechanism

Line 1209 uses Config::get().feature_flags().cloud_default_ai_attribution while the established pattern in the same file (lines 176, 911) is Config::get().get_feature_flags(). In test mode (cfg(test) or feature = "test-support"), get_feature_flags() at src/config.rs:448 checks TEST_FEATURE_FLAGS_OVERRIDE for test overrides, while feature_flags() at src/config.rs:309 returns the raw config value and ignores test overrides. This inconsistency means any in-process unit test that calls Config::set_test_feature_flags() to enable cloud_default_ai_attribution would have no effect on this code path. The current integration tests work because they inject env vars into subprocesses, but this violates the codebase convention and could cause subtle test failures in the future.

Suggested change
- && Config::get().feature_flags().cloud_default_ai_attribution
+ && Config::get().get_feature_flags().cloud_default_ai_attribution

Fixed in d52ab8c — changed to get_feature_flags() to respect test overrides via TEST_FEATURE_FLAGS_OVERRIDE.
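
The difference between the two accessors can be modeled as below. This is a simplified sketch: the `FeatureFlags` struct is reduced to one field, `TEST_OVERRIDE` stands in for `TEST_FEATURE_FLAGS_OVERRIDE`, and the `cfg(test)` / `test-support` gating mentioned in the review is omitted.

```rust
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, Default, PartialEq)]
struct FeatureFlags {
    cloud_default_ai_attribution: bool,
}

// Hypothetical stand-in for the test override slot.
static TEST_OVERRIDE: Mutex<Option<FeatureFlags>> = Mutex::new(None);

struct Config {
    flags: FeatureFlags,
}

impl Config {
    /// Raw accessor: returns the configured value and ignores overrides.
    fn feature_flags(&self) -> FeatureFlags {
        self.flags
    }

    /// Override-aware accessor: prefers the test override when one is set.
    fn get_feature_flags(&self) -> FeatureFlags {
        let override_flags: Option<FeatureFlags> = *TEST_OVERRIDE.lock().unwrap();
        override_flags.unwrap_or(self.flags)
    }

    /// Installs a test override that get_feature_flags() will observe.
    fn set_test_feature_flags(flags: FeatureFlags) {
        *TEST_OVERRIDE.lock().unwrap() = Some(flags);
    }
}
```

After `Config::set_test_feature_flags(...)`, only `get_feature_flags()` sees the new value, so a code path that reads `feature_flags()` would silently ignore an in-process test override.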

…ribution

When enabled, human checkpoints are attributed to the most recent AI
checkpoint's agent_id instead of being marked as human. This is a more
performant alternative to bash tool call tracking for cloud coding agent
platforms.

The feature:
- Auto-enables when git-ai detects a cloud/background agent environment
- Falls back to cloud env tool detection (cursor-agent, claude-web, devin, etc.)
  when no previous AI checkpoint exists
- Falls back to 'unknown' when no cloud env can be detected
- Can be explicitly enabled/disabled via GIT_AI_CLOUD_DEFAULT_AI_ATTRIBUTION env var
- Respects explicit config file settings over auto-detection

Changes:
- src/feature_flags.rs: Add cloud_default_ai_attribution flag
- src/config.rs: Auto-enable flag in cloud environments
- src/commands/checkpoint.rs: Implement attribution logic with detect_cloud_env_tool()
  and resolve_cloud_attribution_for_human()
- tests/integration/repos/test_repo.rs: Wire up feature flag injection
- tests/integration/cloud_attribution.rs: 11 comprehensive integration tests
- tests/integration/main.rs: Register new test module
- tests/integration/performance.rs: Add new field to FeatureFlags initializer

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
@devin-ai-integration devin-ai-integration bot force-pushed the devin/1774410513-cloud-default-ai-attribution branch from ea79240 to f230851 on March 25, 2026 14:39

…st path

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

devin-ai-integration bot and others added 3 commits March 25, 2026 15:14
…th and fix inter_commit_move config key

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
…emon on set_feature_flags

The daemon process reads config via a OnceLock singleton frozen at startup.
Shared daemons cannot pick up feature flag changes made after construction.

- Switch all cloud attribution tests to TestRepo::new_dedicated_daemon()
- In set_feature_flags(), shut down and restart the dedicated daemon so
  it re-reads the updated config file with the new feature flags

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
…to daemon

In daemon mode, env vars passed to the wrapper process (e.g. CURSOR_AGENT,
CLAUDE_CODE_REMOTE) don't reach the daemon's detect_cloud_env_tool() call.

Fix: capture the cloud env tool in the wrapper during
prepare_captured_checkpoint() and propagate it through the
PreparedCheckpointManifest to the daemon's execution path, where it's
used as a pre-detected fallback in resolve_cloud_attribution_for_human().

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

devin-ai-integration bot and others added 4 commits March 25, 2026 16:06
…ct_feature_flag_env

Address Devin Review feedback: async_mode was missing from both the
config-writing logic in set_feature_flags() and the env var injection
in inject_feature_flag_env(), breaking the pattern established for all
other feature flags.

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
…st for daemon mode

In daemon mode, human checkpoints go through the live checkpoint path
(not the captured async path). The wrapper process has the cloud agent
env vars but the daemon doesn't. This adds captured_cloud_env_tool to
LiveCheckpointRunRequest so the wrapper can detect the cloud env tool
and pass it to the daemon for use in cloud attribution.

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
…ing scattered guards

Instead of keeping checkpoint kind as Human and adding cloud_default_ai_attribution
guards at 3 fast paths, resolve the effective kind to AiAgent early in
get_checkpoint_entries when cloud attribution is active. This:

- Removes per-file pre-commit fast path guard
- Removes per-file non-pre-commit fast path guard
- Removes the separate cloud_agent_id setting in execute_resolved_checkpoint
- Makes the working log correctly record ai_agent instead of human
- Ensures downstream post-commit and virtual attribution processing
  treats cloud-attributed checkpoints as AI throughout

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

devin-ai-integration bot and others added 2 commits March 25, 2026 18:32
…ent_tool in utils.rs

Merges the duplicated env-var/filesystem detection logic from
detect_cloud_env_tool() (checkpoint.rs) and is_in_background_agent()
(utils.rs) into a single detect_background_agent_tool() -> Option<String>
in utils.rs. This is now the single source of truth for all agent-name
to env-var mappings.

Also addresses the daemon config singleton issue: when the wrapper sends
pre_detected_cloud_env_tool to the daemon, treat it as implying cloud
attribution is enabled (the wrapper already decided it applies), so a
daemon whose Config was frozen without the flag still attributes correctly.

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

…tribution

When the cloud_default_ai_attribution flag is off and user is not in a
cloud env, detect_background_agent_tool() returns None which was wrapped
as Some("unknown") unconditionally, causing all human checkpoints to be
promoted to AiAgent via the pre_detected_cloud_env_tool.is_some() check.

Now both the daemon live path and captured checkpoint path use a three-way
branch: flag on → Some(tool-or-unknown), flag off + human → raw Option
from detection (None when not in cloud), not human → None.

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

devin-ai-integration bot and others added 2 commits March 25, 2026 18:55
…pt-out

The else-if branch sent the detected cloud tool to the daemon even when
the flag was explicitly off, which combined with the downstream
pre_detected_cloud_env_tool.is_some() check caused all human checkpoints
in cloud environments to be promoted to AiAgent regardless of opt-out.

The wrapper only needs to send captured_cloud_env_tool when the flag is
on (which auto-enables in cloud envs). The daemon's is_some() check then
correctly acts as a trust-the-wrapper signal for the frozen-singleton case.

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

devin-ai-integration bot and others added 2 commits March 25, 2026 20:04
Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
…xecution to prevent pre-commit early-exit

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

2 participants