Skip to content

chore(api): simplify sandbox removing logic#2016

Draft
jakubno wants to merge 3 commits intomainfrom
chore/simplify-kill-logic
Draft

chore(api): simplify sandbox removing logic#2016
jakubno wants to merge 3 commits intomainfrom
chore/simplify-kill-logic

Conversation

@jakubno
Copy link
Member

@jakubno jakubno commented Feb 27, 2026

Note

Medium Risk
Touches core sandbox lifecycle paths (kill/pause/eviction/snapshot) and changes state-transition APIs, so regressions could affect cleanup correctness and cross-team isolation if team IDs or transition results are mishandled. Behavior is mostly refactoring, but it spans both in-memory and Redis-backed transition logic.

Overview
This PR simplifies sandbox removal by routing kill/pause/eviction flows through a unified RemoveSandbox(ctx, teamID, sandboxID, action) path and pushing team-ownership checks into the sandbox store’s StartRemoving call, removing handler-side GetSandbox/mismatch logic. It also changes StartRemoving to return the current sandbox snapshot alongside the transition callback and updates Redis transitions to avoid mutating the originally-fetched sandbox on failure while keeping transient-transition signaling consistent; reviewers should sanity-check that all callers pass the correct teamID, that NotFound vs transition errors still map to the intended HTTP responses, and that metrics/analytics decrements still occur exactly once per successful removal.

Written by Cursor Bugbot for commit 4c3aaff. This will update automatically on new commits. Configure here.

// Should timeout
require.Error(t, err2)
require.ErrorIs(t, err2, context.DeadlineExceeded)
assert.NotNil(t, callback)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert.NotNil(t, callback) is incorrect and will fail. When waitForTransition returns context.DeadlineExceeded, handleExistingTransition returns nil for the callback:

err := s.waitForTransition(ctx, teamID, sbx.SandboxID, transactionID)
if err != nil {
    return sbx, false, nil, fmt.Errorf("failed waiting for transition: %w", err)
}

The callback will always be nil on the error path, so this assertion should be assert.Nil(t, callback).

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.


alreadyDone, callback, err := startRemoving(ctx, sbx, stateAction)

return data, alreadyDone, callback, err
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memory StartRemoving returns stale pre-mutation sandbox data

Medium Severity

The memory StartRemoving captures data via sbx.Data() before calling startRemoving, which mutates State and possibly EndTime. It then always returns this stale data, even on success. The Redis implementation correctly returns an updated copy with the post-mutation state on success. This means the two implementations of the same Storage interface return different sandbox data for the same successful operation — memory returns old state (e.g. Running), Redis returns new state (e.g. Killing).

Additional Locations (1)

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants