Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4f38d54b85
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
🔐 Codex Security Review
Review SummaryOverall Risk: HIGH Findings[HIGH] Site reassignment can delete rack memberships without rack permission
[HIGH] New bulk mutations bypass site-scoped authorization context
[MEDIUM] Force-clear site moves can deadlock with rack reassignment
[MEDIUM] Existing v1 RPCs are removed or renamed without compatibility shims
NotesNo new pool/wallet rewriting, shell command execution, plugin execution, migrations, or infrastructure changes were present in the reviewed diff. Also consider adding a server-side max item cap for Generated by Codex Security Review | |
There was a problem hiding this comment.
Pull request overview
This PR standardizes fleet hierarchy “parent assignment” RPCs into consistent, bulk-friendly Assign<Children>To<Parent> shapes and closes the miner→rack orphan window by introducing an atomic server-side AssignDevicesToRack operation. It also adds a partial-update rack→site assignment flow and updates the ProtoFleet UI to use the new RPCs.
Changes:
- Added atomic
AssignDevicesToRack(server + client) to replace the prior client-side remove→add orchestration and eliminate partial-state orphaning. - Renamed/normalized site-level device + building assignment RPCs (
AssignDevicesToSite,AssignBuildingsToSite) and addedAssignRacksToSite. - Converted rack→building assignment to a bulk RPC (
AssignRacksToBuilding) and rewired UI workflows to use the new request shapes.
Reviewed changes
Copilot reviewed 47 out of 59 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| server/sqlc/queries/site.sql | Renames ReassignDevicesToSite query to AssignDevicesToSite and updates comments for renamed RPCs. |
| server/sqlc/queries/device_set.sql | Adds RemoveDevicesFromAnyRack query to enable atomic rack reassignment. |
| server/internal/handlers/sites/translate.go | Updates request→domain param translation for renamed/bulk site RPCs. |
| server/internal/handlers/sites/handler.go | Renames device/site + building/site handlers and adds AssignRacksToSite handler. |
| server/internal/handlers/sites/handler_test.go | Updates handler tests for renamed/bulk RPCs and new rack→site flow. |
| server/internal/handlers/sites/handler_stats_test.go | Updates sites.NewService wiring after constructor signature change. |
| server/internal/handlers/middleware/rpc_permissions.go | Updates procedure permission map for renamed/new RPC procedures. |
| server/internal/handlers/deviceset/handler.go | Adds AssignDevicesToRack RPC handler. |
| server/internal/handlers/buildings/translate.go | Updates translation for new bulk AssignRacksToBuilding request shape. |
| server/internal/handlers/buildings/handler.go | Renames handler to AssignRacksToBuilding. |
| server/internal/handlers/buildings/handler_test.go | Updates building handler tests for bulk assignment shape. |
| server/internal/domain/stores/sqlstores/site.go | Renames store method to AssignDevicesToSite. |
| server/internal/domain/stores/sqlstores/site_invariant_integration_test.go | Updates integration tests for renamed site assignment store call. |
| server/internal/domain/stores/sqlstores/collection.go | Adds SQL-backed RemoveDevicesFromAnyRack store method. |
| server/internal/domain/stores/sqlstores/collection_site_placement_integration_test.go | Updates integration tests for renamed site assignment store call(s). |
| server/internal/domain/stores/interfaces/site.go | Renames interface method to AssignDevicesToSite and updates comments. |
| server/internal/domain/stores/interfaces/mocks/mock_site_store.go | Regenerates mock to include AssignDevicesToSite and remove old method. |
| server/internal/domain/stores/interfaces/mocks/mock_collection_store.go | Regenerates mock to include RemoveDevicesFromAnyRack. |
| server/internal/domain/stores/interfaces/collection.go | Extends CollectionStore interface with RemoveDevicesFromAnyRack. |
| server/internal/domain/stores/interfaces/building.go | Updates docs referencing renamed SiteService RPC. |
| server/internal/domain/sites/service.go | Implements AssignBuildingsToSite (bulk) and adds AssignRacksToSite partial-update flow. |
| server/internal/domain/sites/service_test.go | Updates unit tests for renamed/bulk site assignment flows + new rack→site behavior. |
| server/internal/domain/sites/service_stats_test.go | Updates stats tests for sites.NewService signature change. |
| server/internal/domain/sites/models/models.go | Renames/introduces models for bulk assignment flows + rack→site result/params. |
| server/internal/domain/fleetmanagement/collection_test.go | Updates tests to use renamed AssignDevicesToSite store method. |
| server/internal/domain/collection/service.go | Adds domain-level AssignDevicesToRack atomic reassignment behavior + activity logging. |
| server/internal/domain/collection/service_test.go | Adds unit tests for atomic rack reassignment scenarios. |
| server/internal/domain/buildings/service.go | Converts rack→building assignment to bulk AssignRacksToBuilding with validation + deterministic lock order. |
| server/internal/domain/buildings/service_test.go | Updates tests for bulk rack→building assignment behavior. |
| server/internal/domain/buildings/service_stats_test.go | Updates stats test comments referencing renamed SiteService RPC. |
| server/internal/domain/buildings/models/models.go | Introduces bulk rack placement params and replaces singular assignment model. |
| server/generated/sqlc/site.sql.go | Generated sqlc output for renamed AssignDevicesToSite query. |
| server/generated/sqlc/device_set.sql.go | Generated sqlc output for RemoveDevicesFromAnyRack. |
| server/generated/sqlc/db.go | Generated sqlc prepare/close wiring for new/renamed statements. |
| server/generated/grpc/sites/v1/sitesv1connect/sites.connect.go | Generated Connect code for renamed/new site RPC procedures. |
| server/generated/grpc/device_set/v1/device_setv1connect/device_set.connect.go | Generated Connect code for new AssignDevicesToRack RPC. |
| server/generated/grpc/buildings/v1/buildingsv1connect/buildings.connect.go | Generated Connect code for bulk AssignRacksToBuilding RPC. |
| server/cmd/fleetd/main.go | Updates production wiring for sites.NewService to pass collectionStore. |
| proto/sites/v1/sites.proto | Renames ReassignDevicesToSite → AssignDevicesToSite, adds AssignBuildingsToSite + AssignRacksToSite. |
| proto/device_set/v1/device_set.proto | Adds AssignDevicesToRack RPC and request/response messages. |
| proto/buildings/v1/buildings.proto | Replaces singular rack→building RPC with bulk AssignRacksToBuilding + RackPlacement. |
| client/src/protoFleet/features/sites/hooks/useSiteModals.ts | Updates inline TODO reference to renamed site assignment wrapper. |
| client/src/protoFleet/features/sites/hooks/useSiteModals.test.ts | Updates mocks for renamed assignDevicesToSite API. |
| client/src/protoFleet/features/rackManagement/pages/RacksPage.tsx | Rewires rack→building reparent UI to use bulk assignRacksToBuilding. |
| client/src/protoFleet/features/fleetManagement/pages/FleetBuildingsPage.tsx | Rewires building→site reparent UI to use bulk assignBuildingsToSite. |
| client/src/protoFleet/features/fleetManagement/components/MinerActionsMenu/MinerReparentPicker.tsx | Collapses miner→rack flow to single atomic assignDevicesToRack dispatch. |
| client/src/protoFleet/features/fleetManagement/components/FleetLayout/FleetLayout.test.tsx | Updates sites API mock for renamed wrapper. |
| client/src/protoFleet/features/buildings/components/ManageBuildingModal/types.ts | Updates comment to reference AssignRacksToBuilding. |
| client/src/protoFleet/features/buildings/components/ManageBuildingModal/ManageBuildingModal.tsx | Batches rack placement/unplacement via bulk assignRacksToBuilding. |
| client/src/protoFleet/features/buildings/components/BuildingModals/BuildingModals.test.tsx | Updates buildings API mock for renamed bulk wrapper. |
| client/src/protoFleet/api/useDeviceSets.ts | Adds assignDevicesToRack wrapper for new atomic RPC. |
| client/src/protoFleet/api/sites.ts | Renames site wrapper + adds bulk building/site + rack/site assignment wrappers. |
| client/src/protoFleet/api/generated/buildings/v1/buildings_pb.ts | Generated TS protobuf types for bulk AssignRacksToBuilding. |
| client/src/protoFleet/api/buildings.ts | Replaces singular rack→building wrapper with bulk assignRacksToBuilding + RackPlacementInput. |
Standardizes the parent-assignment RPC verb across the fleet hierarchy: all hierarchical assignment RPCs now use Assign* (no Reassign/Set/Move variants). No semantic change — the existing optional target_site_id already encodes both initial assignment and re-parenting via the same null-equals-unassigned convention. Sets up the naming convention used by the new RPCs landing in subsequent commits (AssignDevicesToRack, AssignDevicesToBuilding, AssignRacksToSite). Refs #420.
Promotes the building→site assignment RPC to a bulk shape so operators can move multiple buildings between sites in one atomic transaction. A single-building caller passes a one-element `building_ids` array; the response aggregates cascade counts across every building. The store-level operation stays singular (one UPDATE per building); the service layer sorts by id for deterministic lock ordering then loops inside a single tx — if any building fails, the whole batch rolls back. Updates the in-flight FleetBuildingsPage row-action call site (#412) to pass the new array shape. Refs #420.
…ilding
Promotes the rack→building assignment RPC to a bulk shape so the
ManageBuildingModal save flow (which previously fired N concurrent
singular calls) can land an entire grid layout in two atomic batches
— vacate first, then place — instead of N independent transactions.
Single-rack callers (RacksPage row action) pass a one-element array.
Request shape: repeated RackPlacement { rack_id, optional
aisle_index, optional position_in_aisle } + optional
target_building_id. Each entry carries its own placement so a single
call can mix placed and cleared positions.
Service-layer behavior preserved per rack — building lock once,
canonical lock order (building → rack id ascending), per-rack zone
clear / cascade / position write. If any rack fails, the whole tx
rolls back.
Activity-log payload now carries rack_ids[] + aggregate cascade
counts; consumers reading the singular rack_id metadata key will need
to migrate (no existing consumers in tree).
Refs #420.
Closes the orphan window in the miner reparent flow. Previously the client orchestrated removeDevicesFromDeviceSet(source) + addDevicesToDeviceSet(target) as two independent RPCs; a server error or network blip between them left miners removed from their source rack but never added to the target. The cross-rack rename race (source rack renamed mid-confirm) had the same failure shape via the client-side label→id resolution. The new RPC bundles both halves into one server-side transaction: remove from any current rack (single SQL DELETE on device_set_membership filtered by device_set_type='rack'), insert into target rack, cascade device.site_id when the target rack's site differs. target_rack_id unset clears rack membership without re-assigning (site/building stay intact). MinerReparentPicker's rack path collapses from ~50 lines of remove-then-add orchestration + listRacks label resolution to a single dispatch. Site reparent path is unchanged. Server: new sqlc query RemoveDevicesFromAnyRack, new CollectionStore.RemoveDevicesFromAnyRack, new collection.Service.AssignDevicesToRack with 4 unit tests covering happy path, unassign, target-must-be-rack, and empty-input rejection. Refs #420.
Adds the missing rack→site reparent RPC called out in #420. SaveRack was a full-replace requiring the rack's label, layout, members, and slot assignments to round-trip on every site change; the new RPC is a partial update — only site_id changes. Behavior: - target_site_id unset → racks move to "Unassigned" - building_id is auto-cleared on any site transition (a building belongs to one site, so the move invalidates building membership); the response carries the count of racks whose building was cleared so the UI can prompt the operator to pick a building in the new site - same transaction cascades device.site_id for every rack member - lock order: target site → each rack id ascending (matches AssignBuildingsToSite + AssignDevicesToSite) - bulk-friendly: pass repeated rack_ids; the batch rolls back on any per-rack failure Plumbs collectionStore through sites.Service so the new method can reuse LockRackPlacementForWrite + UpdateRackPlacement + CascadeRackDeviceSites instead of duplicating the rack-placement write path. Server: 4 handler tests (cascade + clear building, target-unset unassign, same-site no-op, bulk aggregation with sorted lock order), 2 service tests (empty rejection, nil-store guard). Client: useSites().assignRacksToSite wrapper. Refs #420.
CI's generated-code check failed because the local regen on the naming-normalization commits used protoc-gen-es 2.11.0 (a sibling worktree's cached node_modules) instead of the pinned 2.12.0. The 2.12.0 output annotates optional Timestamp fields as `Timestamp | undefined` rather than `Timestamp?`. Also fixes a prettier ternary formatting issue prettier --write applied to MinerReparentPicker.tsx during CI's gen pass that didn't run locally. No semantic change.
… family Cross-cutting review fixes for the atomic reparent PR. Orphan-window closure: - AssignDevicesToSite: optional force_clear_conflicting_rack_membership flag clears rack memberships in the same tx; MinerReparentPicker collapses to a single call - AssignDevicesToRack: source-rack FOR UPDATE lock pre-pass serializes concurrent overlapping reparent calls; LockRackPlacementForWrite reordered before GetCollection so activity-log labels read under lock - RemoveDevicesFromAnyRack: target-rack exclusion predicate prevents cascade-delete of rack_slot rows on no-op reassigns Reparent surface cleanup: - AssignDevicesToRack switched to DeviceSelector - AddDevicesToGroup / RemoveDevicesFromGroup new RPCs replace the generic collection/v1 device-membership pair; type-checked at the service boundary, atomic clear-then-add for racks - collection/v1 AddDevicesToCollection / RemoveDevicesFromCollection RPCs and Service methods removed; fleetimport migrated to the type-specific methods - DeviceSetService added to reflectEnabledServices Bulk SQL batch family (N+1 elimination): - AssignRacksToBuilding: 5N -> 4 writes (constant) - AssignBuildingsToSite: 4N -> N+3 - AssignRacksToSite: 3N -> N+2 - Sequential sorted-order locks preserved (deadlock invariant) ManageBuildingModal collapse: - handleSave two-phase vacate+place replaced by a single RPC - Server-side two-pass ordering (clear-all then place-all in one tx) handles position swaps without the partial-failure window Frontend reliability: - MinerReparentPicker: AbortSignal threading, Promise wrapping, double- invoke guard, fire-and-forget RPCs awaited - useDeviceSets.assignDevicesToRack: AbortSignal support - useSiteModals.detailsSaveEdit: functional setState removes stale- closure guard race - ManageBuildingModal handleSave: savingRef double-invoke guard Validation + prior-comment hygiene: - AssignBuildingsToSite / AssignRacksToSite reject TargetSiteID == 0 - AssignRacksToBuilding rejects duplicate rack_ids - ScopeCount uses len(DeviceIdentifiers) instead of double-counting - Single-rack building-only unassign captures the rack's current SiteID Test coverage: - AssignRacksToSite service-layer tests (cascade, no-op, rollback) - AssignDevicesToRack handler tests (perm, happy, unassign, source-rack lock ordering) - AssignRacksToBuilding rollback + empty-rejected + duplicate-rejected - AssignBuildingsToSite empty-rejected - DeviceSelector variant rejection - AddDevicesToGroup / RemoveDevicesFromGroup rack-target rejection - Bulk-write large-batch stress tests across three RPCs Handler hygiene: - deviceset/translate.go (new) extracts toAssignDevicesToRackParams - sites/translate.go adds toAssignRacksToSiteParams Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Picks up main's curtailment automation query registrations alongside the branch's new bulk SQL queries. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 767a22b171
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2ced5b8211
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
LockSourceRacksForDevices: Postgres rejects DISTINCT + FOR UPDATE. Rewrite with subquery-derived distinct ids and outer FOR UPDATE SELECT keyed on device_set.id. Every AssignDevicesToRack call was failing at runtime in prod; mock-only tests didn't catch it. AssignDevicesToSite force-clear: partition conflicts into clearable (DEVICE_IN_RACK_AT_OTHER_SITE) vs residual. Abort BEFORE any RemoveDevicesFromAnyRack deletion when residual non-clearable conflicts (e.g. DEVICE_NOT_FOUND) remain. When clearable: pass only the conflicting identifiers so devices already at the target site keep their rack/rack_slot rows. ManageBuildingModal.handleSave: serialize unassign before in- building dispatch. F5's two-pass clear-then-place ordering only applies within one AssignRacksToBuilding call. With concurrent calls, the in-building call could hit the partial unique index before the unassign committed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 211beb85b4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| } | ||
| out.SiteReassignedDeviceCount = count | ||
| cascadeRan = true | ||
| out.SiteReassignedDeviceCount += count |
There was a problem hiding this comment.
Reset rack-building counts on tx retry
Because out, cascadeRackIDs, and positionedRackIDs are declared outside the RunInTx closure, a retryable transaction failure after this cascade can leave the rolled-back attempt's state in memory and then add the committed attempt's count again. SQLTransactor.RunInTxWithResult delegates to db.WithTransaction, which reruns closures on retryable PostgreSQL/deadlock errors, so a rack move that cascades device sites can return/log an inflated site_reassigned_device_count even though only one transaction committed; keep these tallies in per-attempt locals and copy them out only after the successful attempt.
Useful? React with 👍 / 👎.
| changedRackIDs = append(changedRackIDs, rackID) | ||
| cascadedRackIDs = append(cascadedRackIDs, rackID) | ||
| if current.BuildingID != nil { | ||
| clearedCount++ |
There was a problem hiding this comment.
Reset rack-site counts on tx retry
This counter is outside the RunInTx closure, alongside deviceCount and cascadedRackIDs, so any retryable failure after a rack has been classified as site-changing can make the next transaction attempt reuse the rolled-back attempt's counts/metadata. Since the production transactor reruns the closure on retryable PostgreSQL/deadlock errors, AssignRacksToSite can report/log too many cleared buildings or reassigned devices after a retry; accumulate into tx-local variables and publish them only from the successful attempt.
Useful? React with 👍 / 👎.
Codex P2 review: excluding the target rack from the sorted source-
lock query and then locking it separately via LockRackPlacementForWrite
produces inconsistent global lock order. Two concurrent calls moving
devices in opposite directions between the same rack pair can deadlock
(Tx A holds rack 2 waits on rack 1; Tx B holds rack 1 waits on rack 2).
Renamed LockSourceRacksForDevices → LockRacksForReparent and changed
semantics: the query now UNIONs the target rack id into the lock set
when present, so both txs lock {1, 2} in {1, 2} order regardless of
which is source vs target. The membership DELETE in RemoveDevicesFromAnyRack
still excludes the target via its own predicate — this query is purely
about lock acquisition order.
LockRackPlacementForWrite stays on the target path because it also
locks the device_set_rack child row; the parent-row re-acquire is a
no-op since the same tx already holds it.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Closing in favor of the 4 split PRs for easier review:
Dependency graph: PRs 2-4 will rebase off main once PR 1 lands. PRs 3 and 4 are parallel-mergeable after PR 2 lands. |
Summary
Closes the orphan window and rename race in the miner reparent flow (issue #420) and normalizes the parent-assignment surface across the fleet hierarchy.
Architectural picture. The fleet has three levels of parent-assignment (site, building, rack). Before this PR, each level had its own RPC verb, shape, and arity (some singular, one prefixed
Reassign, one with anadd+removeclient-side orchestration). The matrix is now consistent: every parent-assignment is oneAssign<Children>To<Parent>RPC, bulk-friendly (repeatedchild ids), and null-aware for unassignment (optional target_*_idunset → "Unassigned"). All five RPCs land their write inside a single transaction.Orphan window + rename race. The miner→rack reparent flow previously orchestrated
RemoveDevicesFromDeviceSet(source)followed byAddDevicesToDeviceSet(target)as two independent calls. A server error or network blip between them left miners removed from their source rack but never added to the target; concurrent rename of the source rack between the client's label→id resolution and the remove call caused the same partial-state outcome. The newAssignDevicesToRackRPC bundles both halves into one server-side transaction with rack identity resolved under lock, closing both holes; the client orchestration collapses from ~70 lines of remove-then-add looping to a single dispatch.Partial-update rack→site reparent.
SaveRackpreviously required a full round-trip of label / layout / members / slot assignments on every site change. The newAssignRacksToSiteis a partial update that only touches site_id; building_id is auto-cleared on site transition (a building belongs to a single site) and the response reports cleared-building count so callers can prompt for re-assignment.Non-goals / deferred
AssignDevicesToBuilding+device.building_idcolumn). Adding a direct FK for building requires five cascade invariants on existing RPCs (rack→building, rack→site, save rack, devices→rack, devices→site) to keep the new column in sync with the rack-derived view. The migration and cascade design exist on a working branch but are deferred to a follow-up PR to keep this one reviewable. Until then, miner→building goes through the existing rack-picker flow.🤖 Generated with Claude Code