-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Add Support to tag a minion as Drained #17375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #17375 +/- ##
============================================
+ Coverage 63.24% 63.30% +0.06%
Complexity 1474 1474
============================================
Files 3147 3162 +15
Lines 187490 188631 +1141
Branches 28703 28868 +165
============================================
+ Hits 118575 119417 +842
- Misses 59728 59958 +230
- Partials 9187 9256 +69
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements a drain mode for Pinot Minion instances to enable graceful scale-down operations. The implementation allows operators to prevent new task assignments to specific minions while existing tasks complete, using Helix's tag-based assignment mechanism.
Key Changes:
- Added
DRAINstate support to the instance state API endpoint - Implemented tag switching mechanism (minion_untagged → minion_drained) to prevent new task assignments
- Added comprehensive test coverage for drain operations including edge cases and concurrency scenarios
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| CommonConstants.java | Added DRAINED_MINION_INSTANCE constant for the new drain tag |
| PinotHelixResourceManagerMinionDrainTest.java | Comprehensive unit tests covering basic drain operations, edge cases (multiple tags, empty tags, already drained), and concurrent safety |
| PinotInstanceRestletResourceTest.java | Integration tests verifying API behavior for drain operations including error cases |
| PinotHelixResourceManager.java | Core drain logic with drainMinionInstance() method and helper for tag replacement |
| StateType.java | Extended enum to include DRAIN state |
| PinotInstanceRestletResource.java | API endpoint updates to handle DRAIN state with validation for minion instances only |
...ntroller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java
Outdated
Show resolved
Hide resolved
...ntroller/src/test/java/org/apache/pinot/controller/api/PinotInstanceRestletResourceTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| * @param instanceName Name of the minion instance to drain | ||
| * @return Response indicating success or failure | ||
| */ | ||
| public synchronized PinotResourceManagerResponse drainMinionInstance(String instanceName) { |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The synchronized method-level lock could become a bottleneck under concurrent drain requests across different minion instances. Consider using instance-level locking or a more granular synchronization mechanism (e.g., ConcurrentHashMap with per-instance locks) to allow concurrent draining of different minions.
| @ApiOperation(value = "Enable/disable an instance", notes = "Enable/disable an instance") | ||
| @ApiOperation(value = "Enable/disable/drain an instance", notes = "Enable/disable/drain an instance. " | ||
| + "DRAIN state is only applicable to minion instances and retags them from minion_untagged to minion_drained, " | ||
| + "preventing new task assignments while allowing existing tasks to complete.") |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation should mention that the drain operation is idempotent and preserves other tags on the instance. This is important operational context for users of the API.
| + "preventing new task assignments while allowing existing tasks to complete.") | |
| + "preventing new task assignments while allowing existing tasks to complete. " | |
| + "The drain operation is idempotent and preserves other tags on the instance.") |
| @Consumes(MediaType.TEXT_PLAIN) | ||
| @ApiOperation(value = "Enable/disable an instance", notes = "Enable/disable an instance") | ||
| @ApiOperation(value = "Enable/disable/drain an instance", notes = "Enable/disable/drain an instance. " | ||
| + "DRAIN state is only applicable to minion instances and retags them from minion_untagged to minion_drained, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it mean if a minion is tagged to something else, then we cannot drain this node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this scenario, yes, because ideally, if the minion is tagged with something else, then there are no tasks that are currently being assigned to it, unless the table config has minionInstanceTag present
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xiangfu0 noticed that we can custom-tag minions so that specific tasks get executed on those. So replacing all the tags with minion_drained
...ntroller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java
Outdated
Show resolved
Hide resolved
| "DRAIN state only applies to minion instances. Instance '" + instanceName + "' is not a minion.", | ||
| Response.Status.BAD_REQUEST); | ||
| } | ||
| PinotResourceManagerResponse response = _pinotHelixResourceManager.drainMinionInstance(instanceName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if a minion is in drained mode, when the same minion come up again, the tag is still drained, then it won't be able to serve tasks right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is correct. In case a minion is in drained mode, you would need to retag them with minion_untagged so that Helix can start assigning tasks. This is only when the minion is drained, but not removed from the controller
Implements a drain mode for Pinot Minion instances, allowing operators to gracefully scale down minions by preventing new task assignments while allowing existing tasks to complete.
PUT /instances/{instanceName}/stateto supportDRAINstate for minion instancesminion_untagged→minion_drained, preventing Helix from assigning new tasks.setInstanceGroupTag())