Explain the details to quota compensation #981
88b648c to 8c379ba
8c379ba to f8c3ed2
Signed-off-by: Jian Wang <jian.wang@suse.com>
The doc PR is rebased onto the head of main to align with code PR harvester/harvester#10197 and parent issue harvester/harvester#9753.
Pull request overview
Adds documentation explaining Harvester’s new “quota compensation” behavior during VM live migration when VM memory overhead settings are changed while VMs are already running, clarifying why migrations can be blocked by namespace ResourceQuota and how Harvester unblocks them.
Changes:
- Documented the migration blockage scenario caused by increased VM memory footprint after overhead changes.
- Described Harvester’s automatic detection, temporary quota compensation, and cleanup workflow.
- Added operational notes on limitations (only applies to already-running VMs) and best practices.
_Available as of v1.8.0_

If you adjust the [VM Overhead Memory](#overhead-memory-of-virtual-machine) cluster settings while a Virtual Machine (VM) is already running, the VM's memory footprint may increase. This can cause issues during Live Migration, as the target node must instantiate a new VM instance that requires more memory than the original.

When the destination Namespace has a strict ResourceQuota, the migration may be blocked because the cumulative memory usage of both the source and target VMs temporarily exceeds the quota even when the quota has already been scaled up automatically.

### Automatic Resolution

Harvester automatically manages this bottleneck through the following workflow:

**Detection**: Harvester identifies when a migration is specifically blocked by ResourceQuota limitations.

**Delta Compensation**: The system automatically injects a temporary "quota compensation" (the delta between the current limit and current usage plus the migration requirements) to allow the migration to proceed.

**Cleanup**: Once the migration is complete (whether it succeeded or failed), Harvester removes the temporary compensation, returning the ResourceQuota to its original state.

:::info important

- This feature acts as an automatic workaround, ensuring that global policy changes (like memory overhead adjustments) do not accidentally "lock" your VMs to their current nodes.

- The compensation applies only to already running VMs. If a running VM is stopped and then restarted, its resource allocation is strictly governed by the original quota. As a result, a VM that was previously running may fail to "cold boot" if the overhead is increased while the ResourceQuota remains unchanged.

- To avoid reliance on this automatic compensation, the best practice is to adjust overhead settings and ResourceQuotas simultaneously. Ultimately, there is no difference between a live migration and a "cold reboot" regarding final quota control; both must eventually fit within the defined namespace limits.

:::
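The Detection, Delta Compensation, and Cleanup steps described above can be illustrated with a small model. This is only a sketch of the arithmetic, not Harvester's implementation: `NamespaceQuota`, `compensation_delta`, and `temporary_compensation` are hypothetical names, all values are in MiB, and the delta is assumed to be current usage plus the migration target's request minus the current limit, floored at zero.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class NamespaceQuota:
    """Hypothetical stand-in for a namespace memory ResourceQuota (values in MiB)."""
    limit_mib: int
    used_mib: int


def compensation_delta(quota: NamespaceQuota, target_request_mib: int) -> int:
    """Extra headroom needed so the migration target fits; 0 if it already fits."""
    shortfall = quota.used_mib + target_request_mib - quota.limit_mib
    return max(0, shortfall)


@contextmanager
def temporary_compensation(quota: NamespaceQuota, target_request_mib: int):
    """Raise the limit by the delta for the migration, then always restore it."""
    original_limit = quota.limit_mib
    quota.limit_mib += compensation_delta(quota, target_request_mib)
    try:
        yield quota
    finally:
        # Cleanup runs whether the migration succeeded or failed.
        quota.limit_mib = original_limit


quota = NamespaceQuota(limit_mib=8192, used_mib=6144)
with temporary_compensation(quota, target_request_mib=2304) as q:
    limit_during_migration = q.limit_mib  # 8192 + (6144 + 2304 - 8192) = 8448
limit_after_migration = quota.limit_mib   # restored to 8192
```

The context manager is used here only to make the "succeeded or failed" cleanup guarantee explicit; how Harvester tracks and removes the compensation internally is not described in this PR.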
Original: If you adjust the [VM Overhead Memory](#overhead-memory-of-virtual-machine) cluster settings while a Virtual Machine (VM) is already running, the VM's memory footprint may increase. This can cause issues during Live Migration, as the target node must instantiate a new VM instance that requires more memory than the original.
Suggested: If you adjust the [VM Overhead Memory](#overhead-memory-of-virtual-machine) cluster settings while a Virtual Machine (VM) is already running, the VM's memory footprint may increase. This can cause issues during live migration, as the target node must instantiate a new VM instance that requires more memory than the original.
Original: When the destination Namespace has a strict ResourceQuota, the migration may be blocked because the cumulative memory usage of both the source and target VMs temporarily exceeds the quota even when the quota has already been scaled up automatically.
Suggested: When the destination namespace has a strict ResourceQuota, the migration may be blocked because the cumulative memory usage of both the source and target VMs temporarily exceeds the quota even when the quota has already been scaled up automatically.
Original: ## ResourceQuota Compensation During Migration
Suggested: ## ResourceQuota Automatic Adjustment During Migration
Here and elsewhere, consider using "adjustment" instead of "compensation"; the latter is more commonly used in the context of employment, insurance, lawsuits, etc.
Commented line: ### Automatic Resolution
You probably don't need this title if you just add the word "Automatic" to the main heading as suggested above.
Original: **Delta Compensation**: The system automatically injects a temporary "quota compensation" (the delta between the current limit and current usage plus the migration requirements) to allow the migration to proceed.
Suggested: **Delta Compensation**: The system automatically calculates a temporary "quota compensation" (the delta between the current limit and current usage plus the migration requirements) to allow the migration to proceed.
Original: **Cleanup**: Once the migration is complete (whether it succeeded or failed), Harvester removes the temporary compensation, returning the ResourceQuota to its original state.
Suggested: **Cleanup**: Once the migration is completed (whether it succeeded or failed), Harvester removes the temporary compensation, returning the ResourceQuota to its original state.
Original: - This feature acts as an automatic workaround, ensuring that global policy changes (like memory overhead adjustments) do not accidentally "lock" your VMs to their current nodes.
Suggested: - This automated feature prevents global policy changes like memory overhead adjustments from accidentally blocking VM live migration.
Original: - The compensation applies only to already running VMs. If a running VM is stopped and then restarted, its resource allocation is strictly governed by the original quota. As a result, a VM that was previously running may fail to "cold boot" if the overhead is increased while the ResourceQuota remains unchanged.
Suggested: - The compensation applies only to already running VMs. If a running VM is stopped and then restarted, no automatic quota adjustment will be triggered. As a result, the VM may fail to start if its overhead requirement exceeds the associated ResourceQuota.
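The cold-start limitation in this comment can be made concrete with a rough check. The sketch below is illustrative only: `can_cold_start` and `required_limit_mib` are hypothetical helpers, the MiB figures are invented, and the per-VM request is taken as an input rather than derived from Harvester's overhead formula. It assumes a running VM keeps its old memory request until it is restarted or migrated.

```python
def can_cold_start(limit_mib: int, used_mib: int, vm_request_mib: int) -> bool:
    """A stopped VM starts only if its full request fits under the unmodified limit."""
    return used_mib + vm_request_mib <= limit_mib


def required_limit_mib(vm_requests_mib: list[int]) -> int:
    """Smallest limit that lets every listed VM run at once with its new request."""
    return sum(vm_requests_mib)


limit_mib = 8192        # namespace memory limit, unchanged by the operator
old_request_mib = 4096  # per-VM request before the overhead change (assumed)
new_request_mib = 4224  # per-VM request after the overhead change (assumed)

# Two VMs fit exactly before the change. One is then stopped while the other
# keeps running with its old request.
fits_before = can_cold_start(limit_mib, old_request_mib, old_request_mib)
fits_after = can_cold_start(limit_mib, old_request_mib, new_request_mib)

# Limit the operator would need so both VMs can start with the new overhead.
new_limit_mib = required_limit_mib([new_request_mib, new_request_mib])
```

With these numbers the restart no longer fits under the 8192 MiB limit, and raising the limit to 8448 MiB alongside the overhead change would let both VMs start, which matches the best practice of adjusting overhead settings and ResourceQuotas together.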
Problem:
Solution:
Explain the details to quota compensation
Related Issue(s):
harvester/harvester#9753
harvester/harvester#9942
Test plan:
Additional documentation or context