task(vmop): add migration progress status and reasons#2182
Draft
c478cf2 to b884608
Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Handle nil migrations as generic failures and keep terminal VMOP metrics detection explicit. Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Read migration byte counters from VirtualMachineInstanceMigration status and feed them into VMOP migration progress calculation. Keep degraded mode as a fallback when counters are absent and use a local kubevirt.io/api replace for the patched 3p-kubevirt module during development. Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
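A minimal sketch of how byte counters could be mapped into the sync range, with a signal to fall back to degraded mode when counters are absent. The `transferCounters` type, the `syncProgress` function, and the range values 10 and 90 are assumptions for illustration only; the PR names `migrationprogress.SyncRangeMin`/`SyncRangeMax` but this page does not show their values or the real field names.

```go
package main

// Hypothetical bounds of the sync phase on the 0..100 progress scale.
// The real SyncRangeMin/SyncRangeMax values are not shown in this PR page.
const (
	syncRangeMin = 10
	syncRangeMax = 90
)

// transferCounters is a hypothetical stand-in for the byte counters
// read from the migration status.
type transferCounters struct {
	ProcessedBytes uint64
	TotalBytes     uint64
}

// syncProgress maps byte counters into [syncRangeMin, syncRangeMax].
// ok is false when counters are absent, which tells the caller to use
// the degraded-mode estimate instead.
func syncProgress(c *transferCounters) (progress int, ok bool) {
	if c == nil || c.TotalBytes == 0 {
		return 0, false
	}
	processed := c.ProcessedBytes
	if processed > c.TotalBytes {
		// Clamp defensively so the result never leaves the sync range.
		processed = c.TotalBytes
	}
	span := uint64(syncRangeMax - syncRangeMin)
	return syncRangeMin + int(processed*span/c.TotalBytes), true
}
```

With these assumed bounds, half of the bytes transferred maps to the middle of the range, and a nil counter set returns `ok == false` so the fallback path stays in charge.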
Keep degraded-mode stall bump monotonic, but avoid issuing another artificial increment when the computed base progress only trails the previous value by the prior bump. This removes the near +1-per-reconcile behavior while preserving the fallback path for migrations without byte counters. Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
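One way to read the stall bump rule above, as a rough sketch rather than the actual implementation: remember the last artificial bump, and skip a new bump when the freshly computed base progress lags the reported value by no more than that bump. The `stallState` type, the `nextProgress` function, and the `ceiling` parameter are hypothetical names introduced here.

```go
package main

// stallState is a hypothetical record of what was last shown to the user.
type stallState struct {
	Last     int // last reported progress value
	LastBump int // artificial increment applied on the previous step, 0 if none
}

// nextProgress returns the state after one reconcile in degraded mode.
// base is the freshly estimated progress; ceiling caps artificial bumps.
func nextProgress(s stallState, base, ceiling int) stallState {
	switch {
	case base > s.Last:
		// Real advancement: report it and forget any earlier bump.
		return stallState{Last: base}
	case s.LastBump > 0 && s.Last-base <= s.LastBump:
		// base trails only by the prior bump: hold instead of bumping again.
		return s
	case s.Last >= ceiling:
		// Never bump past the ceiling.
		return s
	default:
		// Stalled: issue a single artificial increment.
		return stallState{Last: s.Last + 1, LastBump: 1}
	}
}
```

In this sketch the reported value never decreases, and a stalled estimate produces one bump followed by holds until the base estimate catches up, instead of one increment per reconcile.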
…yncs Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
…ss store Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
…and adaptive stall Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
- BuildRecord now reads AutoConverge from MigrationConfiguration.AllowAutoConverge instead of vmop.Spec.Force; remove resolveAutoConverge helper
- isAtMaxThrottle: !AutoConverge => always at max (safe mode), AutoConverge => throttle >= 0.99
- Add live TargetDiskError detection via target pod events (FailedAttachVolume/FailedMount)
- Preserve NotConverging terminal reason when migration fails with generic reason
- Add unit tests for IsNotConverging, BuildRecord AutoConverge, and integration tests for TargetPreparing, TargetResumed, SourceSuspended, NotConverging persistence, TargetDiskError live detection
Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
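The isAtMaxThrottle rule from this commit message can be sketched as follows. The `record` type and its field names are assumptions made for the example; only the rule itself (no auto-converge means always at max, otherwise throttle >= 0.99) comes from the commit message.

```go
package main

// maxThrottleThreshold is the cutoff quoted in the commit message.
const maxThrottleThreshold = 0.99

// record is a hypothetical, trimmed-down view of the migration record.
type record struct {
	AutoConverge bool    // taken from MigrationConfiguration.AllowAutoConverge
	Throttle     float64 // current throttle as a fraction in 0..1 (assumed scale)
}

// isAtMaxThrottle reports whether no further throttling can be expected.
func isAtMaxThrottle(r record) bool {
	if !r.AutoConverge {
		// Safe mode: without auto-converge the throttle will never rise,
		// so the migration is treated as already at its maximum.
		return true
	}
	return r.Throttle >= maxThrottleThreshold
}
```

Treating the non-auto-converge case as "at max" means that any check built on this helper does not wait for a throttle increase that cannot happen.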
kubelet emits FailedAttachVolume/FailedMount events while pod is in Pending phase — before ContainerCreating state is ever reached. The previous ContainerCreating guard prevented detection entirely. Now check events for any Pending pod that is not being deleted. Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
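A simplified sketch of the check described in this commit message, using plain structs instead of the Kubernetes API types so that it stays self-contained. The `pod` and `event` types and the `targetDiskError` function are hypothetical; filtering on the `Warning` event type is also an assumption of this sketch.

```go
package main

// pod is a hypothetical, trimmed-down view of the target pod.
type pod struct {
	Phase    string // e.g. "Pending", "Running"
	Deleting bool   // stands in for a non-nil deletion timestamp
}

// event is a hypothetical, trimmed-down view of a pod event.
type event struct {
	Type    string // e.g. "Normal", "Warning"
	Reason  string
	Message string
}

// targetDiskError returns the message of the first volume-related failure
// event for a Pending pod that is not being deleted.
func targetDiskError(p pod, events []event) (string, bool) {
	// No ContainerCreating guard: the events arrive while the pod is Pending.
	if p.Phase != "Pending" || p.Deleting {
		return "", false
	}
	for _, e := range events {
		if e.Type != "Warning" {
			continue
		}
		if e.Reason == "FailedAttachVolume" || e.Reason == "FailedMount" {
			return e.Message, true
		}
	}
	return "", false
}
```

A Running pod or a pod under deletion is skipped entirely, so stale events from an earlier attempt do not produce a TargetDiskError in this sketch.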
This reverts commit 56de1b8. Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
a410aa5 to 5619cf3
Description
Implement VMOP migration progress reporting and make sync-phase progress more informative and stable.
Changes in this PR:
- status.progress support for migration VirtualMachineOperation objects;
- Completed reasons for migration lifecycle stages;
- migrationprogress.SyncRangeMin/SyncRangeMax;
- status.migrationState.transferStatus when they are available;
Why do we need it, and what problem does it solve?
Migration VMOP progress was previously coarse and bursty. In practice, users could observe sequences like 1 -> 2 -> 3 -> 16 -> 41 -> 42..51 -> 100, where the visible progress was driven by reconcile cadence and fallback estimation rather than by the actual migration state.
This PR improves the user-facing progress model by:
- transferStatus when they are available;
- +1 per reconcile behavior from the stall bump logic.
As a result, migration progress becomes more actionable for users and easier to debug from VMOP conditions.
This PR depends on the matching 3p-kubevirt API change that moves migration transfer-related fields under VirtualMachineInstanceMigrationState.transferStatus.
What is the expected result?
- VirtualMachineOperation.status.progress on the VMOP object.
- status.migrationState.transferStatus, sync progress should use those runtime values.
- Completed condition reasons should reflect the current migration stage.
Checklist
Changelog entries