-
Notifications
You must be signed in to change notification settings - Fork 80
[fm] resource deletion: tombstones, version guard, tracker-based GC #10368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3727,6 +3727,7 @@ fn print_task_fm_rendezvous(details: &serde_json::Value) { | |
| alerts, | ||
| support_bundles, | ||
| ereport_marking: marking, | ||
| latest_processed_sitrep_version, | ||
| } = match serde_json::from_value::<FmRendezvousStatus>(details.clone()) { | ||
| Err(error) => { | ||
| eprintln!( | ||
|
|
@@ -3738,7 +3739,36 @@ fn print_task_fm_rendezvous(details: &serde_json::Value) { | |
| Ok(status) => status, | ||
| }; | ||
| match sitrep_id { | ||
| Some(id) => println!(" current sitrep: {id}"), | ||
| Some(id) => { | ||
| println!(" current sitrep: {id}"); | ||
| match latest_processed_sitrep_version { | ||
| Some(v) => { | ||
| // We show this line when `fm_rendezvous_progress_advance` | ||
| // succeeded, meaning that either the sitrep successfully | ||
| // executed, or it was stale and the work was skipped. | ||
| println!(" latest fully-processed sitrep version: {v}",) | ||
| } | ||
| None => { | ||
| // Disambiguate: a `None` here means either the advance | ||
| // was skipped because at least one subtask recorded | ||
| // per-request errors, or the advance call itself failed. | ||
| // Both cases are also logged separately by Nexus. | ||
| let any_subtask_errors = !alerts.details.errors.is_empty() | ||
| || !support_bundles.details.errors.is_empty() | ||
| || !marking.details.errors.is_empty(); | ||
| if any_subtask_errors { | ||
| println!( | ||
| "(i) tracker advance skipped: subtasks recorded \ | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I mentioned this below, but I find the term "tracker" vague, and hard to parse here without knowing the precise context of this PR: Is it tracking the most recent sitrep? The sitrep we tried to execute? The sitrep which fully executed? That nuance matters, and "tracker" omits it |
||
| per-request errors" | ||
| ); | ||
| } else { | ||
| println!( | ||
| "(i) tracker advance failed at this activation" | ||
| ); | ||
| } | ||
| } | ||
| } | ||
| } | ||
| None => println!( | ||
| "(i) no FM situation report loaded, so rendezvous was not \ | ||
| performed", | ||
|
|
@@ -3751,20 +3781,25 @@ fn print_task_fm_rendezvous(details: &serde_json::Value) { | |
| total_alerts_requested, | ||
| current_sitrep_alerts_requested, | ||
| alerts_created, | ||
| stale_sitrep, | ||
| errors, | ||
| }| { | ||
| let already_created = | ||
| total_alerts_requested - alerts_created - errors.len(); | ||
| let already_created = total_alerts_requested | ||
| - alerts_created | ||
| - stale_sitrep | ||
| - errors.len(); | ||
| const REQUESTED: &str = "alerts requested:"; | ||
| const REQUESTED_THIS_SITREP: &str = " requested in this sitrep:"; | ||
| const CREATED: &str = " created in this activation:"; | ||
| const ALREADY_CREATED: &str = " already created:"; | ||
| const STALE_SITREP: &str = " skipped (stale sitrep):"; | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i'm not totally sure what "skipped (stale sitrep)" means --- is this referring to the count of requested entities which were not created because they were already deleted in a future sitrep? it would be nice if this was a bit clearer about its meaning... |
||
| const ERRORS: &str = " errors:"; | ||
| const WIDTH: usize = const_max_len(&[ | ||
| REQUESTED, | ||
| REQUESTED_THIS_SITREP, | ||
| CREATED, | ||
| ALREADY_CREATED, | ||
| STALE_SITREP, | ||
| ERRORS, | ||
| ]) + 1; | ||
| pub const NUM_WIDTH: usize = 4; | ||
|
|
@@ -3779,6 +3814,7 @@ fn print_task_fm_rendezvous(details: &serde_json::Value) { | |
| println!( | ||
| " {ALREADY_CREATED:<WIDTH$}{already_created:>NUM_WIDTH$}" | ||
| ); | ||
| println!(" {STALE_SITREP:<WIDTH$}{stale_sitrep:>NUM_WIDTH$}"); | ||
| println!( | ||
| "{} {ERRORS:<WIDTH$}{:>NUM_WIDTH$}", | ||
| warn_if_nonzero(errors.len()), | ||
|
|
@@ -3796,20 +3832,25 @@ fn print_task_fm_rendezvous(details: &serde_json::Value) { | |
| total_bundles_requested, | ||
| current_sitrep_bundles_requested, | ||
| bundles_created, | ||
| stale_sitrep, | ||
| errors, | ||
| }| { | ||
| let already_created = | ||
| total_bundles_requested - bundles_created - errors.len(); | ||
| let already_created = total_bundles_requested | ||
| - bundles_created | ||
| - stale_sitrep | ||
| - errors.len(); | ||
| const REQUESTED: &str = "support bundles requested:"; | ||
| const REQUESTED_THIS_SITREP: &str = " requested in this sitrep:"; | ||
| const CREATED: &str = " created in this activation:"; | ||
| const ALREADY_CREATED: &str = " already created:"; | ||
| const STALE_SITREP: &str = " skipped (stale sitrep):"; | ||
| const ERRORS: &str = " errors:"; | ||
| const WIDTH: usize = const_max_len(&[ | ||
| REQUESTED, | ||
| REQUESTED_THIS_SITREP, | ||
| CREATED, | ||
| ALREADY_CREATED, | ||
| STALE_SITREP, | ||
| ERRORS, | ||
| ]) + 1; | ||
| pub const NUM_WIDTH: usize = 4; | ||
|
|
@@ -3824,6 +3865,7 @@ fn print_task_fm_rendezvous(details: &serde_json::Value) { | |
| println!( | ||
| " {ALREADY_CREATED:<WIDTH$}{already_created:>NUM_WIDTH$}" | ||
| ); | ||
| println!(" {STALE_SITREP:<WIDTH$}{stale_sitrep:>NUM_WIDTH$}"); | ||
| println!( | ||
| "{} {ERRORS:<WIDTH$}{:>NUM_WIDTH$}", | ||
| warn_if_nonzero(errors.len()), | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -48,6 +48,12 @@ pub struct Alert { | |
|
|
||
| /// The ID of the fault management case that created this alert, if any. | ||
| pub case_id: Option<DbTypedUuid<CaseKind>>, | ||
|
|
||
| /// Tombstone timestamp: set when an FM-created alert is soft-deleted. | ||
| /// | ||
| /// Only meaningful when `case_id` is `Some` (FM-created alerts). | ||
| /// Non-FM alerts continue to hard-delete and never have this set. | ||
| pub time_deleted: Option<DateTime<Utc>>, | ||
| } | ||
|
Comment on lines
+54
to
57
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not convinced that this is true (or necessary?), since we don't really have any other code paths that delete alerts at this time. We might end up also having those code paths soft-delete alert records as well. I don't think it's necessarily worth designing for both soft- and hard-deletes depending on where the alert came from at this point? |
||
|
|
||
| impl Alert { | ||
|
|
@@ -69,6 +75,7 @@ impl Alert { | |
| payload: payload.into(), | ||
| num_dispatched: 0, | ||
| case_id: None, | ||
| time_deleted: None, | ||
| } | ||
| } | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small note here: i notice that here we are printing the current sitrep's ID, and the version of the last fully executed sitrep...this is a bit unhelpful, as the reader cannot tell whether or not those refer to the same sitrep without having to run more commands to figure out which sitrep the latest executed version is. what do you think about maybe changing the
FmRendezvousStatustype to include both the current sitrep's ID and version, so that we can print something like: