test failed in CI: migration_update_can_complete_with_dead_switch #10439

@bnaecker

This test failed in a CI run for #10318:

https://github.com/oxidecomputer/omicron/pull/10318/checks?check_run_id=75693992927

Log showing the specific test failure:

https://buildomat.eng.oxide.computer/wg/0/details/01KRFCQNVN1194YRYR14GCH0RB/pnpg9UJFAwdOraQiImLgVrbwRCmD4SpXpDZuYlhKo9ebeR6S/01KRFCR6QJ2QSXXBGXF8RVPMVZ#S8532

Excerpt from the log showing the failure:

TRY 1 FAIL [  29.877s] (─────────) omicron-nexus app::sagas::instance_update::test::migration_update_can_complete_with_dead_switch
stdout ───

running 1 test
sled 7112972d-accf-460a-8811-e48dfe614634 successfully installed routes ResolvedVpcRouteSet { id: RouterId { vni: Vni(1350137), kind: System }, version: Some(RouterVersion { router_id: ca8971e2-27b1-4bc2-a0b5-c3a140da10ef, version: 3 }), routes: {ResolvedVpcRoute { dest: V4(Ipv4Net { addr: 172.30.0.0, width: 22 }), target: VpcSubnet(V4(Ipv4Net { addr: 172.30.0.0, width: 22 })) }, ResolvedVpcRoute { dest: V6(Ipv6Net { addr: ::, width: 0 }), target: InternetGateway(Instance(b6181eeb-ec32-4d3a-87a2-3e5b807122e8)) }, ResolvedVpcRoute { dest: V6(Ipv6Net { addr: fdcb:304d:3cf4::, width: 64 }), target: VpcSubnet(V6(Ipv6Net { addr: fdcb:304d:3cf4::, width: 64 })) }, ResolvedVpcRoute { dest: V4(Ipv4Net { addr: 0.0.0.0, width: 0 }), target: InternetGateway(Instance(b6181eeb-ec32-4d3a-87a2-3e5b807122e8)) }} }
sled 7112972d-accf-460a-8811-e48dfe614634 successfully installed routes ResolvedVpcRouteSet { id: RouterId { vni: Vni(1350137), kind: Custom(V4(Ipv4Net { addr: 172.30.0.0, width: 22 })) }, version: None, routes: {} }
test app::sagas::instance_update::test::migration_update_can_complete_with_dead_switch ... FAILED

failures:

failures:
    app::sagas::instance_update::test::migration_update_can_complete_with_dead_switch

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 276 filtered out; finished in 29.40s

stderr ───
log file: /var/tmp/omicron_tmp/omicron_nexus-6343daf2d8657143-migration_update_can_complete_with_dead_switch.16843.0.log
note: configured to log to "/var/tmp/omicron_tmp/omicron_nexus-6343daf2d8657143-migration_update_can_complete_with_dead_switch.16843.0.log"
DB URL: postgresql://root@[::1]:46248/omicron?sslmode=disable
DB address: [::1]:46248
log file: /var/tmp/omicron_tmp/omicron_nexus-6343daf2d8657143-migration_update_can_complete_with_dead_switch.16843.2.log
note: configured to log to "/var/tmp/omicron_tmp/omicron_nexus-6343daf2d8657143-migration_update_can_complete_with_dead_switch.16843.2.log"
log file: /var/tmp/omicron_tmp/omicron_nexus-6343daf2d8657143-migration_update_can_complete_with_dead_switch.16843.3.log
note: configured to log to "/var/tmp/omicron_tmp/omicron_nexus-6343daf2d8657143-migration_update_can_complete_with_dead_switch.16843.3.log"

thread 'app::sagas::instance_update::test::migration_update_can_complete_with_dead_switch' (2) panicked at nexus/src/app/sagas/instance_update/mod.rs:2521:9:
assertion failed: switch0_dpd_client.dpd_uptime().await.is_err()
stack backtrace:
   0: __rustc::rust_begin_unwind
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/std/src/panicking.rs:689:5
   1: core::panicking::panic_fmt
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/core/src/panicking.rs:80:14
   2: core::panicking::panic
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/core/src/panicking.rs:150:5
   3: {async_block#0}
             at ./src/app/sagas/instance_update/mod.rs:2521:9
   4: poll<&mut dyn core::future::future::Future<Output=()>>
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/core/src/future/future.rs:133:9
   5: poll<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/core/src/future/future.rs:133:9
   6: {closure#0}<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:778:70
   7: with_budget<core::task::poll::Poll<()>, tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/task/coop/mod.rs:167:5
   8: budget<core::task::poll::Poll<()>, tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/task/coop/mod.rs:133:5
   9: {closure#0}<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:778:25
  10: <tokio::runtime::scheduler::current_thread::Context>::enter::<core::task::poll::Poll<()>, <tokio::runtime::scheduler::current_thread::CoreGuard>::block_on<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>>::{closure#0}::{closure#0}>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:451:19
  11: {closure#0}<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:777:44
  12: <tokio::runtime::scheduler::current_thread::CoreGuard>::enter::<<tokio::runtime::scheduler::current_thread::CoreGuard>::block_on<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>>::{closure#0}, core::option::Option<()>>::{closure#0}
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:865:68
  13: <tokio::runtime::context::scoped::Scoped<tokio::runtime::scheduler::Context>>::set::<<tokio::runtime::scheduler::current_thread::CoreGuard>::enter<<tokio::runtime::scheduler::current_thread::CoreGuard>::block_on<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>>::{closure#0}, core::option::Option<()>>::{closure#0}, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core>, core::option::Option<()>)>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/context/scoped.rs:40:9
  14: tokio::runtime::context::set_scheduler::<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core>, core::option::Option<()>), <tokio::runtime::scheduler::current_thread::CoreGuard>::enter<<tokio::runtime::scheduler::current_thread::CoreGuard>::block_on<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>>::{closure#0}, core::option::Option<()>>::{closure#0}>::{closure#0}
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/context.rs:181:38
  15: try_with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<()>), tokio::runtime::scheduler::current_thread::{impl#9}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#9}::block_on::{closure_env#0}<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>>, core::option::Option<()>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<()>)>
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/std/src/thread/local.rs:513:12
  16: <std::thread::local::LocalKey<tokio::runtime::context::Context>>::with::<tokio::runtime::context::set_scheduler<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core>, core::option::Option<()>), <tokio::runtime::scheduler::current_thread::CoreGuard>::enter<<tokio::runtime::scheduler::current_thread::CoreGuard>::block_on<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>>::{closure#0}, core::option::Option<()>>::{closure#0}>::{closure#0}, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core>, core::option::Option<()>)>
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/std/src/thread/local.rs:477:20
  17: tokio::runtime::context::set_scheduler::<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core>, core::option::Option<()>), <tokio::runtime::scheduler::current_thread::CoreGuard>::enter<<tokio::runtime::scheduler::current_thread::CoreGuard>::block_on<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>>::{closure#0}, core::option::Option<()>>::{closure#0}>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/context.rs:181:17
  18: <tokio::runtime::scheduler::current_thread::CoreGuard>::enter::<<tokio::runtime::scheduler::current_thread::CoreGuard>::block_on<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>>::{closure#0}, core::option::Option<()>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:865:27
  19: <tokio::runtime::scheduler::current_thread::CoreGuard>::block_on::<core::pin::Pin<&mut core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:765:24
  20: {closure#0}<core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:205:33
  21: tokio::runtime::context::runtime::enter_runtime::<<tokio::runtime::scheduler::current_thread::CurrentThread>::block_on<core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>::{closure#0}, ()>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/context/runtime.rs:65:16
  22: block_on<core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/scheduler/current_thread/mod.rs:193:9
  23: <tokio::runtime::runtime::Runtime>::block_on_inner::<core::pin::Pin<&mut dyn core::future::future::Future<Output = ()>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/runtime.rs:371:52
  24: block_on<core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>
             at /home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.1/src/runtime/runtime.rs:345:18
  25: migration_update_can_complete_with_dead_switch
             at ./src/app/sagas/instance_update/mod.rs:2570:35
  26: omicron_nexus::app::sagas::instance_update::test::migration_update_can_complete_with_dead_switch::{closure#0}
             at ./src/app/sagas/instance_update/mod.rs:2464:62
  27: <omicron_nexus::app::sagas::instance_update::test::migration_update_can_complete_with_dead_switch::{closure#0} as core::ops::function::FnOnce<()>>::call_once
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/core/src/ops/function.rs:250:5
  28: core::ops::function::FnOnce::call_once
             at /rustc/4a4ef493e3a1488c6e321570238084b38948f6db/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
WARN: dropped CockroachInstance without cleaning it up first (there may still be a child process running and a temporary directory leaked)
WARN: temporary directory leaked: "/var/tmp/omicron_tmp/.tmpb4ZTfB"
	If you would like to access the database for debugging, run the following:

	# Run the database
	cargo xtask db-dev run --no-populate --store-dir "/var/tmp/omicron_tmp/.tmpb4ZTfB/data"
	# Access the database. Note the port may change if you run multiple databases.
	cockroach sql --host=localhost:32221 --insecure
WARN: dropped ClickHouse process without cleaning it up first (there may still be a child process running (PID 16850) and a temporary directory leaked, /var/tmp/omicron_tmp/omicron_nexus-6343daf2d8657143-migration_update_can_complete_with_dead_switch.16843.1-clickhouse-QTmFTd)
failed to clean up ClickHouse data dir:
- /var/tmp/omicron_tmp/omicron_nexus-6343daf2d8657143-migration_update_can_complete_with_dead_switch.16843.1-clickhouse-QTmFTd: File exists (os error 17)
WARN: dropped DendriteInstance without cleaning it up first (there may still be a child process running and a temporary directory leaked)
WARN: dendrite temporary directory leaked: /var/tmp/omicron_tmp/.tmpWxcwmc
WARN: dropped MgdInstance without cleaning it up first (there may still be a child process running and a temporary directory leaked)
WARN: mgd temporary directory leaked: /var/tmp/omicron_tmp/.tmp5VV1qL
WARN: dropped MgdInstance without cleaning it up first (there may still be a child process running and a temporary directory leaked)
WARN: mgd temporary directory leaked: /var/tmp/omicron_tmp/.tmpIfENMA

Labels: Test Flake
