
Nav pt4: Fix transform frames #2112

Draft
jeff-hykin wants to merge 63 commits into main from jeff/clean/nav3

Conversation

@jeff-hykin
Member

(Will fill out later)

Problem

Closes DIM-XXX

Solution

How to Test

Contributor License Agreement

  • I have read and approved the CLA.

jeff-hykin added 30 commits May 12, 2026 07:48
… rrb

Native module (cpp/main.cpp) now publishes two new streams on every
keyframe: GraphNodes3D for keyframe optimized poses, LineSegments3D for
odometry (traversability=1.0) and loop-closure (0.4) edges. Both wire
through SimplePGO::keyPoses() + historyPairs() — no changes needed to
simple_pgo.{h,cpp} since the accessors already exist. Native binary
rebuilt cleanly via nix build .#default --no-write-lock-file.

Python (pgo.py) declares matching pgo_graph_nodes / pgo_graph_edges Out
streams so the rerun bridge auto-discovers and logs them.

nav_stack_rerun_config() now picks _agentic_debug_rerun_blueprint when
agentic_debug=True — an rrb.Horizontal layout with a 3D pane and a
dedicated top-down pane (both Spatial3DView over origin="world", named
"3D" and "top_down" so dimos-viewer persists camera state separately).

demo_better_pgo_viz.py composes the cross-wall sim blueprint with
agentic_debug=True so the new layout + pose graph render together. Used
for manual screenshot validation.
Adds visual_override entries for world/pgo_graph_nodes and
world/pgo_graph_edges that mirror the existing FAR pattern: when
agentic_debug=True, the PGO pose graph renders at z=_AGENTIC_DEBUG_LIFT
(3.0m) instead of the default 1.7m, with slightly larger node radii
(0.15) and edge thickness (0.06) so the green keyframe trajectory
stands out clearly above the terrain cloud in the top-down pane.

Verified visually via demo_better_pgo_viz with the cross-wall sim —
green keyframe nodes + edges are now plainly identifiable above
terrain in both the 3D and top_down rerun panels.
rerun's Spatial3DView doesn't have a top-down camera API, so the
"top_down" pane introduced in a7a9be9 was just a duplicate 3D view.
Drop _agentic_debug_rerun_blueprint and use _default_rerun_blueprint
unconditionally — the agentic_debug lift on visual_override is what
actually makes the pose graph and nav markers readable from any angle.
C++ side (main.cpp): when searchForLoopPairs sets m_cache_pairs (i.e.
this keyframe will be incorporated into iSAM2 with a loop factor),
snapshot the current global poses before smoothAndUpdate. After the
update, build a nav_msgs::Path-encoded LoopClosureDeltas message:
position = post.t - r_delta * pre.t, orientation = quaternion(post.R *
pre.R^T). Publish on the new pgo_loop_closure topic. Stderr logs the
event count for live observability.
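
The delta formulas above can be checked with a small sketch. It is illustrative only: plain Python lists stand in for the C++ rotation and translation types, every helper name is hypothetical, and r_delta is assumed to be post.R * pre.R^T (which is what the orientation formula suggests). The quaternion conversion uses only the simple branch that is valid away from 180 degree rotations, which fits the small deltas expected here.

```python
import math


def mat_mul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(m):
    """3x3 transpose (the inverse of a rotation matrix)."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def mat_vec(m, v):
    """3x3 matrix times 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def quaternion_from_rotation(r):
    """Return (x, y, z, w); only valid when the rotation is not near 180 degrees."""
    trace = r[0][0] + r[1][1] + r[2][2]
    w = math.sqrt(max(0.0, 1.0 + trace)) / 2.0
    if w < 1e-6:
        raise ValueError("rotation too close to 180 degrees for this simple branch")
    scale = 1.0 / (4.0 * w)
    return (
        (r[2][1] - r[1][2]) * scale,
        (r[0][2] - r[2][0]) * scale,
        (r[1][0] - r[0][1]) * scale,
        w,
    )


def loop_closure_delta(pre_r, pre_t, post_r, post_t):
    """Delta that maps the pre-update pose onto the post-update pose."""
    r_delta = mat_mul(post_r, transpose(pre_r))           # post.R * pre.R^T
    rotated = mat_vec(r_delta, pre_t)
    t_delta = [post_t[i] - rotated[i] for i in range(3)]  # post.t - r_delta * pre.t
    return r_delta, t_delta, quaternion_from_rotation(r_delta)


def yaw_rotation(yaw):
    """Rotation about z by `yaw` radians (used only to build example poses)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

Applying the delta to the pre-update pose (r_delta * pre.t + t_delta) should reproduce post.t, which is a convenient sanity check for any consumer of the message.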

Python side (pgo.py): declare pgo_loop_closure: Out[NavPath] so the
new topic is registered alongside corrected_odometry/pgo_tf/etc.

Slow test (test_pgo_loop_closure.py): replays og_nav_60s through the
native binary with permissive thresholds (loop_time_thresh=5s,
min_loop_detect_duration=1s, loop_search_radius=2m,
loop_score_thresh=0.5) so the recording reliably triggers loop
closures. Subscribes to pgo_loop_closure, logs each event the moment
it arrives (event #, poses_length, frame_id, first delta), and after
the run validates each event has >0 poses, finite translations
(<100m), and unit-norm quaternions (drift <0.05). Stdout from a run
shows 19 events, sizes 10..35, max |t|=0.0013m, max ||q|-1|=1e-6, which is
exactly the small-nudge profile expected from a self-consistent
recording.
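
The per-event checks described above can be sketched as follows. This is an illustration, not the code in test_pgo_loop_closure.py: the tuple layout of a pose and the function name are assumptions, while the limits (more than zero poses, translations finite and under 100 m, quaternion norm within 0.05 of 1) come from the description.

```python
import math

MAX_TRANSLATION_M = 100.0    # from the description: finite translations (<100m)
MAX_QUATERNION_DRIFT = 0.05  # from the description: unit-norm quaternions (drift <0.05)


def validate_event(poses):
    """Return a list of problems for one loop-closure event (empty means valid).

    `poses` is assumed to be a list of ((tx, ty, tz), (qx, qy, qz, qw)) tuples.
    """
    if not poses:
        return ["event has no poses"]
    problems = []
    for index, (translation, quaternion) in enumerate(poses):
        if not all(math.isfinite(value) for value in translation):
            problems.append(f"pose {index}: non-finite translation")
        elif math.sqrt(sum(value * value for value in translation)) >= MAX_TRANSLATION_M:
            problems.append(f"pose {index}: translation exceeds {MAX_TRANSLATION_M} m")
        norm = math.sqrt(sum(value * value for value in quaternion))
        if not math.isfinite(norm) or abs(norm - 1.0) >= MAX_QUATERNION_DRIFT:
            problems.append(f"pose {index}: quaternion norm {norm} is not unit")
    return problems
```

Returning a list of problems rather than asserting on the first failure keeps the diagnostics for a whole event together in one log line.
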
Replaces the kdtree-on-keyframe-positions loop search with a Scan
Context (Kim & Kim 2018) descriptor-based pipeline:

  1. addKeyPose now also caches a polar-binned (20 rings × 60 sectors)
     max-z descriptor + the per-row mean "ring key" for each keyframe.
     The descriptor is appearance-based and pose-independent, so it
     keeps working even when odometry has drifted enough that the new
     keyframe is no longer "near" its old neighbours in pose-space.

  2. searchForLoopPairs first asks Scan Context for a candidate:
     ring-key L2 distance ranks all past keyframes, top-K are scored
     by column-shifted cosine distance on the full descriptor, the
     best below the threshold (default 0.4) is the candidate. The
     winning column shift is also converted to a yaw rotation and used
     to seed ICP, which dramatically improves convergence on revisits
     that arrive at a different heading from the original pass.

  3. Position-based search is retained as a fallback when SC is
     disabled or finds nothing, so existing behaviour is preserved.
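
The three steps above can be sketched in plain Python. This is a simplified illustration of the Scan Context idea, not the contents of scan_context.{h,cpp}: the maximum range, the top_k default, the yaw sign convention, and all function bodies are assumptions, and heights below zero are ignored because bins start at 0.0.

```python
import math

N_RINGS = 20        # from the commit message
N_SECTORS = 60      # from the commit message
MAX_RANGE_M = 80.0  # assumed; the real default is not stated here


def make_descriptor(points):
    """Polar-binned max-z descriptor: rows are rings, columns are sectors."""
    descriptor = [[0.0] * N_SECTORS for _ in range(N_RINGS)]
    for x, y, z in points:
        planar_range = math.hypot(x, y)
        if planar_range >= MAX_RANGE_M:
            continue
        ring = min(int(planar_range / MAX_RANGE_M * N_RINGS), N_RINGS - 1)
        angle = math.atan2(y, x) % (2.0 * math.pi)
        sector = min(int(angle / (2.0 * math.pi) * N_SECTORS), N_SECTORS - 1)
        descriptor[ring][sector] = max(descriptor[ring][sector], z)
    return descriptor


def make_ring_key(descriptor):
    """Per-row mean; unchanged when the scan is rotated about z."""
    return [sum(row) / N_SECTORS for row in descriptor]


def ring_key_distance(key_a, key_b):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(key_a, key_b)))


def shifted_cosine_distance(query, candidate, shift):
    """Mean column-wise cosine distance with the candidate shifted by `shift` sectors."""
    total, valid = 0.0, 0
    for column in range(N_SECTORS):
        shifted = (column + shift) % N_SECTORS
        query_column = [query[ring][column] for ring in range(N_RINGS)]
        candidate_column = [candidate[ring][shifted] for ring in range(N_RINGS)]
        query_norm = math.sqrt(sum(v * v for v in query_column))
        candidate_norm = math.sqrt(sum(v * v for v in candidate_column))
        if query_norm == 0.0 or candidate_norm == 0.0:
            continue  # empty sectors carry no evidence either way
        dot = sum(a * b for a, b in zip(query_column, candidate_column))
        total += 1.0 - dot / (query_norm * candidate_norm)
        valid += 1
    return total / valid if valid else 1.0


def best_distance(query, candidate):
    """Return (min_distance, shift) over all column shifts."""
    min_distance, best_shift = float("inf"), 0
    for shift in range(N_SECTORS):
        distance = shifted_cosine_distance(query, candidate, shift)
        if distance < min_distance:
            min_distance, best_shift = distance, shift
    return min_distance, best_shift


def shift_to_yaw(shift):
    """Convert the winning column shift into a yaw seed in radians (sign is assumed)."""
    return shift * 2.0 * math.pi / N_SECTORS


def find_candidate(query, past_descriptors, top_k=10, threshold=0.4):
    """Rank by ring-key distance, score the top-K, return (index, distance, shift) or None."""
    query_key = make_ring_key(query)
    ranked = sorted(
        range(len(past_descriptors)),
        key=lambda i: ring_key_distance(query_key, make_ring_key(past_descriptors[i])),
    )
    best = None
    for index in ranked[:top_k]:
        distance, shift = best_distance(query, past_descriptors[index])
        if distance < threshold and (best is None or distance < best[1]):
            best = (index, distance, shift)
    return best
```

The ring key is a cheap, rotation-invariant prefilter, so the expensive shift search only runs on the top-K candidates. A real implementation would cache ring keys per keyframe instead of recomputing them on each query as this sketch does.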

Replaces ~50 lines of position-search with ~30 lines of SC retrieval
in searchForLoopPairs; new scan_context.{h,cpp} (~150 lines, MIT
attribution to upstream irapkaist/scancontext concepts but no source
copied) implements the descriptor + distance.

Side-effect: this makes on-start relocalization a small follow-up
addition — descriptors + ring-keys + poses are now per-keyframe state
that can be serialised, and the SC search path already does
"appearance-based pose recovery without an initial pose guess."

Verified via test_pgo_loop_closure.py: 17 loop-closure events fired
across the og_nav_60s rosbag (was 19 with naive position search; SC
is more selective and rejects two borderline-position matches that
weren't actually visual revisits). All events have valid shape + tiny
quaternion/translation deltas as expected for a self-consistent bag.
…n search misses

Adds CLI args to expose Scan Context config on the native binary
(--use_scan_context, --sc_n_rings, --sc_n_sectors, --sc_max_range_m,
--sc_top_k, --sc_match_threshold).

New slow test test_pgo_synthetic_drift.py:
- Synthesises a 4-wall point-cloud room with two distinctive interior
  columns (so the scene isn't rotationally symmetric).
- Generates an out-and-back trajectory: drives east 8m then returns
  to the origin, heading unchanged.
- Injects DRIFT_AT_REVISIT_M = 5m of additive y-drift into the
  reported odometry, ramped linearly with travelled distance. The
  body-frame scan stays byte-identical between first and second visit
  (same true sensor view of the same scene); the odom pose at revisit
  is 5m offset.
- Runs the native PGO binary twice over the same input:
  * use_scan_context=true  → expect ≥1 loop event
  * use_scan_context=false → expect 0 loop events (drift >> 1m radius)
- Dumps PGO stderr after each run for diagnostics.

Result: SC fires 10 loop closure events on the synthetic trajectory;
position-based search fires 0 — exactly the demonstration of why we
swapped to appearance-based place recognition. Both assertions pass.

Verifies the core SC value prop: appearance-based place recognition
doesn't depend on the (drifted) pose, so it keeps working when the
odometry has wandered far enough that the kdtree-on-positions search
no longer finds neighbours.
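
The drift injection described above can be sketched like this. It is not the code in test_pgo_synthetic_drift.py: the step spacing and function names are assumptions, while the 8 m leg, the 5 m drift at revisit, and the 1 m search radius come from the description.

```python
import math

DRIFT_AT_REVISIT_M = 5.0        # from the test description
LEG_LENGTH_M = 8.0              # drives east 8 m, then returns
POSITION_SEARCH_RADIUS_M = 1.0  # from "drift >> 1m radius"
STEP_M = 0.5                    # assumed pose spacing


def true_trajectory():
    """Out-and-back ground-truth (x, y) positions, heading unchanged."""
    steps = int(LEG_LENGTH_M / STEP_M)
    outbound = [(i * STEP_M, 0.0) for i in range(steps + 1)]
    inbound = [(LEG_LENGTH_M - i * STEP_M, 0.0) for i in range(1, steps + 1)]
    return outbound + inbound


def inject_drift(trajectory):
    """Add y-drift that grows linearly with travelled distance."""
    total = sum(math.dist(a, b) for a, b in zip(trajectory, trajectory[1:]))
    drifted, travelled = [trajectory[0]], 0.0
    for previous, current in zip(trajectory, trajectory[1:]):
        travelled += math.dist(previous, current)
        drift = DRIFT_AT_REVISIT_M * travelled / total
        drifted.append((current[0], current[1] + drift))
    return drifted


truth = true_trajectory()
reported = inject_drift(truth)
# The robot is truly back at the origin, but odometry reports it 5 m away,
# well outside the position-search radius.
revisit_error = math.dist(reported[0], reported[-1])
```

Because the body-frame scan is identical on both visits, only an appearance-based search can pair the first and last poses; a position search with a 1 m radius never sees them as neighbours.
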
Test files now use setup_logger() / logger.info(...) per the
fix_nits rule "no print() calls in tests; use logging if diagnostics
are genuinely needed." Matches the existing test_pgo_rosbag.py
convention. Also drops the now-unused sys import.

Also clears a stale docstring on demo_better_pgo_viz.py: it claimed
the demo enabled a "horizontal 3D + top-down panes" layout, which was
reverted in 1801759 — rerun's Spatial3DView didn't support an
initial camera angle (rrb.EyeControls3D existed at the time but
wasn't used). The remaining value of agentic_debug=True is the visual
override lift, which the new docstring describes accurately.

No behavioural change. Tests still pass.
Sweep over names introduced by the better_pgo work that hit fix_nits
"expand mod -> module" rule:

- scan_context: cfg -> config (param + 12 call-sites); d (return val) ->
  descriptor in make_descriptor/make_ring_key/make_sector_key; pt -> point
  in the descriptor build loop; zf -> point_z (float cast); q_col/c_col
  -> query_column/candidate_column; q_norm/c_norm -> query_norm/
  candidate_norm; cj -> shifted_j; d (in best_distance return loop) ->
  distance with min_distance for the running best.

- simple_pgo: desc -> descriptor on the per-keyframe cache; k ->
  top_k_count for the partial-sort bound; structured-binding `auto [d,
  shift]` -> `auto [distance, shift]`.

- main.cpp: kp -> keyframe; ps -> pose_stamped (build_graph_nodes and
  build_loop_closure_deltas); a/b -> start/end and p1/p2 ->
  start_pose/end_pose in append_segment; n -> count for the loop bound;
  lc_msg -> loop_closure_msg at the publish site.

- tests: ps -> pose in the validate loop (test_pgo_loop_closure);
  c,s -> cos_yaw,sin_yaw in _yaw_rotation (test_pgo_synthetic_drift).

Names that intentionally stay short are the math-convention ones:
r/t for SE(3) rotation+translation, q for quaternion, i/j as loop
indices, idx as keyframe index, ts as timestamp, dt for time delta,
tx/ty/tz/qx/qy/qz/qw for component decomposition. The fix_nits rule
calls out mod/lc as the target pattern; expanding the math-notation
names would make the code less readable, not more.

Also drops one section-label comment ("# Log each event the moment it
arrives.") whose adjacent function name already conveys the same and
one in-loop "# node_type 1 = odom/robot" that repeats info already
stated in the function-level docstring.

Native binary rebuilt + slow test still passes (17 events, all valid).
Drops in the wiring for evaluating the PGO native module on KITTI-360.
Cannot run end-to-end yet — the dataset is gated behind a registered
login at cvlibs.net so the data download is a manual user step.

What's here:
- kitti360_loader.py: parses the KITTI-360 directory layout (data_3d_raw
  + data_poses + calibration); composes per-frame lidar→world pose by
  chaining cam0_to_world ⊕ inv(velo_to_cam). Exposes a frame iterator
  + scan_xyz(frame_id).
- loop_groundtruth.py: LCDNet/KITTI-convention groundtruth (≥50 frame
  gap, ≤4m radius), order-agnostic scoring of detected pairs.
- run_kitti360_benchmark.py: argparse CLI, spawns the native binary on
  private LCM topics, plays (registered_scan, odometry) from disk,
  subscribes to pgo_graph_edges to extract loop-closure pairs (via
  traversability ≈ 0.4 segments) and pgo_loop_closure for delta event
  counts. Writes JSON.
- README.md: download instructions for the official "Test SLAM 3D"
  12 GB package, published SOTA reference numbers from LCDNet + ISC
  papers (LCDNet 0.91-0.93 AP, Scan Context 0.62-0.78 AP), expected
  ballpark for our minimal SC port.
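
The groundtruth rule and order-agnostic scoring mentioned for loop_groundtruth.py can be sketched as below. This is a hedged illustration: the function names and the brute-force pair search are assumptions, and only the thresholds (at least 50 frames apart, at most 4 m apart) come from the description.

```python
import math

MIN_FRAME_GAP = 50  # from the description: >=50 frame gap
MAX_RADIUS_M = 4.0  # from the description: <=4m radius


def groundtruth_pairs(positions):
    """All frame pairs that count as true loop closures, as unordered sets."""
    pairs = set()
    for later in range(len(positions)):
        for earlier in range(later - MIN_FRAME_GAP + 1):
            if math.dist(positions[earlier], positions[later]) <= MAX_RADIUS_M:
                pairs.add(frozenset((earlier, later)))
    return pairs


def score(detected, truth):
    """Precision and recall; (i, j) and (j, i) are treated as the same pair."""
    detected_set = {frozenset(pair) for pair in detected}
    true_positives = len(detected_set & truth)
    precision = true_positives / len(detected_set) if detected_set else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

Storing pairs as frozensets is what makes the scoring order-agnostic. The quadratic search is fine for a sketch but would likely need a spatial index on full KITTI-360 sequences.
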
jeff-hykin and others added 18 commits May 15, 2026 20:56
mypy can't infer parameter types on a lambda subscribed to LCM; lift
the body into a tiny factory function with explicit
Callable[[str, bytes], None] signature so the lint job passes.
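
A minimal sketch of that pattern, with hypothetical names (make_collector and sink are not from the PR): the handler is built by a factory whose return type is spelled out, so mypy no longer has to infer parameter types from a lambda.

```python
from typing import Callable


def make_collector(sink: list[tuple[str, bytes]]) -> Callable[[str, bytes], None]:
    """Build an LCM-style handler that appends each (channel, payload) to `sink`."""

    def on_message(channel: str, data: bytes) -> None:
        sink.append((channel, data))

    return on_message


received: list[tuple[str, bytes]] = []
handler = make_collector(received)
# With a real LCM instance this would be passed to its subscribe call in place
# of the lambda; here the handler is invoked directly.
handler("example_topic", b"\x00\x01")
```
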
sklearn doesn't ship a py.typed marker; the new place_recognition_ap
benchmark is the only sklearn user in dimos, so a per-import ignore is
the smallest fix to unstick the lint job.
…ark)

Brings in better_pgo's PGO improvements onto the TF-rework branch:

- pose graph output renames: --pgo_graph_nodes/edges/loop_closure ->
  --pose_graph_nodes/edges, --loop_closure (matching port names too).
- Scan-context place recognition for loop closure (scan_context.{h,cpp}).
- New benchmark/eval infrastructure under nav_stack/benchmarks/
  pose_graph_kitti360/ + pgo/benchmark_*.py.
- C++ nits: mod -> native_module, cp -> cloud_with_pose, tresh -> thresh.
- fastlio2 + replanning_a_star noise cleanup (PR #2095).
- Simulation CLI arg fix (PR #2103) and sklearn mypy ignore.

Conflicts resolved (6 files):
- pgo.py / nav_record.py: kept our typed Out[GraphNodes3D]/Out[LineSegments3D]
  ports, renamed to pose_graph_nodes/edges + loop_closure. Dropped their
  corrected_tf Out[Odometry] port (Python TF-relay path is gone).
- cpp/main.cpp: kept our TFMessage publishing on /tf (parent_frame, body_frame,
  tf_channel args via cli_name_override). Took their --pose_graph_*,
  --loop_closure renames + helper-name nits.
- test_pgo_loop_closure / rosbag: kept --tf_channel + TFMessage; took /lc_test_
  /rb_test_ topic-prefix rename.
- test_pgo_synthetic_drift: took theirs (full Modules+Blueprint rewrite).

Known regression: test_scan_context_catches_reverse_loop is now flaky
(2/6 vs 6/6 on origin tip). Same total keyframes but shifted ~0.8m due to
PGO.start() no longer subscribing to corrected_tf / seeding TF in Python,
so the C++ binary starts processing scans slightly earlier. Marked xfail
(strict=False); root-cause fix tracked separately.
@jeff-hykin jeff-hykin marked this pull request as draft May 16, 2026 11:41
Per Jeff: the constant itself is the hack, not the sleep call.
# Conflicts:
#	dimos/navigation/nav_stack/modules/pgo/cpp/main.cpp
#	dimos/navigation/nav_stack/modules/pgo/pgo.py
@greptile-apps
Contributor

greptile-apps Bot commented May 16, 2026

Greptile Summary

This PR replaces hardcoded TF frame strings (FRAME_MAP, FRAME_ODOM, FRAME_BODY, FRAME_SENSOR) with configurable frame_id/body_frame/parent_frame parameters propagated from create_nav_stack down to every sub-module, and deletes the now-unnecessary frames.py. It also fixes a reversed TF direction in UnityBridgeModule (map→world becomes world→map), moves TF publishing into the PGO C++ binary (replacing the Python pgo_tf Odometry hack with a proper TFMessage), adds ScanContext-based loop-closure to PGO, introduces pose-graph visualization outputs, and adds a new Alfred DIY robot blueprint.

  • Frame refactor: All modules (SimplePlanner, FarPlanner, LocalPlanner, TerrainMapExt, FastLio2, UnityBridge, WavefrontFrontierExplorer) now accept frame names as config fields with defaults matching the new world/map/start_point/current_point hierarchy.
  • PGO overhaul: The C++ binary now publishes a three-frame TF chain (world→map→start_point) directly on the LCM tf channel, exposes pose_graph_nodes, pose_graph_edges, and loop_closure outputs, and integrates ScanContext for geometry-based loop detection.
  • nav_stack_rerun_config max_hz regression: The rewritten throttling logic rebuilds max_hz from only the visual_override key set, silently discarding any user-provided rate limits for entities not in that set.

Confidence Score: 3/5

The frame-ID refactor and TF direction fix are well-structured, but the local planner build is pinned to a mutable feature branch and the rerun config rate-limiting change silently drops user-supplied entries — both need resolution before this is safe on a shared deployment.

Two concrete defects stand out: the LocalPlanner build_command references feat/configurable-body-frame, a branch that can be force-pushed or deleted at any time, breaking reproducible builds for any robot that triggers a rebuild. The nav_stack_rerun_config rewrite discards all user-provided max_hz entries for channels not present in visual_override, which is a silent regression for callers that customise rate-limits for non-standard streams. The rest of the change — frame-name propagation, TF direction fix in Unity, PGO C++ overhaul, and Alfred blueprint — looks sound.
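
One way to avoid dropping user-supplied rate limits is to start from the visual_override defaults and then overlay whatever the caller passed. The sketch below is hypothetical: the real signature of nav_stack_rerun_config is not shown in this review, and merge_max_hz, default_hz, and the example entity names are made up for illustration.

```python
def merge_max_hz(user_max_hz, visual_override_keys, default_hz):
    """Combine default throttles with caller-provided ones.

    Every entity in `visual_override_keys` gets `default_hz`; entries in
    `user_max_hz` then win, including entities that have no visual override.
    """
    merged = {entity: default_hz for entity in visual_override_keys}
    merged.update(user_max_hz)
    return merged


merged = merge_max_hz(
    user_max_hz={"world/custom_stream": 2.0, "world/pose_graph_nodes": 1.0},
    visual_override_keys=["world/pose_graph_nodes", "world/pose_graph_edges"],
    default_hz=5.0,
)
```

Whether a user entry should override the default for an entity that also has a visual override is a design choice; this sketch assumes it should.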

dimos/navigation/nav_stack/modules/local_planner/local_planner.py (unstable branch reference) and dimos/navigation/nav_stack/main.py (max_hz regression in nav_stack_rerun_config)

Important Files Changed

| Filename | Overview |
| --- | --- |
| dimos/navigation/nav_stack/main.py | Adds world_frame/map_frame/start_point_frame/current_point_frame parameters and propagates them to all sub-modules; also rewrites nav_stack_rerun_config's max_hz logic in a way that drops user-provided entries for entities not in visual_override. |
| dimos/navigation/nav_stack/modules/pgo/pgo.py | Replaces legacy world_frame/local_frame config with frame_id/child_frame_id/parent_frame/body_frame; moves TF publishing to the C++ binary via a dedicated tf_channel; removes Python-side TF seed and on_tf_correction wiring. |
| dimos/navigation/nav_stack/modules/pgo/cpp/main.cpp | Major rewrite: adds parent_frame/body_frame args, publishes a proper TFMessage (three-frame chain) instead of an Odometry-as-TF hack, adds pose-graph node/edge/loop-closure visualization outputs, and integrates ScanContext for loop closure. Contains traversability encoding inconsistency vs. Python convention. |
| dimos/navigation/nav_stack/modules/local_planner/local_planner.py | Adds body_frame config field and bumps build_command to an unstable feature branch (feat/configurable-body-frame) instead of a tagged release, which is non-reproducible. |
| dimos/hardware/sensors/lidar/fastlio2/module.py | Migrates hardcoded frame IDs (odom/body) to configurable frame_id/child_frame_id/sensor_frame; also publishes a static sensor-mount TF alongside the odometry TF on each odom message. |
| dimos/simulation/unity/module.py | Fixes a TF direction bug: the second Transform was published as map→world (backwards); it now correctly publishes world→map. Frame IDs are now configurable via frame_id/child_frame_id/parent_frame config fields. |
| dimos/core/module.py | Adds double-checked locking for lazy LCMTF init, but places the lock as a class-level attribute shared by all ModuleBase instances, causing unnecessary cross-instance serialization at startup. |
| dimos/msgs/nav_msgs/LineSegments3D.py | Implements lcm_encode (previously NotImplementedError), encoding segment pairs as LCMPath pose pairs with traversability in orientation.w of the first pose (second gets 0.0); note the C++ encoder sets traversability on both endpoints. |

Sequence Diagram

sequenceDiagram
    participant FL as FastLio2 (C++)
    participant PGO as PGO (C++)
    participant TF as TF Buffer (LCMTF)
    participant SP as SimplePlanner
    participant LP as LocalPlanner (C++)

    FL->>TF: Transform(start_point → current_point, odometry pose)
    FL->>TF: Transform(current_point → mid360_link, static mount)
    FL->>PGO: registered_scan + odometry

    PGO->>TF: TFMessage[world→map (identity), map→start_point (correction)]
    PGO->>SP: corrected_odometry
    PGO-->>SP: pose_graph_nodes / pose_graph_edges / loop_closure

    SP->>TF: get(map, current_point)
    TF-->>SP: world→map→start_point→current_point chain
    SP->>LP: "way_point (frame_id=map)"
    LP->>TF: get(map, current_point) for body pose

Reviews (1): Last reviewed commit: "Merge remote-tracking branch 'origin/mai..."

Comment on lines 40 to +42
  build_command: str | None = (
-     "nix build github:dimensionalOS/dimos-module-local-planner/v0.6.0 --no-write-lock-file"
+     "nix build github:dimensionalOS/dimos-module-local-planner/feat/configurable-body-frame"
+     " --no-write-lock-file"

P1 Build command pins to an unstable feature branch

feat/configurable-body-frame is a mutable git ref: the branch can be force-pushed or deleted at any time, making builds non-reproducible and potentially silently broken in CI or on a fresh checkout. Before merging this should be cut to a tagged release (e.g., v0.7.0) the same way v0.6.0 was used previously.

Comment thread dimos/navigation/nav_stack/main.py
Comment thread dimos/core/module.py
Comment on lines +250 to 258
+     _tf_lock: threading.Lock = threading.Lock()
+
      @property
      def tf(self):  # type: ignore[no-untyped-def]
          if self._tf is None:
-             # self._tf = self.config.tf_transport()
-             self._tf = LCMTF()
+             with self._tf_lock:
+                 if self._tf is None:
+                     self._tf = LCMTF()
          return self._tf

P2 _tf_lock is a class-level attribute shared by every ModuleBase instance

Because _tf_lock is declared directly in the class body, all instances share the exact same threading.Lock object. When dozens of modules initialize concurrently at startup, they all serialize on this single lock even though their _tf fields are entirely independent. Moving initialization to __init__ (e.g., self._tf_lock = threading.Lock()) gives each instance its own lock and eliminates the cross-instance contention.
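
A minimal sketch of the suggested fix, assuming a stand-in FakeTF class in place of LCMTF (which is not importable here) and a stripped-down ModuleBase: the lock is created in __init__, so each instance guards only its own lazy initialization.

```python
import threading


class FakeTF:
    """Hypothetical stand-in for LCMTF; the real class lives in dimos."""


class ModuleBase:
    def __init__(self) -> None:
        self._tf = None
        # One lock per instance, so unrelated modules never wait on each other.
        self._tf_lock = threading.Lock()

    @property
    def tf(self):
        # Double-checked locking: the unlocked check is the fast path, the
        # locked re-check ensures only one thread constructs the object.
        if self._tf is None:
            with self._tf_lock:
                if self._tf is None:
                    self._tf = FakeTF()
        return self._tf
```

If ModuleBase subclasses can skip calling super().__init__(), the per-instance lock would be missing; in that case creating the lock lazily under a class-level guard is another option.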

Comment on lines +196 to +216
end_pose.header = dimos::make_header(frame_id, end_ts);
end_pose.pose.position.x = end.x();
end_pose.pose.position.y = end.y();
end_pose.pose.position.z = end.z();
end_pose.pose.orientation.x = 0.0;
end_pose.pose.orientation.y = 0.0;
end_pose.pose.orientation.z = 0.0;
end_pose.pose.orientation.w = traversability;

msg.poses.push_back(start_pose);
msg.poses.push_back(end_pose);
}

// Build a Path-encoded loop-closure-deltas message — one PoseStamped per
// keyframe, where position = (post - delta @ pre) translation delta and
// orientation = delta rotation quaternion. The Nth pose corresponds to
// the Nth keyframe (m_key_poses[N]).
static nav_msgs::Path build_loop_closure_deltas(
const std::vector<std::pair<M3D, V3D>>& pre_poses,
const std::vector<KeyPoseWithCloud>& post_poses,
double ts,

P2 Traversability encoding on end_pose differs from the Python LineSegments3D convention

append_segment sets orientation.w = traversability on both the start and end PoseStamped, but the comment says "traversability is encoded on the first pose of each pair." The Python LineSegments3D.lcm_encode encodes the pair as (p1, trav), (p2, 0.0) — the second endpoint always gets w = 0.0. Any Python consumer that decodes C++-originated graph-edge messages and checks the second pose's w field will observe traversability instead of 0.0, producing a different rendering path than if the same data came from the Python encoder. Aligning the C++ to also set end_pose.pose.orientation.w = 0.0 (or updating the Python side) removes the ambiguity.
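
The pair convention described for the Python encoder can be sketched as follows. EncodedPose and both functions are hypothetical simplifications, not the LineSegments3D implementation; the point is that a decoder which reads traversability only from the first pose of each pair behaves the same whichever value the second pose carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodedPose:
    """Simplified stand-in for a PoseStamped: a position plus orientation.w."""
    position: tuple[float, float, float]
    orientation_w: float


def encode_segment(start, end, traversability):
    """Encode one segment as (p1, trav), (p2, 0.0), the Python-side convention."""
    return [EncodedPose(start, traversability), EncodedPose(end, 0.0)]


def decode_segments(poses):
    """Decode pose pairs into (start, end, traversability) tuples.

    Traversability is read from the first pose of each pair only, so the
    second pose's orientation_w is ignored.
    """
    if len(poses) % 2 != 0:
        raise ValueError("expected an even number of poses (start/end pairs)")
    return [
        (poses[i].position, poses[i + 1].position, poses[i].orientation_w)
        for i in range(0, len(poses), 2)
    ]


encoded = encode_segment((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 1.0)
encoded += encode_segment((1.0, 0.0, 0.0), (1.0, 2.0, 0.0), 0.4)
segments = decode_segments(encoded)
```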

@codecov

codecov Bot commented May 16, 2026

❌ 3 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 1762 | 3 | 1759 | 28 |
View the top 3 failed test(s) by shortest run time
dimos.navigation.nav_stack.modules.pgo.test_pgo_synthetic_drift.TestPGOSyntheticDrift::test_position_search_misses_drifted_loop
Stack Traces | 5.6s run time
+ Exception Group Traceback (most recent call last):
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 341, in from_call
  |     result: TResult | None = func()
  |                              ^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 242, in <lambda>
  |     lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_hooks.py", line 512, in __call__
  |     return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_manager.py", line 120, in _hookexec
  |     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 167, in _multicall
  |     raise exception
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/threadexception.py", line 92, in pytest_runtest_call
  |     yield from thread_exception_runtest_hook()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/threadexception.py", line 68, in thread_exception_runtest_hook
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/unraisableexception.py", line 95, in pytest_runtest_call
  |     yield from unraisable_exception_runtest_hook()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/unraisableexception.py", line 70, in unraisable_exception_runtest_hook
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/logging.py", line 846, in pytest_runtest_call
  |     yield from self._runtest_for(item, "call")
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/logging.py", line 829, in _runtest_for
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.../site-packages/_pytest/capture.py", line 898, in pytest_runtest_call
  |     return (yield)
  |             ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.../site-packages/_pytest/skipping.py", line 257, in pytest_runtest_call
  |     return (yield)
  |             ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 121, in _multicall
  |     res = hook_impl.function(*args)
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 174, in pytest_runtest_call
  |     item.runtest()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/python.py", line 1627, in runtest
  |     self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_hooks.py", line 512, in __call__
  |     return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_manager.py", line 120, in _hookexec
  |     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 167, in _multicall
  |     raise exception
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 121, in _multicall
  |     res = hook_impl.function(*args)
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/python.py", line 159, in pytest_pyfunc_call
  |     result = testfunction(**testargs)
  |              ^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../modules/pgo/test_pgo_synthetic_drift.py", line 388, in test_position_search_misses_drifted_loop
  |     position_search_events = _run_pgo(use_scan_context=False)
  |                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../modules/pgo/test_pgo_synthetic_drift.py", line 364, in _run_pgo
  |     coordinator = ModuleCoordinator.build(blueprint)
  |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../core/coordination/module_coordinator.py", line 277, in build
  |     coordinator.build_all_modules()
  |   File ".../core/coordination/module_coordinator.py", line 192, in build_all_modules
  |     safe_thread_map(modules, lambda m: m.build())
  |   File ".../dimos/utils/safe_thread_map.py", line 84, in safe_thread_map
  |     raise ExceptionGroup("safe_thread_map failed", errors)
  | ExceptionGroup: safe_thread_map failed (1 sub-exception)
  +-+---------------- 1 ----------------
    | dimos.protocol.rpc.rpc_utils.RemoteError: [Remote builtins.RuntimeError] [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    | 
    | Remote traceback:
    | Traceback (most recent call last):
    |   File ".../protocol/rpc/pubsubrpc.py", line 279, in execute_and_respond
    |     response = f(*args[0], **args[1])
    |                ^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../protocol/rpc/spec.py", line 117, in override_f
    |     return getattr(module, fname)(*args, **kwargs)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../dimos/core/native_module.py", line 201, in build
    |     self._maybe_build()
    |   File ".../dimos/core/native_module.py", line 451, in _maybe_build
    |     raise RuntimeError(
    | RuntimeError: [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    | 
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File ".../dimos/utils/safe_thread_map.py", line 68, in safe_thread_map
    |     outcomes[idx] = fut.result()
    |                     ^^^^^^^^^^^^
    |   File "........./usr/lib/python3.12....../concurrent/futures/_base.py", line 449, in result
    |     return self.__get_result()
    |            ^^^^^^^^^^^^^^^^^^^
    |   File "........./usr/lib/python3.12....../concurrent/futures/_base.py", line 401, in __get_result
    |     raise self._exception
    |   File "........./usr/lib/python3.12.../concurrent/futures/thread.py", line 58, in run
    |     result = self.fn(*self.args, **self.kwargs)
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../core/coordination/module_coordinator.py", line 192, in <lambda>
    |     safe_thread_map(modules, lambda m: m.build())
    |                                        ^^^^^^^^^
    |   File ".../dimos/core/rpc_client.py", line 71, in __call__
    |     result, unsub_fn = self._rpc.call_sync(
    |                        ^^^^^^^^^^^^^^^^^^^^
    |   File ".../protocol/rpc/spec.py", line 85, in call_sync
    |     raise result
    | RuntimeError: [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    +------------------------------------
dimos.navigation.nav_stack.modules.pgo.test_pgo_synthetic_drift.TestPGOSyntheticDrift::test_scan_context_catches_reverse_loop
Stack Traces | 5.87s run time
+ Exception Group Traceback (most recent call last):
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 341, in from_call
  |     result: TResult | None = func()
  |                              ^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 242, in <lambda>
  |     lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_hooks.py", line 512, in __call__
  |     return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_manager.py", line 120, in _hookexec
  |     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 167, in _multicall
  |     raise exception
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/threadexception.py", line 92, in pytest_runtest_call
  |     yield from thread_exception_runtest_hook()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/threadexception.py", line 68, in thread_exception_runtest_hook
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/unraisableexception.py", line 95, in pytest_runtest_call
  |     yield from unraisable_exception_runtest_hook()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/unraisableexception.py", line 70, in unraisable_exception_runtest_hook
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/logging.py", line 846, in pytest_runtest_call
  |     yield from self._runtest_for(item, "call")
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/logging.py", line 829, in _runtest_for
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.../site-packages/_pytest/capture.py", line 898, in pytest_runtest_call
  |     return (yield)
  |             ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.../site-packages/_pytest/skipping.py", line 257, in pytest_runtest_call
  |     return (yield)
  |             ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 121, in _multicall
  |     res = hook_impl.function(*args)
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 174, in pytest_runtest_call
  |     item.runtest()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/python.py", line 1627, in runtest
  |     self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_hooks.py", line 512, in __call__
  |     return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_manager.py", line 120, in _hookexec
  |     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 167, in _multicall
  |     raise exception
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 121, in _multicall
  |     res = hook_impl.function(*args)
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/python.py", line 159, in pytest_pyfunc_call
  |     result = testfunction(**testargs)
  |              ^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../modules/pgo/test_pgo_synthetic_drift.py", line 404, in test_scan_context_catches_reverse_loop
  |     events = _run_pgo(use_scan_context=True, trajectory=_trajectory_reverse_loop())
  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../modules/pgo/test_pgo_synthetic_drift.py", line 364, in _run_pgo
  |     coordinator = ModuleCoordinator.build(blueprint)
  |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../core/coordination/module_coordinator.py", line 277, in build
  |     coordinator.build_all_modules()
  |   File ".../core/coordination/module_coordinator.py", line 192, in build_all_modules
  |     safe_thread_map(modules, lambda m: m.build())
  |   File ".../dimos/utils/safe_thread_map.py", line 84, in safe_thread_map
  |     raise ExceptionGroup("safe_thread_map failed", errors)
  | ExceptionGroup: safe_thread_map failed (1 sub-exception)
  +-+---------------- 1 ----------------
    | dimos.protocol.rpc.rpc_utils.RemoteError: [Remote builtins.RuntimeError] [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    | 
    | Remote traceback:
    | Traceback (most recent call last):
    |   File ".../protocol/rpc/pubsubrpc.py", line 279, in execute_and_respond
    |     response = f(*args[0], **args[1])
    |                ^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../protocol/rpc/spec.py", line 117, in override_f
    |     return getattr(module, fname)(*args, **kwargs)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../dimos/core/native_module.py", line 201, in build
    |     self._maybe_build()
    |   File ".../dimos/core/native_module.py", line 451, in _maybe_build
    |     raise RuntimeError(
    | RuntimeError: [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    | 
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File ".../dimos/utils/safe_thread_map.py", line 68, in safe_thread_map
    |     outcomes[idx] = fut.result()
    |                     ^^^^^^^^^^^^
    |   File "........./usr/lib/python3.12....../concurrent/futures/_base.py", line 449, in result
    |     return self.__get_result()
    |            ^^^^^^^^^^^^^^^^^^^
    |   File "........./usr/lib/python3.12....../concurrent/futures/_base.py", line 401, in __get_result
    |     raise self._exception
    |   File "........./usr/lib/python3.12.../concurrent/futures/thread.py", line 58, in run
    |     result = self.fn(*self.args, **self.kwargs)
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../core/coordination/module_coordinator.py", line 192, in <lambda>
    |     safe_thread_map(modules, lambda m: m.build())
    |                                        ^^^^^^^^^
    |   File ".../dimos/core/rpc_client.py", line 71, in __call__
    |     result, unsub_fn = self._rpc.call_sync(
    |                        ^^^^^^^^^^^^^^^^^^^^
    |   File ".../protocol/rpc/spec.py", line 85, in call_sync
    |     raise result
    | RuntimeError: [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    +------------------------------------
dimos.navigation.nav_stack.modules.pgo.test_pgo_synthetic_drift.TestPGOSyntheticDrift::test_scan_context_catches_drifted_loop
Stack Traces | 5.94s run time
+ Exception Group Traceback (most recent call last):
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 341, in from_call
  |     result: TResult | None = func()
  |                              ^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 242, in <lambda>
  |     lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_hooks.py", line 512, in __call__
  |     return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_manager.py", line 120, in _hookexec
  |     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 167, in _multicall
  |     raise exception
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/threadexception.py", line 92, in pytest_runtest_call
  |     yield from thread_exception_runtest_hook()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/threadexception.py", line 68, in thread_exception_runtest_hook
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/unraisableexception.py", line 95, in pytest_runtest_call
  |     yield from unraisable_exception_runtest_hook()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/unraisableexception.py", line 70, in unraisable_exception_runtest_hook
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/logging.py", line 846, in pytest_runtest_call
  |     yield from self._runtest_for(item, "call")
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/logging.py", line 829, in _runtest_for
  |     yield
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.../site-packages/_pytest/capture.py", line 898, in pytest_runtest_call
  |     return (yield)
  |             ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.../site-packages/_pytest/skipping.py", line 257, in pytest_runtest_call
  |     return (yield)
  |             ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 121, in _multicall
  |     res = hook_impl.function(*args)
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12........./site-packages/_pytest/runner.py", line 174, in pytest_runtest_call
  |     item.runtest()
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/python.py", line 1627, in runtest
  |     self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_hooks.py", line 512, in __call__
  |     return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/pluggy/_manager.py", line 120, in _hookexec
  |     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 167, in _multicall
  |     raise exception
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 139, in _multicall
  |     teardown.throw(exception)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
  |     return result.get_result()
  |            ^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12............/site-packages/pluggy/_result.py", line 103, in get_result
  |     raise exc.with_traceback(tb)
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
  |     res = yield
  |           ^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12.............................................................../site-packages/pluggy/_callers.py", line 121, in _multicall
  |     res = hook_impl.function(*args)
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../dimos/dimos/.venv/lib/python3.12....../site-packages/_pytest/python.py", line 159, in pytest_pyfunc_call
  |     result = testfunction(**testargs)
  |              ^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../modules/pgo/test_pgo_synthetic_drift.py", line 380, in test_scan_context_catches_drifted_loop
  |     scan_context_events = _run_pgo(use_scan_context=True)
  |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../modules/pgo/test_pgo_synthetic_drift.py", line 364, in _run_pgo
  |     coordinator = ModuleCoordinator.build(blueprint)
  |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File ".../core/coordination/module_coordinator.py", line 277, in build
  |     coordinator.build_all_modules()
  |   File ".../core/coordination/module_coordinator.py", line 192, in build_all_modules
  |     safe_thread_map(modules, lambda m: m.build())
  |   File ".../dimos/utils/safe_thread_map.py", line 84, in safe_thread_map
  |     raise ExceptionGroup("safe_thread_map failed", errors)
  | ExceptionGroup: safe_thread_map failed (1 sub-exception)
  +-+---------------- 1 ----------------
    | dimos.protocol.rpc.rpc_utils.RemoteError: [Remote builtins.RuntimeError] [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    | 
    | Remote traceback:
    | Traceback (most recent call last):
    |   File ".../protocol/rpc/pubsubrpc.py", line 279, in execute_and_respond
    |     response = f(*args[0], **args[1])
    |                ^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../protocol/rpc/spec.py", line 117, in override_f
    |     return getattr(module, fname)(*args, **kwargs)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../dimos/core/native_module.py", line 201, in build
    |     self._maybe_build()
    |   File ".../dimos/core/native_module.py", line 451, in _maybe_build
    |     raise RuntimeError(
    | RuntimeError: [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    | 
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File ".../dimos/utils/safe_thread_map.py", line 68, in safe_thread_map
    |     outcomes[idx] = fut.result()
    |                     ^^^^^^^^^^^^
    |   File "........./usr/lib/python3.12....../concurrent/futures/_base.py", line 449, in result
    |     return self.__get_result()
    |            ^^^^^^^^^^^^^^^^^^^
    |   File "........./usr/lib/python3.12....../concurrent/futures/_base.py", line 401, in __get_result
    |     raise self._exception
    |   File "........./usr/lib/python3.12.../concurrent/futures/thread.py", line 58, in run
    |     result = self.fn(*self.args, **self.kwargs)
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File ".../core/coordination/module_coordinator.py", line 192, in <lambda>
    |     safe_thread_map(modules, lambda m: m.build())
    |                                        ^^^^^^^^^
    |   File ".../dimos/core/rpc_client.py", line 71, in __call__
    |     result, unsub_fn = self._rpc.call_sync(
    |                        ^^^^^^^^^^^^^^^^^^^^
    |   File ".../protocol/rpc/spec.py", line 85, in call_sync
    |     raise result
    | RuntimeError: [PGO(pgo)] Build command failed after 0.00s (exit 127): nix build .#default --no-write-lock-file
    +------------------------------------


… synthetic_drift test

Two leftovers from the origin/main merge:

- main.cpp had `bool debug = native_module.arg_bool(...)` followed by
  `bool debug = mod.arg_bool(...)` (the second one a stray from the
  pre-rename "mod" identifier). The duplicate declaration failed
  compilation on a fresh nix build.

- test_pgo_synthetic_drift.py had three `# ---` section banners from
  the better_pgo merge. dimos/project/test_no_sections.py forbids them
  per project convention.
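
For illustration, a check of the kind described in the second bullet could look roughly like the sketch below. This is an assumption about the shape of such a check, not the contents of dimos/project/test_no_sections.py; the regex and the name `find_section_banners` are hypothetical.

```python
import re

# Assumed pattern: a comment-only line whose text starts with three or more dashes.
SECTION_BANNER = re.compile(r"^\s*#\s*-{3,}")


def find_section_banners(source: str) -> list[int]:
    """Return 1-based line numbers of comment lines that look like section banners."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if SECTION_BANNER.match(line)
    ]
```

A test built on this would read each tracked file and fail if the returned list is non-empty.
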
`_maybe_build()` was running inside `start()`. When `global_config.build_native=True`
(forced on this machine via a sitecustomize.py override), the nix cache check
adds 0.3–2.9s of blocking work before Popen. Meanwhile, upstream modules'
async tasks have already started publishing into LCM multicast, but the
subprocess hasn't joined the multicast group yet, so those messages are lost.

For PGO + the synthetic-drift reverse-loop test, this meant the C++ binary
subscribed at the apex of the trajectory and saw only the inbound leg, so
scan-context never had outbound keyframes to match against. The test passed
only 2 of 6 runs.
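
The failure mode can be reproduced with a toy fire-and-forget bus. This is a minimal sketch, not LCM: `SketchBus`, `replay`, and the message names are hypothetical, and the only property modeled is that a message published with no subscriber is dropped.

```python
from typing import Callable


class SketchBus:
    """Toy fire-and-forget bus: a message with no subscriber is simply dropped."""

    def __init__(self) -> None:
        self._subscribers: list[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, message: str) -> None:
        for callback in self._subscribers:
            callback(message)


def replay(messages: list[str], subscribe_before_index: int) -> list[str]:
    """Publish messages in order; the consumer subscribes just before the given index."""
    bus = SketchBus()
    received: list[str] = []
    for index, message in enumerate(messages):
        if index == subscribe_before_index:
            bus.subscribe(received.append)
        bus.publish(message)
    return received


trajectory = ["out_0", "out_1", "out_2", "in_0", "in_1", "in_2"]
late = replay(trajectory, subscribe_before_index=3)     # subscriber joins at the apex
on_time = replay(trajectory, subscribe_before_index=0)  # subscriber joins before playback
```

Here `late` holds only the inbound messages, while `on_time` holds the full trajectory.
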

Move the build call to `NativeModule.build()` (the lifecycle method designed
for exactly this kind of heavy one-time work; it runs after deploy + wiring but
before any module's start()). Now by the time start() runs, the binary
exists; start() just Popens, the subprocess subscribes within ~50ms, and the
playback's async task hasn't yet pumped any scans.

Drops the 0.3s blind sleep and the xfail marker. test_scan_context_catches_reverse_loop
now passes 6/6.
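
The two-phase ordering can be sketched as follows. `SketchNativeModule` and `run_lifecycle` are hypothetical stand-ins, not the real `NativeModule` or `ModuleCoordinator`; the sketch only shows that every build() completes before any start() begins, and it runs sequentially rather than through a thread map.

```python
class SketchNativeModule:
    """Hypothetical stand-in for NativeModule; not the real dimos class."""

    def __init__(self, name: str, log: list[str]) -> None:
        self.name = name
        self.log = log
        self.built = False

    def build(self) -> None:
        # Heavy one-time work (cache check, compilation) belongs here.
        self.built = True
        self.log.append(f"build:{self.name}")

    def start(self) -> None:
        # start() assumes the binary already exists and only launches it.
        if not self.built:
            raise RuntimeError(f"{self.name}: start() called before build()")
        self.log.append(f"start:{self.name}")


def run_lifecycle(names: list[str]) -> list[str]:
    log: list[str] = []
    modules = [SketchNativeModule(name, log) for name in names]
    for module in modules:  # phase 1: every module builds
        module.build()
    for module in modules:  # phase 2: only then does any module start
        module.start()
    return log
```
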
… listening

Closes the residual race after the build() phase fix. Previously
NativeModule.start() returned as soon as Popen returned, but the
subprocess hadn't actually called lcm.subscribe() yet. Upstream
publishers running in parallel safe_thread_map starts could pump
messages into the gap and lose them.

C++ side: after lcm.subscribe() calls, the binary writes
"[DIMOS_NATIVE_READY]\n" to stderr and flushes. Plumbed into pgo/main.cpp.

Python side: NativeModule.start() blocks on a threading.Event that the
watchdog's stderr reader sets when it sees the marker. Bounded by
ready_timeout_sec (per-config, default 0 = opt-out for legacy binaries).
PGOConfig sets it to 10s. If the subprocess exits before the marker,
start() returns and lets the watchdog handle cleanup, which preserves the
existing test_process_crash_triggers_stop semantics.
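
A minimal sketch of the Python side of the handshake, assuming a child that prints the marker on stderr. `start_and_wait_ready` and `CHILD_READY` are hypothetical names; the real `NativeModule.start()` and watchdog are not shown here, and the child is a small Python script standing in for the C++ binary.

```python
import subprocess
import sys
import threading

READY_MARKER = "[DIMOS_NATIVE_READY]"

# Stand-in for the native binary: announce readiness on stderr, then keep running briefly.
CHILD_READY = (
    "import sys, time\n"
    "sys.stderr.write('[DIMOS_NATIVE_READY]\\n')\n"
    "sys.stderr.flush()\n"
    "time.sleep(0.2)\n"
)


def start_and_wait_ready(cmd: list[str], ready_timeout_sec: float):
    """Launch cmd and block until the marker, stderr EOF, or the timeout."""
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    wake = threading.Event()
    state = {"ready": False}

    def read_stderr() -> None:
        for line in proc.stderr:
            if READY_MARKER in line:
                state["ready"] = True
                wake.set()
        # EOF: the child closed stderr (typically by exiting), so stop waiting.
        wake.set()

    threading.Thread(target=read_stderr, daemon=True).start()
    if ready_timeout_sec > 0:  # 0 means opt out, as for legacy binaries
        wake.wait(timeout=ready_timeout_sec)
    return proc, state["ready"]
```

Calling `start_and_wait_ready([sys.executable, "-c", CHILD_READY], 10.0)` returns once the marker line is read. A child that exits without printing the marker wakes the waiter through the EOF path, so the caller does not sit out the full timeout.
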

Coordinator: after start_all_modules() barriers on all start()s, it
fires on_system_ready() on every module. ModuleBase exposes async
wait_for_system_ready() for producers to gate their first publish on.
Used by SyntheticDriftPlaybackModule and the KITTI360 benchmark
playback module.
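
The barrier can be sketched with asyncio as below. `SketchModule`, `first_publish`, and `run_system` are hypothetical; only the method names `on_system_ready()` and `wait_for_system_ready()` come from the description above, and the real coordinator's threading model is not reproduced.

```python
import asyncio


class SketchModule:
    """Hypothetical module with a start() phase and a system-ready gate."""

    def __init__(self, name: str, start_delay: float = 0.0) -> None:
        self.name = name
        self.start_delay = start_delay
        self.started = False
        self._system_ready = asyncio.Event()

    async def start(self) -> None:
        await asyncio.sleep(self.start_delay)  # stands in for launch + subscribe
        self.started = True

    def on_system_ready(self) -> None:
        self._system_ready.set()

    async def wait_for_system_ready(self) -> None:
        await self._system_ready.wait()


async def first_publish(producer, peers, log) -> None:
    # The producer gates its first publish on the system-ready signal.
    await producer.wait_for_system_ready()
    log.append(all(peer.started for peer in peers))


async def run_system() -> list[bool]:
    log: list[bool] = []
    consumer = SketchModule("consumer", start_delay=0.05)
    producer = SketchModule("producer")
    modules = [consumer, producer]
    publish_task = asyncio.create_task(first_publish(producer, [consumer], log))
    await asyncio.gather(*(m.start() for m in modules))  # barrier on every start()
    for m in modules:
        m.on_system_ready()  # fired only after the barrier
    await publish_task
    return log
```

Running `asyncio.run(run_system())` records whether every peer had finished start() at the moment of the producer's first publish.
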

Net effect on the previously-flaky test_scan_context_catches_reverse_loop:
10/10 reliable. Full fast suite green (1773 passed).
Plumbs [DIMOS_NATIVE_READY] markers into every C++ binary so each Python
wrapper's start() blocks until the subprocess has its LCM subscribes live
(or, for the lidar drivers, its Livox SDK initialized).

In-tree binaries (this repo):
- fastlio2: marker emitted after LivoxLidarSdkStart() succeeds
- livox/mid360: same

External binaries (each gets a feat/dimos-native-ready branch on its
own repo; Python build_command flipped to point at it):
- dimensionalOS/dimos-module-far-planner
- dimensionalOS/dimos-module-path-follower
- dimensionalOS/dimos-module-terrain-analysis
- dimensionalOS/dimos-module-tare-planner
- dimensionalOS/dimos-module-local-planner (already on feat/configurable-body-frame)

Per-config ready_timeout_sec: 10s for compute-only planners, 15s for
the SDK-init-heavy lidar drivers. PGO was already at 10s from the prior
commit.
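
The per-config timeouts could be expressed roughly as below. The class names are hypothetical and do not correspond to the real dimos config classes; only the field name `ready_timeout_sec` and the values 0, 10, and 15 are taken from the commit messages.

```python
from dataclasses import dataclass


@dataclass
class SketchNativeConfig:
    # 0 means "do not wait for the ready marker" (legacy binaries).
    ready_timeout_sec: float = 0.0

    def waits_for_ready(self) -> bool:
        return self.ready_timeout_sec > 0


@dataclass
class SketchPlannerConfig(SketchNativeConfig):
    ready_timeout_sec: float = 10.0  # compute-only planners


@dataclass
class SketchLidarConfig(SketchNativeConfig):
    ready_timeout_sec: float = 15.0  # SDK-init-heavy lidar drivers
```
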

After this lands, every native subprocess in the nav stack participates
in the start-time handshake. The coordinator's on_system_ready barrier
then unblocks producers (Python-side replay modules) all at once,
closing the parallel-start() race.

Full fast suite green (1773 passed).