Skip to content

Conversation

@hlolli
Copy link

@hlolli hlolli commented Dec 8, 2025

No offense taken if this is not pretty code. This improves the visibility around debugging and understanding why dolos stops. Also gives better correctness around exit codes. We assume exiting because syncHeight has been reached to be a successful exit reason, otherwise if it's QuotaReached it's not a successful exit.

Summary by CodeRabbit

  • New Features

    • Pipeline exit codes now differentiate based on termination reason (quota exhaustion exits with code 1).
    • Enhanced quota tracking distinguishes between different quota exhaustion types.
  • Bug Fixes

    • Improved handling of missing or invalid storage paths.
    • Better condition tracking for quota and stop-epoch events.
    • Transitioned key pipeline event logging from debug to informational level.

✏️ Tip: You can customize this high-level summary in your review settings.

@hlolli hlolli requested a review from scarmuega as a code owner December 8, 2025 12:22
@coderabbitai
Copy link

coderabbitai bot commented Dec 8, 2025

Walkthrough

This change introduces a structured pipeline termination system that tracks why a pipeline stops. It defines a PipelineStopReason enum with variants for different termination causes, propagates stop reasons through the pipeline runtime, establishes global atomic flags for quota and stop-epoch conditions, and maps stop reasons to process exit codes in the daemon.

Changes

Cohort / File(s) Summary
Pipeline execution and termination control
src/bin/dolos/common.rs, src/bin/dolos/daemon.rs
Introduces PipelineStopReason enum; changes run_pipeline to return stop reason instead of unit; adds imports for quota and stop-epoch checks; updates logging from debug to info level; daemon maps stop reasons to exit codes (Signal→0, QuotaReached→1, StopEpochReached→0, Other→2) and conditionally exits with non-zero codes
Global stop condition tracking
src/sync/apply.rs
Adds atomic flags STOP_EPOCH_REACHED and QUOTA_REACHED with public setter/getter functions; reworks on_roll_forward to handle StopEpochReached errors by setting flag and returning WorkerError::Panic; routes domain errors through unified error path
Quota management refinement
src/sync/pull.rs
Splits PullQuota::Reached into ReachedTip and ReachedBlockQuota variants; adds consume_blocks() method for block quota tracking; updates transitions in on_tip and quota exhaustion paths with corresponding info logs; calls set_quota_reached() flag when quotas are satisfied
Minor updates
src/bin/dolos/data/summary.rs, src/sync/mod.rs
Import reordering in summary.rs; comment rewrapping in mod.rs—no functional changes

Sequence Diagram

sequenceDiagram
    participant Daemon
    participant Pipeline as run_pipeline
    participant Pull as Pull Quota System
    participant Apply as Apply & Flags
    participant Signal as Signal Handler

    Daemon->>Pipeline: start pipeline execution
    activate Pipeline
    
    loop During execution
        Pull->>Pull: Check block quota
        Pull->>Apply: set_quota_reached(true) if exhausted
        Apply->>Apply: QUOTA_REACHED flag set
        
        Apply->>Apply: Detect StopEpochReached error
        Apply->>Apply: STOP_EPOCH_REACHED flag set
    end
    
    Signal->>Pipeline: Exit signal received
    Note over Pipeline: Check stop conditions
    
    alt StopEpochReached flag set
        Pipeline->>Pipeline: return StopEpochReached
    else QuotaReached flag set
        Pipeline->>Pipeline: return QuotaReached
    else Exit signal
        Pipeline->>Pipeline: return Signal
    else Other condition
        Pipeline->>Pipeline: return Other
    end
    
    Pipeline->>Daemon: PipelineStopReason
    deactivate Pipeline
    
    activate Daemon
    Daemon->>Daemon: Map reason to exit code
    alt exit code == 0
        Daemon->>Daemon: return Ok(())
    else exit code != 0
        Daemon->>Daemon: exit process with code
    end
    deactivate Daemon
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • apply.rs: Review atomic flag initialization, setter/getter implementation, and error handling rework in on_roll_forward to ensure thread-safe flag updates and correct error propagation
  • pull.rs: Verify enum variant transitions (ReachedTip vs ReachedBlockQuota), consume_blocks() logic, and quota state consistency across state machine paths
  • common.rs: Validate PipelineStopReason enum usage, condition detection order (epoch vs quota), and logging behavior changes
  • daemon.rs: Confirm exit code mapping correctness and process exit semantics

Suggested reviewers

  • scarmuega

Poem

🐰 A pipeline stops with reason clear,
No more mystery, the why is here!
Quota reached, epoch passed, or signal sent—
Each exit code tells where it went!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: better visibility around exiting using log, reason and exitCode' directly and accurately summarizes the main changes: introducing logging, a PipelineStopReason enum, and exit codes for pipeline termination, which are the primary focuses across all modified files.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/bin/dolos/daemon.rs (1)

45-45: Consider handling task panic more gracefully.

The .unwrap() will panic if the spawned run_pipeline task itself panics, which could mask the underlying error. Consider using unwrap_or to provide a fallback stop reason.

-    let stop_reason = sync.await.unwrap();
+    let stop_reason = sync.await.unwrap_or(crate::common::PipelineStopReason::Other);
src/sync/pull.rs (1)

176-189: Unreachable fallback arm.

The _ => arm (lines 185-188) is unreachable because should_quit() only returns true for ReachedTip and ReachedBlockQuota. Consider replacing it with unreachable!() to make the invariant explicit, or removing it entirely since the match is exhaustive for the cases where should_quit() is true.

                 PullQuota::ReachedBlockQuota => {
                     info!("sync quota reached: maximum number of blocks has been processed, stopping sync as configured");
                     set_quota_reached(true);
                 }
-                _ => {
-                    warn!("quota reached, stopping sync");
-                    set_quota_reached(true);
-                }
             }
src/bin/dolos/common.rs (1)

317-328: Consider recursive cleanup for subdirectories.

The current implementation only removes files directly in the root directory but doesn't handle subdirectories. If the storage path contains nested directories (e.g., wal/, chain/, state/), they won't be removed, and the final fs::remove_dir(root) on line 325 will fail because the directory won't be empty.

Consider using fs::remove_dir_all(root) for complete cleanup:

     if root.is_dir() {
-        for entry_result in fs::read_dir(root)? {
-            let entry = entry_result?;
-            let entry_path = entry.path();
-            if entry_path.is_file() {
-                fs::remove_file(&entry_path)?;
-            }
-        }
-        fs::remove_dir(root)?; // Remove the now-empty directory
+        fs::remove_dir_all(root)?;
     } else {
         info!("Path is not a directory, ignoring.");
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a110ac and 2f076cb.

📒 Files selected for processing (6)
  • src/bin/dolos/common.rs (2 hunks)
  • src/bin/dolos/daemon.rs (1 hunks)
  • src/bin/dolos/data/summary.rs (1 hunks)
  • src/sync/apply.rs (2 hunks)
  • src/sync/mod.rs (1 hunks)
  • src/sync/pull.rs (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
src/bin/dolos/data/summary.rs (2)
src/facade.rs (1)
  • dolos_core (31-31)
src/prelude.rs (1)
  • config (44-46)
src/bin/dolos/common.rs (1)
src/sync/apply.rs (2)
  • is_quota_reached (23-25)
  • is_stop_epoch_reached (15-17)
src/bin/dolos/daemon.rs (1)
src/sync/mod.rs (1)
  • sync (57-87)
src/sync/apply.rs (1)
crates/core/src/facade.rs (1)
  • roll_forward (119-133)
src/sync/pull.rs (1)
src/sync/apply.rs (1)
  • set_quota_reached (19-21)
🔇 Additional comments (9)
src/bin/dolos/data/summary.rs (1)

1-3: LGTM!

The import reordering is clean and follows Rust conventions by grouping related imports together.

src/sync/mod.rs (1)

34-35: LGTM!

Trivial comment rewrap with no functional changes.

src/bin/dolos/daemon.rs (1)

49-61: LGTM!

The exit code mapping is clear and aligns with the PR objectives: treating Signal and StopEpochReached as successful exits (0), while QuotaReached (1) and Other (2) are treated as non-successful. Using std::process::exit() for non-zero cases is appropriate for daemon termination.

src/sync/apply.rs (2)

9-25: LGTM!

The global atomic flags with SeqCst ordering provide a simple and correct cross-stage signaling mechanism for the gasket framework. The public API (set_* / is_* functions) cleanly encapsulates the atomic operations.


73-88: LGTM!

The differentiated error handling correctly sets the STOP_EPOCH_REACHED flag before returning WorkerError::Panic, allowing downstream code to distinguish between a graceful stop-epoch termination and an actual error. The logging for other domain errors provides visibility for debugging.

src/sync/pull.rs (2)

44-56: LGTM!

The split into ReachedTip and ReachedBlockQuota variants provides better granularity for understanding termination reasons. The should_quit() and on_tip() methods are correctly updated to reflect the new variants.


60-75: LGTM!

The consume_blocks method correctly uses saturating_sub to prevent underflow and provides informative logging when the quota is exhausted. The state transition to ReachedBlockQuota is properly handled.

src/bin/dolos/common.rs (2)

265-275: LGTM!

The PipelineStopReason enum is well-designed with clear doc comments explaining each variant. Deriving Debug and PartialEq is appropriate for logging and testing purposes.


277-310: LGTM!

The updated run_pipeline function correctly determines and returns the stop reason. The priority order (checking is_stop_epoch_reached() before is_quota_reached()) is sensible, and the upgraded log levels from debug to info improve operational visibility as intended by the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant