Skip to content

new pr#19

Merged
jadenfix merged 3 commits into
mainfrom
jaden/exchange-core
Feb 17, 2026
Merged

new pr#19
jadenfix merged 3 commits into
mainfrom
jaden/exchange-core

Conversation

@jadenfix

Copy link
Copy Markdown
Owner

No description provided.

Copilot AI review requested due to automatic review settings February 16, 2026 23:58
@jadenfix

Copy link
Copy Markdown
Owner Author

@greptile

@greptile-apps

greptile-apps Bot commented Feb 17, 2026

Copy link
Copy Markdown

Greptile Summary

This PR introduces a comprehensive trading architecture with strategy lifecycle management, hard safety controls, and multi-tier command routing. The changes add three new core crates (exchange_core, strategy_core, risk_core) and expand the trading daemon with autonomous strategy promotion, kill switch protection, and event tracking.

Major changes:

  • Added hard safety cage with non-bypassable risk limits for autonomous trading
  • Implemented strategy candidate upload/promotion workflow with gate validation
  • Added engine pause/resume/killswitch commands for emergency control
  • Expanded protocol with Engine, Strategy, and Risk command namespaces
  • Added event log with 128-entry ring buffer for recent activity tracking
  • Updated TypeScript bridge with new tools for strategy and risk management

Critical issues:

  • Candidate promotion has a race condition that could allow concurrent promotions to exceed safety limits
  • Promotion notional calculation doesn't account for replacement scenarios, potentially rejecting valid promotions

Confidence Score: 3/5

  • This PR has critical race conditions in the strategy promotion logic that could allow safety limits to be exceeded
  • The architecture is well-designed with clear separation of concerns, but two critical logic bugs in the daemon's promotion handler could compromise the hard safety guarantees. The race condition on line 582 and incorrect notional calculation on line 175 need to be fixed before this can safely handle autonomous strategy deployments.
  • crates/trading_daemon/src/main.rs requires immediate attention for the promotion logic race condition and notional calculation bug

Important Files Changed

Filename Overview
.openclaw/extensions/trading-bridge/src/index.ts Added new trading tools for engine control, strategy management, and risk status with improved event handling
crates/trading_daemon/src/main.rs Major expansion adding strategy lifecycle, risk evaluation, and kill switch logic; potential state synchronization issues with candidate promotion
crates/trading_protocol/src/lib.rs Added new command enums and payload types for Engine, Strategy, and Risk namespaces with comprehensive tests
crates/risk_core/src/lib.rs New hard safety cage implementation with clear limits and thorough validation logic
crates/strategy_core/src/lib.rs New strategy abstraction layer with regime-aware signal generation traits

Sequence Diagram

sequenceDiagram
    participant Agent as OpenClaw Agent
    participant Bridge as Trading Bridge
    participant Daemon as Trading Daemon
    participant Risk as Risk Core
    participant Strategy as Strategy Core

    Agent->>Bridge: trading_strategy_manage(upload_candidate)
    Bridge->>Daemon: Strategy.UploadCandidate
    Daemon->>Daemon: Store candidate with gates
    Daemon-->>Bridge: {ok: true, candidate}
    Bridge-->>Agent: Candidate uploaded

    Agent->>Bridge: trading_strategy_manage(promote_candidate)
    Bridge->>Daemon: Strategy.PromoteCandidate
    Daemon->>Risk: evaluate_promotion(request, snapshot)
    Risk->>Risk: Check kill_switch, gates, notional limits
    alt Promotion allowed
        Risk-->>Daemon: Allow
        Daemon->>Strategy: Activate new version
        Daemon->>Daemon: Update canary_notional tracking
        Daemon->>Daemon: Emit StrategyLifecycle event
        Daemon-->>Bridge: {ok: true, strategy, previous_version}
        Bridge-->>Agent: Promotion succeeded
    else Promotion denied
        Risk-->>Daemon: Deny{reason}
        Daemon->>Daemon: Emit RiskAlert event
        Daemon-->>Bridge: {ok: false, error: reason}
        Bridge-->>Agent: Promotion failed
    end

    Agent->>Bridge: trading_engine_control(killswitch)
    Bridge->>Daemon: Engine.KillSwitch
    Daemon->>Daemon: Set kill_switch_engaged=true
    Daemon->>Daemon: Stop all trading
    Daemon->>Daemon: Emit RiskAlert + EngineHealth
    Daemon-->>Bridge: {ok: true, state}
    Bridge-->>Agent: Kill switch activated
Loading

Last reviewed commit: 50637c9

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

15 files reviewed, 4 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +175 to +176
(
"kalshi.market_making",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The projected_strategy calculation doesn't account for replacing an existing candidate - it always adds to the current notional. If promoting a candidate to replace existing code, this could incorrectly reject the promotion.

Suggested change
(
"kalshi.market_making",
let projected_strategy = request.requested_canary_notional_cents;

Comment thread crates/trading_daemon/src/main.rs Outdated
Comment on lines +582 to +583
"ok": false,
"error": format!("Unknown strategy '{}'", payload.strategy_id)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Race condition: strategy_canary_notional is updated after promotion succeeds, but if multiple promotions run concurrently, they could both pass the safety check and exceed limits.

Comment on lines +364 to +365
engine: engine.payload,
risk: risk.payload,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Event.RiskAlert check was added to handle risk alerts alongside regular alerts

Comment on lines +163 to +164
risk_snapshot: RiskSnapshot::default(),
safety_policy: HardSafetyPolicy::default(),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider persisting strategy state to disk - currently all strategy configuration, versions, and candidates are lost on daemon restart

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR significantly expands the trading bridge infrastructure by introducing a comprehensive command hierarchy, strategy management capabilities, and a hard safety cage for autonomous trading operations. The changes transform the basic control daemon into a production-ready trading engine with multi-layered risk controls.

Changes:

  • Adds three new core crates (exchange_core, strategy_core, risk_core) providing abstraction layers for trading operations
  • Introduces Engine, Strategy, and Risk command namespaces with corresponding handlers for engine control, strategy lifecycle management, and risk monitoring
  • Implements strategy candidate upload/promotion workflow with gate-based validation and hard safety policy enforcement

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
crates/trading_protocol/src/lib.rs Adds EngineCommand, StrategyCommand, RiskCommand enums and RequestKind dispatcher; introduces payload types and event schemas for the expanded protocol
crates/trading_daemon/src/main.rs Implements stateful engine with strategy management, candidate promotion workflow, risk snapshot tracking, and event history
crates/tradingctl/src/main.rs Adds CLI commands for engine control, strategy management, and risk operations
crates/risk_core/src/lib.rs Implements HardSafetyCage with non-bypassable limits for order placement and strategy promotion
crates/strategy_core/src/lib.rs Defines strategy family taxonomy and plugin trait for regime-aware signal generation
crates/exchange_core/src/lib.rs Provides normalized order types and exchange adapter trait for venue-agnostic execution
crates/trading_daemon/Dockerfile Updates build to include new dependency crates
crates/trading_daemon/Cargo.toml Removes kalshi_client dependency, adds risk_core, strategy_core dependencies
.openclaw/extensions/trading-bridge/src/index.ts Adds trading_engine_status, trading_engine_control, trading_strategy_manage, and trading_risk_status tools
crates/README.md Documents new crates and updated command namespaces
crates/Cargo.toml Registers new workspace members
crates/Cargo.lock Reflects dependency tree changes with removal of unused crates

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/trading_daemon/src/main.rs Outdated
Comment on lines +655 to +658
state
.risk_snapshot
.strategy_canary_notional
.insert(strategy_id.clone(), payload.requested_canary_notional_cents);

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The risk snapshot's strategy_canary_notional is being set to the new requested value, but it doesn't account for any existing canary notional from a previous deployment. If a strategy already has a canary deployment with X notional, and is promoted again with Y notional, the total should be checked against X+Y, not just Y. This could allow circumventing the hard safety limits by repeatedly promoting with small increments.

Consider using saturating_add to combine the existing value with the new value, or alternatively, replacing the old value if that's the intended behavior (in which case, the promotion evaluation logic in HardSafetyCage should also be adjusted to account for replacement rather than addition).

Copilot uses AI. Check for mistakes.
"connected": true
"connected": true,
}),
),

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Status command now returns Engine.Status but is still labeled as mapping to Control.Status in line 336. This creates inconsistency in the command hierarchy. Consider either keeping ControlCommand::Status as is and mapping it appropriately, or documenting clearly that the Status command has been migrated to EngineCommand. The current implementation mixes both approaches which could confuse users.

Suggested change
),
),
// NOTE: ControlCommand::Status is kept for backward compatibility and
// intentionally returns the engine status (Engine.Status) via `status_response`.
// New code should prefer the engine-level status command where appropriate.

Copilot uses AI. Check for mistakes.
Comment thread crates/trading_daemon/src/main.rs Outdated
Comment on lines +359 to +363
ControlCommand::Stop => {
let mut state = state.lock().await;
state.running = false;
state.paused = true;
state.last_command_at_ms = now_ms();

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Stop command sets both running=false and paused=true, while the Pause command also sets both running=false and paused=true. This makes these two commands functionally identical in terms of state changes. Consider differentiating them (e.g., Stop could indicate a clean shutdown intent while Pause indicates temporary suspension) or document why they have identical effects.

Copilot uses AI. Check for mistakes.
Comment on lines 102 to 302
@@ -36,7 +112,7 @@ async fn main() -> Result<()> {
ensure_socket_parent_dir(&socket_path)?;
let _lock_file = acquire_single_instance_lock(&lock_path)?;
let listener = bind_listener(&socket_path)?;
let state = Arc::new(Mutex::new(EngineState::default()));
let state = Arc::new(Mutex::new(initial_engine_state()));

let terminate = signal::ctrl_c();
tokio::pin!(terminate);
@@ -73,6 +149,72 @@ async fn main() -> Result<()> {
Ok(())
}

fn initial_engine_state() -> EngineState {
let now = now_ms();
EngineState {
running: false,
paused: false,
kill_switch_engaged: false,
risk_tripped: false,
started_at_ms: now,
last_command_at_ms: now,
strategies: default_strategies(),
recent_events: VecDeque::new(),
risk_snapshot: RiskSnapshot::default(),
safety_policy: HardSafetyPolicy::default(),
}
}

fn default_strategies() -> HashMap<String, StrategyState> {
[
(
"kalshi.arbitrage",
StrategyFamily::Arbitrage,
"builtin-kalshi-arb",
),
(
"kalshi.market_making",
StrategyFamily::MarketMaking,
"builtin-kalshi-mm",
),
(
"core.mean_reversion",
StrategyFamily::MeanReversion,
"builtin-mean-reversion",
),
(
"core.momentum",
StrategyFamily::Momentum,
"builtin-momentum",
),
]
.into_iter()
.map(|(id, family, source)| {
(
id.to_string(),
StrategyState {
id: id.to_string(),
enabled: true,
family,
source: source.to_string(),
version: 1,
canary_deployment: false,
canary_notional_cents: 0,
active_code_hash: None,
candidate: None,
},
)
})
.collect()
}

fn now_ms() -> i64 {
match SystemTime::now().duration_since(UNIX_EPOCH) {
Ok(dur) => dur.as_millis() as i64,
Err(_) => 0,
}
}

fn socket_path_from_env() -> String {
std::env::var("TRADING_SOCKET_PATH").unwrap_or_else(|_| DEFAULT_SOCKET_PATH.to_string())
}
@@ -159,61 +301,557 @@ async fn handle_connection(stream: UnixStream, state: Arc<Mutex<EngineState>>) {
}

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trading_daemon crate has no test coverage for the complex state management and command processing logic. Given the critical nature of this trading engine (handling kill switches, risk limits, strategy promotions), comprehensive unit tests should be added to verify state transitions, error conditions, and edge cases. The protocol crate has tests, which shows testing is a codebase convention.

Copilot uses AI. Check for mistakes.
Comment on lines 230 to +231
case "status":
return "Control.Status";
return "Engine.Status";

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command "status" maps to "Engine.Status" in the TypeScript extension, but ControlCommand::Status in the Rust daemon also maps to status_response which returns Engine.Status data. This creates confusion about which namespace the status command belongs to. Additionally, the tradingctl CLI maps the Status command to Engine.Status but labels it as "Legacy status command". Consider consolidating to a single clear API surface.

Copilot uses AI. Check for mistakes.
Comment thread crates/trading_daemon/src/main.rs Outdated
Comment on lines +640 to +646
strategy.version = strategy.version.saturating_add(1);
strategy.enabled = true;
strategy.canary_deployment = true;
strategy.canary_notional_cents = payload.requested_canary_notional_cents;
strategy.active_code_hash = Some(payload.code_hash.clone());
strategy.source = candidate.source;
strategy.candidate = None;

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The StrategyState fields enabled, canary_deployment, canary_notional_cents, and active_code_hash are all being set during promotion, but there's no validation that these changes are consistent with the rest of the system state. For instance, when setting enabled=true automatically, there's no check whether the strategy should actually be enabled based on system state (e.g., if the engine is paused or kill switch is engaged). Consider adding validation or documenting why automatic enabling during promotion is always safe.

Copilot uses AI. Check for mistakes.
Comment thread crates/trading_daemon/src/main.rs Outdated
id.to_string(),
StrategyState {
id: id.to_string(),
enabled: true,

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default strategies are hardcoded with enabled=true at initialization. This means all strategies are enabled when the daemon starts, even if the engine itself is not running. Consider whether this is the intended behavior, or if strategies should only be enabled after an explicit Start command or after the operator has reviewed the configuration.

Suggested change
enabled: true,
enabled: false,

Copilot uses AI. Check for mistakes.
Comment thread crates/trading_daemon/src/main.rs Outdated
Comment on lines +382 to +417
let mut state = state.lock().await;
state.paused = true;
state.running = false;
state.risk_snapshot.paused = true;
state.last_command_at_ms = now_ms();
let event = engine_health_event(&state);
push_event(&mut state, event);
status_response(request, &state)
}
EngineCommand::Resume => {
let mut state = state.lock().await;
if state.kill_switch_engaged {
return Envelope::response_to(
request,
json!({
"ok": false,
"error": "Kill switch engaged; cannot resume"
}),
);
}
state.paused = false;
state.running = true;
state.risk_snapshot.paused = false;
state.last_command_at_ms = now_ms();
let event = engine_health_event(&state);
push_event(&mut state, event);
status_response(request, &state)
}
EngineCommand::KillSwitch => {
let mut state = state.lock().await;
state.kill_switch_engaged = true;
state.risk_tripped = true;
state.running = false;
state.paused = true;
state.risk_snapshot.kill_switch_engaged = true;
state.risk_snapshot.paused = true;

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pause field in risk_snapshot is being set directly in the Pause and Resume commands, but it's not being updated in the Start and Stop commands. This could lead to inconsistent state between state.paused and state.risk_snapshot.paused. Either remove the duplicate field or ensure both are always kept in sync.

Copilot uses AI. Check for mistakes.
Comment on lines +342 to +349
if state.kill_switch_engaged {
return Envelope::response_to(
request,
json!({
"ok": false,
"error": "Kill switch engaged; cannot start until operator reset"
}),
);

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kill_switch_engaged check prevents starting the engine, but there's no mechanism provided to reset the kill switch. Once engaged, the kill switch permanently prevents the engine from starting until the daemon is restarted. Consider adding a RiskCommand to reset the kill switch after operator verification, or document the intended recovery procedure.

Copilot uses AI. Check for mistakes.
Comment thread crates/tradingctl/src/main.rs Outdated
Comment on lines +178 to +179
serde_json::from_str(raw)
.unwrap_or_else(|_| serde_json::Value::String(raw.to_string()))

Copilot AI Feb 17, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RiskOverride command silently falls back to treating the value as a string if JSON parsing fails. This could mask configuration errors where the operator intended to pass a JSON object but made a syntax error. Consider logging a warning or returning an error when JSON parsing fails, or at minimum document this fallback behavior clearly.

Suggested change
serde_json::from_str(raw)
.unwrap_or_else(|_| serde_json::Value::String(raw.to_string()))
match serde_json::from_str(raw) {
Ok(v) => v,
Err(e) => {
eprintln!(
"Warning: failed to parse risk override value as JSON ({}). \
Treating it as a plain string instead: {}",
e, raw
);
serde_json::Value::String(raw.to_string())
}
}

Copilot uses AI. Check for mistakes.
@jadenfix

Copy link
Copy Markdown
Owner Author

@greptile

@greptile-apps

greptile-apps Bot commented Feb 17, 2026

Copy link
Copy Markdown

Greptile Summary

This PR adds a comprehensive Rust-based trading infrastructure with OpenClaw integration. The implementation introduces a Unix domain socket daemon that manages trading strategies with hard safety controls, canary deployments, and autonomous code promotion workflows.

Major Changes:

  • Added 6 new Rust crates: trading_protocol, trading_daemon, tradingctl, exchange_core, strategy_core, and risk_core
  • Expanded TypeScript bridge with 4 new tools: trading_engine_status, trading_engine_control, trading_strategy_manage, and trading_risk_status
  • Implemented strategy candidate upload/promotion system with multi-gate validation (compile, replay, paper, latency, risk)
  • Added hard safety cage that enforces non-bypassable risk limits on total notional, per-strategy canary notional, order rate, and drawdown
  • Introduced engine controls: pause/resume and manual kill switch
  • Created Docker build with proper security (non-root user, file locking for single instance)

Critical Issue:

  • Race condition in crates/trading_daemon/src/main.rs:655-658 - concurrent promotions can exceed safety limits because strategy_canary_notional is updated AFTER the promotion succeeds, allowing multiple promotions to pass the check with stale snapshot data

Architecture Notes:

  • The promotion logic correctly treats canary deployments as replacements (not additions) in the risk evaluation
  • Strategy state is entirely in-memory and lost on daemon restart
  • No TTL or expiration mechanism for uploaded candidates

Confidence Score: 3/5

  • This PR has a critical race condition in concurrent strategy promotions that can bypass safety limits
  • While the architecture is solid and the hard safety cage correctly handles promotion as replacement, the race condition at lines 655-658 in the daemon allows concurrent promotions to exceed limits. The snapshot is read before the lock, both promotions pass the check with stale data, then both update the notional sequentially. This is a critical safety violation for autonomous trading systems.
  • Pay close attention to crates/trading_daemon/src/main.rs - the promotion race condition needs to be resolved before production deployment

Important Files Changed

Filename Overview
.openclaw/extensions/trading-bridge/src/index.ts Expanded trading bridge with new engine controls, strategy management, and risk monitoring tools
crates/trading_daemon/src/main.rs Core daemon with strategy promotion logic and risk evaluation; has race condition in concurrent promotions
crates/risk_core/src/lib.rs Hard safety cage implementation correctly treats promotions as replacements, not additions
crates/trading_protocol/src/lib.rs Protocol definitions for UDS communication with envelope structure and command types

Sequence Diagram

sequenceDiagram
    participant Agent as AI Agent/Client
    participant Bridge as Trading Bridge (TS)
    participant Daemon as Trading Daemon (Rust)
    participant Risk as Risk Core
    participant Strategy as Strategy State

    Agent->>Bridge: trading_strategy_manage(upload_candidate)
    Bridge->>Daemon: Strategy.UploadCandidate
    Daemon->>Strategy: Store candidate with gates
    Daemon-->>Bridge: {ok: true, candidate}
    Bridge-->>Agent: Upload confirmation

    Agent->>Bridge: trading_strategy_manage(promote_candidate)
    Bridge->>Daemon: Strategy.PromoteCandidate
    Daemon->>Risk: evaluate_promotion(request, snapshot)
    Risk->>Risk: Check gates, limits, kill switch
    Risk-->>Daemon: Allow/Deny decision
    alt Promotion Allowed
        Daemon->>Strategy: Update version, enable canary
        Daemon->>Daemon: Update risk_snapshot.strategy_canary_notional
        Daemon-->>Bridge: {ok: true, strategy, version}
        Bridge-->>Agent: Promotion success
    else Promotion Denied
        Daemon-->>Bridge: {ok: false, error: reason}
        Bridge-->>Agent: Promotion failed
    end

    Agent->>Bridge: trading_engine_control(pause)
    Bridge->>Daemon: Engine.Pause
    Daemon->>Daemon: Set paused=true, running=false
    Daemon-->>Bridge: Status response
    Bridge-->>Agent: Engine paused

    Daemon->>Bridge: Event.RiskAlert
    Bridge->>Agent: Broadcast alert message
Loading

Last reviewed commit: 4077d8e

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

15 files reviewed, 5 comments

Edit Code Review Agent Settings | Greptile

Comment thread crates/trading_daemon/src/main.rs Outdated
Comment on lines +655 to +658
state
.risk_snapshot
.strategy_canary_notional
.insert(strategy_id.clone(), payload.requested_canary_notional_cents);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Race condition still exists - if two promotions run concurrently, both read the old strategy_canary_notional, pass the safety check, then both update it sequentially, exceeding the limit. The update happens AFTER the promotion succeeds at line 655-658, but the safety check at line 627 reads the stale value.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a false positive, the code is protected by a Mutex correct?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right - this is a false positive. The entire request processing path holds the mutex lock:

  1. process_strategy_request acquires the lock at line 449: let mut state = state.lock().await;
  2. The safety check reads state.risk_snapshot while holding the lock
  3. The promotion updates state.risk_snapshot.strategy_canary_notional while still holding the same lock
  4. The lock is only released when state goes out of scope at the end of the function

Since the mutex is held for the entire duration from reading to writing, no other request can interleave. Concurrent promotion requests will be serialized by the mutex.

My apologies for the incorrect assessment.

risk_tripped: false,
started_at_ms: now,
last_command_at_ms: now,
strategies: default_strategies(),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All strategy state (versions, candidates, active code hashes) is lost on daemon restart. If the daemon crashes during a canary deployment, there's no way to recover the deployment state.

Comment on lines +482 to +487
source: typeof input.source === "string" && input.source.trim() !== "" ? input.source : "agent",
code_hash: asString(input.code_hash, "code_hash"),
requested_canary_notional_cents:
typeof input.requested_canary_notional_cents === "number"
? input.requested_canary_notional_cents
: 250,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default value of 250 cents ($2.50) for canary notional seems very low for any meaningful trading strategy testing. Consider if this should be higher or documented more clearly.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment thread crates/trading_daemon/src/main.rs Outdated
Comment on lines +589 to +600
let candidate = match strategy.candidate.clone() {
Some(candidate) => candidate,
None => {
return Envelope::response_to(
request,
json!({
"ok": false,
"error": format!("No uploaded candidate for '{}'", payload.strategy_id)
}),
);
}
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The candidate is cloned from the strategy, but if there's no candidate, an error is returned. However, there's no validation that the candidate hasn't expired or been invalidated since upload.

Comment thread crates/trading_daemon/src/main.rs Outdated
Comment on lines +605 to +613
if candidate.code_hash != payload.code_hash {
return Envelope::response_to(
request,
json!({
"ok": false,
"error": "candidate code hash mismatch"
}),
);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code hash mismatch check is good for preventing stale promotions, but there's no timestamp or TTL check. An uploaded candidate from days/weeks ago could still be promoted if the hash matches.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@jadenfix

Copy link
Copy Markdown
Owner Author

@greptile

@greptile-apps

greptile-apps Bot commented Feb 17, 2026

Copy link
Copy Markdown

Greptile Summary

This PR introduces a comprehensive risk management and strategy abstraction layer for the trading daemon. The changes add three new core crates (exchange_core, strategy_core, risk_core) that establish foundational interfaces for venue-agnostic trading and autonomous strategy execution.

Major Changes:

  • Implemented strategy state persistence to /var/run/openclaw/trading-state.json to survive daemon restarts
  • Added HardSafetyCage with non-bypassable risk policies enforcing limits on notional, drawdown, and order rates
  • Introduced strategy candidate system with upload/promote workflow and optional TTL validation
  • Extended trading bridge with new tools for strategy management and risk monitoring
  • Added comprehensive test coverage for promotion logic, TTL validation, and state persistence

Key Safety Features:

  • Mutex-protected promotion path prevents race conditions
  • Strategy promotions require all gates (compile, replay, paper, latency, risk) to pass
  • Canary deployments limited by max_strategy_canary_notional_cents (default 2500 cents)
  • Candidate TTL validation prevents stale code promotion (disabled by default with DEFAULT_CANDIDATE_TTL_MS = 0)

Architecture:
The new abstraction layers (StrategyPlugin, ExchangeAdapter) provide clean separation between strategy logic, execution, and risk management, enabling future extension with custom strategies and exchange integrations.

Confidence Score: 4/5

  • This PR is safe to merge with minor risk - comprehensive safety checks are in place
  • Strong implementation with mutex protection, state persistence, and hard safety cage enforcement. The comprehensive test suite validates critical paths. Score is 4/5 rather than 5/5 because candidate TTL validation is disabled by default (must be explicitly configured via environment variable), and the default canary notional of $2.50 may require tuning for production use cases.
  • Pay close attention to crates/trading_daemon/src/main.rs for the promotion logic and state persistence implementation

Important Files Changed

Filename Overview
crates/trading_daemon/src/main.rs added strategy state persistence, candidate TTL validation, and comprehensive safety checks via HardSafetyCage integration
crates/risk_core/src/lib.rs new hard safety cage implementation with non-bypassable risk policies for autonomous trading strategies
crates/strategy_core/src/lib.rs new strategy abstraction layer defining StrategyPlugin trait and regime-aware signal generation
crates/exchange_core/src/lib.rs new exchange abstraction defining ExchangeAdapter trait for venue-agnostic order execution
.openclaw/extensions/trading-bridge/src/index.ts added new strategy management tools (upload_candidate, promote_candidate) and risk status endpoint to trading bridge
crates/trading_protocol/src/lib.rs expanded protocol with strategy lifecycle events, risk alerts, and candidate promotion payloads

Class Diagram

classDiagram
    class TradingDaemon {
        +EngineState state
        +handle_connection()
        +process_request()
        +promote_candidate_locked()
        +persist_strategy_state()
    }
    
    class EngineState {
        +HashMap~String, StrategyState~ strategies
        +RiskSnapshot risk_snapshot
        +HardSafetyPolicy safety_policy
        +bool running
        +bool paused
    }
    
    class StrategyState {
        +String id
        +bool enabled
        +StrategyFamily family
        +u64 version
        +bool canary_deployment
        +i64 canary_notional_cents
        +Option~String~ active_code_hash
        +Option~StrategyCandidate~ candidate
    }
    
    class StrategyCandidate {
        +String source
        +String code_hash
        +i64 requested_canary_notional_cents
        +bool compile_passed
        +bool replay_passed
        +bool paper_passed
        +bool latency_passed
        +bool risk_passed
    }
    
    class HardSafetyCage {
        +HardSafetyPolicy policy
        +evaluate_promotion()
        +evaluate_order()
    }
    
    class RiskSnapshot {
        +i64 total_notional_cents
        +i64 drawdown_cents
        +u32 orders_last_minute
        +bool kill_switch_engaged
        +HashMap~String, i64~ strategy_canary_notional
    }
    
    class StrategyPlugin {
        <<interface>>
        +id() String
        +family() StrategyFamily
        +evaluate() Result~SignalIntent~
    }
    
    class ExchangeAdapter {
        <<interface>>
        +venue() String
        +place_order() Result~OrderAck~
        +cancel_order() Result
        +sync_positions() Result~Vec~PositionSnapshot~~
    }
    
    class TradingBridge {
        +TradingClient client
        +registerTool trading_engine_status
        +registerTool trading_strategy_manage
        +registerTool trading_risk_status
    }
    
    TradingDaemon --> EngineState
    EngineState --> StrategyState
    EngineState --> RiskSnapshot
    EngineState --> HardSafetyPolicy
    StrategyState --> StrategyCandidate
    TradingDaemon --> HardSafetyCage
    HardSafetyCage --> HardSafetyPolicy
    HardSafetyCage --> RiskSnapshot
    StrategyPlugin --> StrategyFamily
    ExchangeAdapter --> NormalizedOrderRequest
    TradingBridge --> TradingDaemon : UDS socket
Loading

Last reviewed commit: 6b2ef3d

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@jadenfix jadenfix merged commit 4703ac8 into main Feb 17, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants