[POC] feat: same slot composability and real time proving #804
AnshuJalan wants to merge 24 commits into master
52be092 to c574668
* refac: add short commit hash as tag for docker build
* fix: image not found in staging when using commit hash
4e92420 to 8fbd4a9
- uses: docker/login-action@v3
- name: Login to JFrog Artifactory
  uses: docker/login-action@v3

Check warning: Code scanning / CodeQL
Unpinned tag for a non-immutable Action in workflow (Medium)
steps:
- uses: docker/login-action@v3
- name: Login to JFrog Artifactory
  uses: docker/login-action@v3

Check warning: Code scanning / CodeQL
Unpinned tag for a non-immutable Action in workflow (Medium)
bf7235f to 43f86c4
* feat: surge actual real time proving
* feat: nits
* fix: preconfing
* raiko interaction
* feat: connect with surge verifier
* fix: proof body
* feat: recovery
* feat: update catalyst
* fix: bind server to 0.0.0.0
* fix: bind server to 0.0.0.0
* feat: add faster polling
* feat: hop proving
* fix: temp push image to prod
* feat: resilience
* feat(realtime): L2 UserOps for bridge-out withdrawals (#922)
* feat(realtime): support L2 UserOps for bridge-out (L2→L1 withdrawals)

  Add the ability for users to submit UserOps that execute on L2, enabling bridge-out functionality. The catalyst now processes both L1→L2 deposits and L2→L1 withdrawals in the same block.

  Changes:
  - New `surge_sendL2UserOp` RPC method for submitting L2-targeted UserOps
  - L2 UserOp execution transactions are constructed and included in L2 blocks
  - After block execution, existing `find_l1_call()` detects the resulting bridge MessageSent events and relays them to L1 via processMessage
  - Block building handles mixed deposit + withdrawal transactions
  - Remove `disable_bridging` gate that prevented bridge handler startup

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove unused deposit watcher and plan doc

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(realtime): single surge_sendUserOp RPC with auto chain detection

  Instead of separate RPC methods for L1 and L2 UserOps, the single surge_sendUserOp endpoint now auto-detects the target chain by parsing the EIP-712 signature in the executeBatch calldata. The UserOpsSubmitter's EIP-712 domain includes chainId, so the signature is only valid for one chain. We compute the EIP-712 digest for both L1 and L2 chain IDs, ecrecover each, and route accordingly:
  - L1 signature → L1→L2 deposit flow (simulate on L1, processMessage on L2)
  - L2 signature → L2 direct execution (UserOp tx in L2 block, L2→L1 relay via find_l1_call)

  Both types can coexist in the same block.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(realtime): use explicit chainId param instead of signature detection

  The surge_sendUserOp RPC now accepts an optional chainId field in the UserOp params. If chainId matches L2, the UserOp is executed directly on L2. Otherwise defaults to L1 (backwards compatible). Removes the EIP-712 signature parsing logic which was unreliable (ecrecover always returns a non-zero address).

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(realtime): increase L2 tx gas limit to 3M

  processMessage and L2 UserOp transactions need more gas for operations that deploy contracts (e.g. CREATE2 smart wallet deployment via bridge relay). 1M gas was insufficient; the bridge's post-call gas check was failing.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(realtime): only increase gas for bridge/UserOp txs, not anchor

  The anchor tx has a required gas limit enforced by the L2 engine. Revert anchor to 1M, keep processMessage and UserOp txs at 3M.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(realtime): track L2 UserOp status through proving and submission

  L2 direct UserOps now get ProvingBlock and Executed/Rejected status updates, same as L1 UserOps. Added l2_user_op_ids to Proposal struct and included them in the async submitter's status tracking. Also adds cleanup: status entries are removed from sled after 60s to prevent unbounded disk growth.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address code review feedback on L2 UserOps PR

  - Fix misleading Processing(zero_hash) status: remove premature status update for L2Direct UserOps; status remains Pending until async_submitter sets ProvingBlock
  - Propagate add_l2_user_op_id error instead of silently ignoring with let _, which could leave status entries orphaned forever
  - Remove redundant target_chain variable, simplify to direct chain_id comparison
  - Fix doc comment: routing is based on chainId field, not EIP-712 signature
  - Add SAFETY comment for Recovered::new_unchecked explaining why it's correct

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address second round of review feedback

  - Mark L2 UserOp IDs as Rejected on L1 multicall failure (were stuck at ProvingBlock)
  - Track L2 UserOp ID before inserting tx into block (prevents executed-but-Rejected state)
  - Reject UserOps with unknown chainId instead of silently treating as L1

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: clarify debug log when L2Direct UserOp is handled

  The "No pending UserOps" log was misleading when an L2Direct op was processed, since add_pending_user_ops_to_draft_block returns None for both "nothing queued" and "L2Direct handled". Updated to distinguish.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* logs

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: AnshuJalan <anshujalan206@gmail.com>

* fix: apply cargo fmt to resolve CI lint failures

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Justin Chan <justin.chan@nethermind.io>
Co-authored-by: Ahmad Bitar <33181301+smartprogrammer93@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: smartprogrammer93 <smartprogrammer@windowslive.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
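The chainId-based routing described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the variant names `L1ToL2` and `L2Direct`, the function `route_user_op`, and the treatment of a missing chainId as L1 (for backwards compatibility) are assumptions made for the example.

```rust
/// Where a submitted UserOp should be executed (hypothetical variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UserOpRouting {
    /// Simulate on L1, then relay to L2 via processMessage (deposit flow).
    L1ToL2,
    /// Execute directly in an L2 block (withdrawal flow).
    L2Direct,
}

/// Routes a UserOp by its optional `chainId` field.
/// Assumption: a missing chainId falls back to L1; an unknown one is rejected.
pub fn route_user_op(
    chain_id: Option<u64>,
    l1_chain_id: u64,
    l2_chain_id: u64,
) -> Result<UserOpRouting, String> {
    match chain_id {
        None => Ok(UserOpRouting::L1ToL2),
        Some(id) if id == l1_chain_id => Ok(UserOpRouting::L1ToL2),
        Some(id) if id == l2_chain_id => Ok(UserOpRouting::L2Direct),
        Some(id) => Err(format!("unknown chainId {id}")),
    }
}
```

Returning an error for an unrecognised chainId, rather than defaulting to L1, matches the "Reject UserOps with unknown chainId" fix listed in the second round of review feedback.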
3d0e9ad to 5c661ce
…env var

Restores the match-based fork selection from master so Shasta, Pacaya, and Permissionless modes continue to work. Adds Realtime as a new Fork variant, only activated via FORK=realtime env var override (never auto-selected by timestamps).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
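A rough sketch of the selection rule this commit describes: timestamps pick among the existing forks, and `Realtime` is reachable only through the override. The function name `select_fork`, its parameters, and the activation order Pacaya → Shasta → Permissionless are assumptions for illustration; the real selection logic lives in the repository.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Fork {
    Pacaya,
    Shasta,
    Permissionless,
    Realtime,
}

/// Picks the active fork. `fork_override` stands in for the FORK env var.
/// Assumption: forks activate in the order Pacaya -> Shasta -> Permissionless.
pub fn select_fork(
    fork_override: Option<&str>,
    now: u64,
    shasta_ts: u64,
    permissionless_ts: u64,
) -> Result<Fork, String> {
    if let Some(name) = fork_override {
        return match name.to_ascii_lowercase().as_str() {
            "pacaya" => Ok(Fork::Pacaya),
            "shasta" => Ok(Fork::Shasta),
            "permissionless" => Ok(Fork::Permissionless),
            "realtime" => Ok(Fork::Realtime),
            other => Err(format!("unknown FORK override: {other}")),
        };
    }
    // Timestamp-based selection never yields Fork::Realtime.
    Ok(if now >= permissionless_ts {
        Fork::Permissionless
    } else if now >= shasta_ts {
        Fork::Shasta
    } else {
        Fork::Pacaya
    })
}
```

A caller would pass `std::env::var("FORK").ok().as_deref()` as the first argument.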
@claude review
Claude finished @smartprogrammer93's task in 3m 5s

PR Review: [POC] feat: same slot composability and real time proving
Summary

This is a well-structured POC for same-slot L1↔L2↔L1 composability and real-time ZK proving. The architecture is clear:

Critical

1. Hardcoded Foundry private key in two places

Recommended: Source the key from an environment variable at minimum, or remove the signature-based proof path from Shasta entirely.

Medium

2. Silent data loss for multiple UserOps and L1Calls [

    // Only first UserOp is processed:
    if !batch.user_ops.is_empty() && let Some(user_op) = batch.user_ops.first()
    // Only first L1Call is processed:
    if !batch.l1_calls.is_empty() && let Some(l1_call) = batch.l1_calls.first()

If multiple UserOps or L1Calls are queued in a batch, all but the first are silently dropped at submission time. The batch builder correctly accumulates them (

3.

    config.l1_rpc_urls.first().expect("L1 RPC URL is required")

The workspace enforces

4. Duplicated blob sidecar construction [

The blob sidecar is built twice: once in

5. UserOp ID counter resets on restart, sled DB persists [

    next_id: Arc::new(AtomicU64::new(1)),

The in-memory counter starts at 1 on each restart, but

6. Hardcoded bridge/signal service addresses via string formatting [

    let bridge_address: Address =
        format!("0x{}{}01", chain_id_string, "0".repeat(zeros_needed)).parse()?;
    let signal_service: Address =
        format!("0x{}{}05", chain_id_string, "0".repeat(zeros_needed)).parse()?;

This assumes a specific address derivation from chain ID with no documentation. If the contracts are deployed differently, or the chain ID format changes, this silently produces a wrong address and all bridge operations fail. Should be sourced from the L2 node's contract config or env vars.

7. Bridge RPC server binds to hardcoded address [

    let bridge_addr: SocketAddr = "0.0.0.0:4545".parse()?;

This should come from

8.

9. Fixed gas limits on L2 transactions [

    gas_limit: 3_000_000,
    max_fee_per_gas: 1_000_000_000,

These magic numbers are hardcoded for both

10. Unimplemented stub silently succeeds [

    pub async fn transfer_eth_from_l2_to_l1(...) -> Result<(), Error> {
        warn!("Implement bridge transfer logic here");
        Ok(())
    }

This returns

Low Priority / Style

11. No tests for

The entire new

12. All Raiko polling responses logged at

    warn!("Raiko response (attempt {}): HTTP {} | body: {}", ...);

Every single polling attempt, including normal

13.

    let preconf_only = std::env::var("PRECONF_ONLY")
        .map(|v| v.to_lowercase() != "false" && v != "0")
        .unwrap_or(true);

The default is

14.

15.

    u16::try_from(...).unwrap_or(u16::MAX)

If the block count overflows

16. 60-second status cleanup window [

After a UserOp completes, its status entry is deleted after 60 seconds. Users polling after this window see

Architecture Notes

The overall design is sound for a POC. Key observations:
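To make the behaviour of the `PRECONF_ONLY` snippet quoted in the review easier to see, here is the same predicate extracted into a pure function. The function name `parse_preconf_only` and the `Option<&str>` parameter are stand-ins for the environment lookup and are not part of the PR.

```rust
/// Mirrors the quoted expression: an unset variable means `true`, and only
/// "false" (in any letter case) or "0" turn the flag off.
pub fn parse_preconf_only(raw: Option<&str>) -> bool {
    raw.map(|v| v.to_lowercase() != "false" && v != "0")
        .unwrap_or(true)
}
```

Under this rule `None`, `"true"`, `"no"`, and the empty string all yield `true`, while `"false"`, `"FALSE"`, and `"0"` yield `false`. Values such as `"no"` or `"off"` therefore do not disable the flag.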
- Fix silent drop of multiple UserOps/L1Calls in proposal tx builder
- Replace expect() with ok_or_else() to comply with clippy rules
- Make bridge RPC address configurable via BRIDGE_RPC_ADDR env var
- Pass raiko config through EthereumL1Config instead of re-reading env
- Reduce Raiko polling log level from warn to debug

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude review
Claude finished @smartprogrammer93's task in 2m 22s

PR Review: [POC] feat: same slot composability and real time proving
Summary

This is a well-structured POC for same-slot L1↔L2↔L1 composability and real-time ZK proving. The architecture is clear:

Several findings from the previous review pass have been addressed (

Critical

1. Hardcoded Foundry private key in two Shasta files

The

Recommended: Source the key from an env var at minimum, or gate the Shasta bridge path behind a compile-time feature flag.

Medium

2.

    assert!(
        !self.is_busy(),
        "Cannot submit while another submission is in flight"
    );

3. Duplicated blob sidecar construction [

The blob sidecar is built twice from the same manifest data: once in

4. UserOp ID counter resets on restart; sled DB persists [

    next_id: Arc::new(AtomicU64::new(1)),

On restart, new UserOps get IDs starting from 1 again, while the sled status DB retains old entries. Clients polling stale IDs could receive mismatched status responses. IDs should be seeded from the sled DB's current max key, or use a timestamp/UUID-based scheme.

5. Bridge/signal service addresses derived from chain ID via string formatting [

    let bridge_address: Address =
        format!("0x{}{}01", chain_id_string, "0".repeat(zeros_needed)).parse()?;
    let signal_service: Address =
        format!("0x{}{}05", chain_id_string, "0".repeat(zeros_needed)).parse()?;

This derivation is undocumented and fragile. If chain ID format changes or contracts are deployed differently, silent misconfiguration occurs. Should be sourced from config or the L2 node's contract registry.

6. Fixed gas limits on L2 transactions [

    gas_limit: 3_000_000,
    max_fee_per_gas: 1_000_000_000, // 1 gwei

Both

7.

    pub async fn transfer_eth_from_l2_to_l1(...) -> Result<(), Error> {
        warn!("Implement bridge transfer logic here");
        Ok(())
    }

Returns

8. Sled DB path hardcoded [

    let status_store = UserOpStatusStore::open("data/user_op_status")?;

The relative path

Low Priority / Style

9. No tests for

The entire new

10.

    let preconf_only = std::env::var("PRECONF_ONLY")
        .map(|v| v.to_lowercase() != "false" && v != "0")
        .unwrap_or(true);

Default is

11.

12. 60-second status cleanup window [

After a UserOp completes, its status entry is removed after 60 seconds. Clients polling after this window receive

Architecture Notes
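The review's suggestion to seed UserOp IDs from the persisted store's current max key might look roughly like this. It is a sketch under stated assumptions: a `BTreeMap<u64, String>` stands in for the sled status DB so the example stays self-contained, and `IdAllocator`, `seeded_from`, and `allocate` are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hands out UserOp IDs that never collide with IDs already persisted.
pub struct IdAllocator {
    next_id: Arc<AtomicU64>,
}

impl IdAllocator {
    /// Starts one past the largest persisted key, or at 1 for an empty store.
    /// `store` is a stand-in for the on-disk status DB, keyed by UserOp ID.
    pub fn seeded_from(store: &BTreeMap<u64, String>) -> Self {
        let max_key = store.keys().next_back().copied().unwrap_or(0);
        Self {
            next_id: Arc::new(AtomicU64::new(max_key.saturating_add(1))),
        }
    }

    /// Returns the next unused ID and advances the counter.
    pub fn allocate(&self) -> u64 {
        self.next_id.fetch_add(1, Ordering::Relaxed)
    }
}
```

With a real ordered key-value store, the same idea applies as long as keys are encoded so that byte order matches numeric order (for example big-endian `u64`), which makes the last key the maximum ID.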
- Collapse nested if statements (clippy::collapsible_if)
- Use u8::from() instead of as cast (clippy::cast_lossless)
- Allow async_fn_in_trait on internal Surge traits
- Ignore RUSTSEC-2024-0384 (instant via sled, no safe upgrade)
- Remove stale RUSTSEC-2026-0002 ignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix unused imports in realtime crate
- Add #[allow(dead_code)] for POC code not yet wired up
- Replace unwrap() with ok_or_else() in shasta proposal tx builder
- Collapse nested if statements in shasta execution layer
- Fix bool-to-u8 cast, useless conversions, clone-on-Copy
- Add #[allow(clippy::large_enum_variant)] on UserOpRouting
- Ignore RUSTSEC-2025-0057 (fxhash via sled, no safe upgrade)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: gh action to promote tag on prod
* fix: gh action to promote tag on prod
#929)

* feat: auto-reorg L2 blocks on L1 transaction failure instead of shutting down

  When an L1 multicall reverts, estimation fails, or the async submission (e.g. Raiko proof fetch) fails, Catalyst now reorgs the preconfirmed L2 blocks back to the last finalized state and resumes the preconfirmation loop immediately, instead of triggering a critical shutdown.

  The recovery reuses the existing reorg_unproposed_blocks() machinery which reads lastFinalizedBlockHash from L1 on-chain state and calls the driver's reorgStaleBlock endpoint to remove orphaned L2 blocks.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add surge_txStatus RPC method for tracking any L2 tx lifecycle

  Adds a new `surge_txStatus` JSON-RPC method that accepts either a `userOpId` (existing behavior) or a `txHash` (new: any L2 transaction).

  For txHash lookups, the handler queries L2 via eth_getTransactionByHash to find the block number, then compares against the last finalized block number on L1 to determine status:
  - block <= finalized → Executed
  - block > finalized → ProvingBlock { block_id }
  - tx pending → error (not yet in a block)
  - tx not found → error

  A shared AtomicU64 tracks last_finalized_block_number, updated by BatchManager on successful submission and during reorg recovery.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: return early after reorg recovery to avoid stale head verification

  After recover_from_failed_submission() reorgs L2 blocks, the l2_slot_info captured at the top of the heartbeat is stale (references pre-reorg head). The head verifier would detect this as an unexpected head and crash.

  Fix: return Ok(()) immediately after recovery so the next heartbeat picks up fresh L2 state. Also inline check_transaction_error_channel to apply the same early-return pattern for transaction error recovery.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: mark user ops as Rejected when Raiko proof fetch fails

  When the Raiko proof request fails (network error, prover down, etc.), user ops were left stuck at ProvingBlock status with no transition to Rejected. Now they are explicitly marked as Rejected with the failure reason before returning the error.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: recover from unexpected L2 head instead of crashing

  When an L1 tx reverts after optimistic success reporting, the geth/driver may externally reorg L2 before the TransactionReverted error arrives in the channel. The head verifier would detect the unexpected head and crash before recovery could run.

  Now the head verifier triggers recover_from_failed_submission() instead of cancel_on_critical_error(), allowing the node to resync with L1 state and continue operating.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: apply cargo fmt

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
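The txHash status rules listed in the `surge_txStatus` commit above can be expressed as a small pure function. This is an illustrative sketch only: `TxLookup`, `TxStatus`, and `classify_tx` are hypothetical names, and the lookup result is passed in directly instead of being fetched with eth_getTransactionByHash.

```rust
/// Result of looking a transaction up on L2 (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TxLookup {
    NotFound,
    Pending,
    Included { block_number: u64 },
}

/// Status reported to the caller (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TxStatus {
    Executed,
    ProvingBlock { block_id: u64 },
}

/// Applies the four rules from the commit message:
/// block <= finalized is Executed, block > finalized is ProvingBlock,
/// and pending or unknown transactions are errors.
pub fn classify_tx(lookup: TxLookup, last_finalized_block: u64) -> Result<TxStatus, String> {
    match lookup {
        TxLookup::Included { block_number } if block_number <= last_finalized_block => {
            Ok(TxStatus::Executed)
        }
        TxLookup::Included { block_number } => Ok(TxStatus::ProvingBlock {
            block_id: block_number,
        }),
        TxLookup::Pending => Err("transaction is pending (not yet in a block)".to_string()),
        TxLookup::NotFound => Err("transaction not found".to_string()),
    }
}
```

In the handler, `last_finalized_block` would be read from the shared AtomicU64 the commit mentions.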
Resolve conflicts:
- shasta/src/l2/execution_layer.rs: Combined our anchorV4WithSignalSlotsCall rename with master's anchorV3 fallback logic
- Cargo.lock: Regenerated from master's base

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Idea
We want the user to be able to perform an L1 -> L2 -> L1 transaction in the same L1 slot. For example:
- … L2Call)
- … L1Call)

Technicalities

- … UserOp to a surge_sendUserOp endpoint on Catalyst. This user op contains the bridge message for the L2Call.
- … L2Call transaction into the transaction list of the block it preconfirms.
- … UserOp submission on L1 (Step 1) as a "fast signal" into the anchor txn of the L2 block it creates.
- … L1Calls generated from the L2Call execution of step 2.

Proposal submission
A "multicall bundle" of 3 transactions is submitted on L1 in the exact sequence:
The POC is most understandable when we consider 1 block / batch and 1 user op / block.