Conversation
…lysis
- 00-codebase-mapping.md: maps existing PoRBFT v2 to Petri concepts
- 01-implementation-plan.md: 8-phase implementation plan
- 02-risks-and-considerations.md: risks, mitigations, open questions
- petri.md: pitch document in markdown
- Epic #9 created in mycelium with 37 atomic tasks
- Team Mode activated and persisted to AGENTS.md
- Update codebase mapping (v2) with post-stabilisation file paths
- Rewrite implementation plan (v2) with corrected paths and finalized design decisions
- Replace open questions in risks doc with finalized decisions table
- Add VADEMECUM operational guide (team mode, testing, guardrails)
- Update mycelium tasks with test phases and dependency links
New phase for exposing soft finality to SDK consumers. Marked as requiring user coordination before touching ../sdks/.
- Create src/libs/consensus/petri/ directory structure
- Define TransactionClassification enum (PRE_APPROVED, TO_APPROVE, PROBLEMATIC)
- Define StateDelta, PeerDelta interfaces
- Define ContinuousForgeRound, ForgeConfig, ForgeState interfaces
- Define PetriConfig with defaults (2s forge, 10s block, 7/10 threshold, 5-round TTL)
- Define DeltaComparison, RoundDeltaResult interfaces
- Add petriConsensus feature flag (default: false) to SharedState
- Create index.ts entry point stub with type re-exports
- Update VADEMECUM with diagram agent requirement and Phase 8 tests
- Create TransactionClassifier: classify txs as PRE_APPROVED (read-only) or TO_APPROVE (state-changing) based on GCR edit generation
- Create SpeculativeExecutor: simulate GCR edits without mutating state, produce deterministic delta hash via canonical JSON + SHA-256
- Create canonicalJson utility: deterministic serialization with sorted keys, BigInt handling, Map/Set support
- Extend MempoolTx entity with classification and delta_hash columns
- Add classification queries to Mempool: getByClassification(), getPreApproved(), updateClassification()
- Wire classifier into endpointValidation.ts (gated by petriConsensus flag)
- Add 28 unit tests: canonicalJson determinism, classifier logic, delta hash determinism — all passing
- Add Phase 0 architecture diagram
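Delta-hash determinism depends entirely on the serialization being canonical. A minimal sketch of what a canonicalJson-style helper can look like (the function name matches the utility above, but this body is illustrative; the real implementation additionally hashes the resulting string with SHA-256 to produce the delta hash):

```typescript
// Sketch of a deterministic JSON serializer: object keys are emitted in
// sorted order, BigInt values are tagged as strings (JSON.stringify would
// throw on them), and Map/Set are normalized before serialization.
function canonicalJson(value: unknown): string {
  if (
    value === null ||
    typeof value === "number" ||
    typeof value === "boolean" ||
    typeof value === "string"
  ) {
    return JSON.stringify(value)
  }
  if (typeof value === "bigint") return JSON.stringify(`bigint:${value}`)
  if (value instanceof Map) {
    // Serialize as a plain object; key sorting happens in the object branch
    return canonicalJson(Object.fromEntries(value))
  }
  if (value instanceof Set) {
    return canonicalJson([...value].sort())
  }
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(",")}]`
  }
  // Plain object: sort keys so output is independent of insertion order
  const obj = value as Record<string, unknown>
  const body = Object.keys(obj)
    .sort()
    .map(k => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`)
    .join(",")
  return `{${body}}`
}
```

With this shape, two objects that differ only in key insertion order serialize to the same string, which is what makes the downstream delta hash deterministic across peers.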
Add Phase 1 data flow, new module boxes (classifier, speculativeExecutor, canonicalJson), feature flag gate visualization, and modified module entries (Mempool entity, mempool_v2). Phase 0 content preserved.
… validation
Moved Petri classification from endpointValidation (where the tx doesn't exist in the mempool yet) to Mempool.addTransaction() (where the row is created). Classification and delta_hash are now saved atomically with the mempool entry. Also removed unused IsNull/Not imports from mempool_v2.
- Create DeltaAgreementTracker: tracks per-tx delta agreement across forge rounds, promotes at 7/10 threshold, flags after 5-round TTL
- Create ContinuousForge: 2s forge loop with mempool sync, speculative execution, all-to-all delta exchange, agreement evaluation, and classification updates (TO_APPROVE → PRE_APPROVED or PROBLEMATIC)
- Create forgeInstance singleton for RPC handler access
- Add petri_exchangeDeltas RPC handler to manageConsensusRoutines (gated by petriConsensus flag)
- Wire petriConsensusRoutine() in index.ts to create and start forge
- Add 22 new tests: DeltaAgreementTracker (15 tests covering promotion, flagging, TTL, mixed scenarios, boundary cases) + ContinuousForge state lifecycle (7 tests)
- Total: 50 tests passing across 5 files
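The promotion and flagging rules above can be sketched as a small state machine. This is an illustrative shape only, assuming the tracker evaluates one batch of peer delta hashes per forge round; the real class is described here only by its behavior:

```typescript
// Sketch: per-tx delta agreement with a promotion threshold and a round TTL.
// Defaults mirror the description above: 7/10 agreement, 5-round TTL.
type Classification = "TO_APPROVE" | "PRE_APPROVED" | "PROBLEMATIC"

class DeltaAgreementTracker {
  constructor(
    private threshold = 7,   // matching delta hashes needed to promote
    private ttlRounds = 5,   // rounds before an unresolved tx is flagged
  ) {}

  private rounds = new Map<string, number>() // txHash -> rounds observed

  evaluate(txHash: string, peerDeltaHashes: string[]): Classification {
    // Count the most common delta hash reported this round
    const counts = new Map<string, number>()
    for (const h of peerDeltaHashes) counts.set(h, (counts.get(h) ?? 0) + 1)
    const best = Math.max(0, ...counts.values())

    if (best >= this.threshold) {
      this.rounds.delete(txHash)
      return "PRE_APPROVED" // promoted: enough members agree on the delta
    }
    const seen = (this.rounds.get(txHash) ?? 0) + 1
    this.rounds.set(txHash, seen)
    if (seen >= this.ttlRounds) {
      this.rounds.delete(txHash)
      return "PROBLEMATIC" // TTL expired without agreement
    }
    return "TO_APPROVE" // keep waiting in the next forge round
  }
}
```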
Add Continuous Forge cyclic flow diagram, DeltaAgreementTracker detail, forgeInstance/RPC bridge, and complete round data flow. All P0/P1/P2 components shown with phase annotations.
- PetriBlockCompiler: compiles PRE_APPROVED txs into blocks at 10s boundary using existing orderTransactions() + createBlock() infrastructure
- BFTArbitrator: one BFT round per PROBLEMATIC tx — resolve or reject (chain never stalls, rejection is fail-safe)
- PetriBlockFinalizer: broadcastBlockHash() + isBlockValid() threshold + insertBlock() + BroadcastManager propagation
- petriConsensusRoutine() full lifecycle: forge → pause → arbitrate → compile → finalize → reset → resume
- Consensus dispatch switching in mainLoop.ts and manageConsensusRoutines.ts (petriConsensus flag gates which routine is called)
- 21 new tests (BFT threshold, result structures, lifecycle, dispatch)
- All 71 Petri tests passing
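The isBlockValid() threshold follows the standard BFT supermajority rule, which is also the shape the test helpers in the review below use (bftThreshold, isBlockValid); a sketch:

```typescript
// BFT supermajority: strictly more than two-thirds of members must agree.
// For n members tolerating f < n/3 Byzantine faults, the threshold is
// floor(2n/3) + 1 honest matching votes.
function bftThreshold(totalMembers: number): number {
  return Math.floor((totalMembers * 2) / 3) + 1
}

function isBlockValid(pro: number, totalVotes: number): boolean {
  return pro >= bftThreshold(totalVotes)
}
```

For a 10-member shard this yields a threshold of 7, which lines up with the 7/10 agreement default used by the delta tracker.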
- ShardMapper: getShardForAddress() — single-shard testnet (always 'default'), interface ready for multi-shard expansion
- PetriRouter: selectMembers() uses Alea PRNG seeded with tx hash for deterministic routing to exactly 2 shard members
- PetriRouter.relay(): sends ValidityData to selected members via existing nodeCall/RELAY_TX RPC pattern (same as DTR but targeted)
- endpointExecution.ts: when petriConsensus flag is on, routes via PetriRouter.relay() instead of DTR (early return before validator check)
- 16 new tests (ShardMapper, selectMembers determinism/count/uniqueness, routing flag gating, response shapes)
- All 87 Petri tests passing
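Deterministic routing works because every node derives the same pseudo-random selection from the tx hash. In this sketch a mulberry32 PRNG stands in for the Alea PRNG, and seedFromHash is a hypothetical helper; only the shape of selectMembers() (hash-seeded, exactly 2 distinct members) mirrors the description above:

```typescript
// Derive a 32-bit seed from the tx hash (illustrative helper)
function seedFromHash(txHash: string): number {
  let h = 0
  for (const c of txHash) h = (h * 31 + c.charCodeAt(0)) >>> 0
  return h
}

// mulberry32: tiny deterministic PRNG, stands in for Alea here
function mulberry32(seed: number): () => number {
  let a = seed >>> 0
  return () => {
    a = (a + 0x6d2b79f5) >>> 0
    let t = Math.imul(a ^ (a >>> 15), a | 1)
    t = (t + Math.imul(t ^ (t >>> 7), t | 61)) ^ t
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Pick `count` distinct members via a partial Fisher–Yates draw.
// Same shard list + same tx hash -> same selection on every node.
function selectMembers<T>(shard: T[], txHash: string, count = 2): T[] {
  const rand = mulberry32(seedFromHash(txHash))
  const pool = [...shard]
  const picked: T[] = []
  for (let i = 0; i < count && pool.length > 0; i++) {
    const idx = Math.floor(rand() * pool.length)
    picked.push(pool.splice(idx, 1)[0])
  }
  return picked
}
```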
- BFTArbitrator: document roundNumber: -1 sentinel reuse of petri_exchangeDeltas
- endpointExecution: flag Petri routing early-return mempool flow for P6 verification
- Add soft_finality_at column to MempoolTx and Transactions entities
- Set soft_finality_at in updateClassification() on PRE_APPROVED
- Add getTransactionFinality() function (chain-first, then mempool)
- Add getTransactionFinality RPC endpoint in rpcDispatch
- Re-export Phase 5 components from petri/index.ts
- Add finality.test.ts (15 tests covering structure, transitions, RPC shape)
Only set soft_finality_at on first PRE_APPROVED classification, not on subsequent updates. Prevents timestamp overwrite if a tx is re-classified.
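A minimal sketch of the set-once rule; entity and field names are illustrative, and the real update happens inside updateClassification() against the MempoolTx row:

```typescript
// Sketch: soft_finality_at is written only on the first transition to
// PRE_APPROVED; later re-classifications keep the original timestamp.
interface MempoolRow {
  classification: "TO_APPROVE" | "PRE_APPROVED" | "PROBLEMATIC"
  soft_finality_at: number | null
}

function updateClassification(
  row: MempoolRow,
  next: MempoolRow["classification"],
): MempoolRow {
  // Guard: only the first promotion stamps the soft-finality time
  const firstPromotion = next === "PRE_APPROVED" && row.soft_finality_at === null
  return {
    classification: next,
    soft_finality_at: firstPromotion ? Date.now() : row.soft_finality_at,
  }
}
```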
- happyPath.test.ts: full lifecycle (classify → agree → compile → finalize)
- conflictPath.test.ts: double-spend → PROBLEMATIC → BFT resolution/rejection
- byzantineFault.test.ts: Byzantine minority tolerance (f < n/3)
- liveness.test.ts: chain never stalls (empty blocks, bounded PROBLEMATIC TTL)
- featureFlagRollback.test.ts: clean ON/OFF/ON toggle with state isolation
- benchmark.test.ts: DeltaTracker throughput (5K txs), routing (10K calls), BFT O(1)
- Add getPetriForgeInstance() getter to forgeInstance.ts

84 new tests (186 total), 0 failures
- Add @deprecated JSDoc to SecretaryManager class (1018 lines)
- Add @deprecated to Secretary RPC handlers (setValidatorPhase, greenlight, getValidatorPhase, getBlockTimestamp) in manageConsensusRoutines.ts
- Add @deprecated to OmniProtocol consensus handlers (opcodes 0x35-0x38)
- Add @deprecated to ValidationPhase types in validationStatusTypes.ts
- No deletions — kept for PoRBFT v2 fallback until testnet validation
- Task #119 (flag removal) deferred to post-testnet validation
Reflexion finding: the getBlockTimestamp case block lacked its own deprecation comment — only covered indirectly by the getValidatorPhase group comment. Added direct marker for clarity.
- Add test:petri script to package.json (bun run test:petri)
- Register Petri consensus in TESTING_MAP.md coverage map
TR1: Configurable Petri params
- Add PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS, PETRI_BLOCK_INTERVAL_MS, PETRI_AGREEMENT_THRESHOLD, PETRI_PROBLEMATIC_TTL_ROUNDS, PETRI_SHARD_SIZE to envKeys.ts
- Add PetriConsensusConfig interface to config/types.ts
- Add petri defaults to config/defaults.ts
- Wire through config/loader.ts with envBool/envInt
- Load into getSharedState.petriConsensus and petriConfig in index.ts

TR2: Docker devnet wiring
- Add PETRI_CONSENSUS env var to all 4 nodes in docker-compose.yml
- Add PETRI_CONSENSUS to devnet/.env.example (default: false)
- Set PETRI_CONSENSUS=true in .env to enable Petri on devnet
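A sketch of how the TR1 loader wiring can look, assuming envBool/envInt helpers of the usual shape. Only the env var names and defaults come from the list above; the helper signatures and config field names are illustrative:

```typescript
// Sketch: load Petri config from an env record with typed fallbacks.
type Env = Record<string, string | undefined>

function envBool(env: Env, key: string, fallback: boolean): boolean {
  const v = env[key]
  return v === undefined ? fallback : v === "true" || v === "1"
}

function envInt(env: Env, key: string, fallback: number): number {
  const v = env[key]
  const n = v === undefined ? NaN : parseInt(v, 10)
  return Number.isFinite(n) ? n : fallback
}

interface PetriConsensusConfig {
  enabled: boolean
  forgeIntervalMs: number
  blockIntervalMs: number
  agreementThreshold: number
  problematicTtlRounds: number
}

// Defaults mirror the documented ones: 2s forge, 10s block,
// 7/10 agreement threshold, 5-round PROBLEMATIC TTL.
function loadPetriConfig(env: Env): PetriConsensusConfig {
  return {
    enabled: envBool(env, "PETRI_CONSENSUS", false),
    forgeIntervalMs: envInt(env, "PETRI_FORGE_INTERVAL_MS", 2000),
    blockIntervalMs: envInt(env, "PETRI_BLOCK_INTERVAL_MS", 10000),
    agreementThreshold: envInt(env, "PETRI_AGREEMENT_THRESHOLD", 7),
    problematicTtlRounds: envInt(env, "PETRI_PROBLEMATIC_TTL_ROUNDS", 5),
  }
}
```

Taking the env record as a parameter (rather than reading process.env directly) keeps the loader deterministic and trivially testable.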
Register petri_block_production and petri_tx_inclusion scenarios in loadgen main.ts so they are discoverable via run-scenario.sh and the petri suite in run-suite.ts.
Submits TX to node-1 and verifies propagation to all other nodes via getTx and getTransactionFinality, ensuring Petri consensus relay layer works across the full cluster.
Registers and collects petri_enabled, petri_forge_running, petri_forge_paused, petri_forge_round, petri_pending_tx_count, and petri_tracker_tx_count gauges for Prometheus observability.
Sends sustained load over configurable rounds, measuring TX throughput, soft/hard finality latency (p50/p95/p99), block production rate, and error rate. Outputs baseline JSON summary.
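The p50/p95/p99 summary can be computed with nearest-rank percentiles over the collected per-tx latency samples; a small sketch (function names are illustrative):

```typescript
// Nearest-rank percentile: sort samples, take the ceil(p% * n)-th value.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(0, rank - 1)]
}

// Shape of the latency section of the baseline JSON summary
function summarize(latenciesMs: number[]) {
  return {
    p50: percentile(latenciesMs, 50),
    p95: percentile(latenciesMs, 95),
    p99: percentile(latenciesMs, 99),
  }
}
```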
Actionable comments posted: 6
♻️ Duplicate comments (4)
testing/petri/secretaryCoordination.test.ts (1)
17-19: ⚠️ Potential issue | 🟠 Major

Test helper doesn't match production election logic.

This `electSecretary` returns `shard[0]`, but production (in `petriSecretary.ts`) sorts all identities including the local pubkey before selecting. Tests pass while the shipped code may elect a different secretary. Import and use the real `electSecretary` and `isWeSecretary` from the production module, or replicate the exact logic including local pubkey handling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/secretaryCoordination.test.ts` around lines 17 - 19, The test helper electSecretary in testing/petri/secretaryCoordination.test.ts is inconsistent with production; replace the stubbed logic by importing and using the real electSecretary (and isWeSecretary if needed) from the production module (petriSecretary.ts), or update the helper to precisely replicate production: include the local pubkey in the identity list, sort deterministically the full set, then select the first element as production does; update tests to call the imported electSecretary/isWeSecretary so test election outcomes match production behavior.

src/libs/consensus/petri/coordination/petriSecretary.ts (3)
142-148: ⚠️ Potential issue | 🔴 Critical

`resetCollection()` at start wipes early-arriving submissions.

`receiveBlockHashSubmission()` buffers responses as they arrive via RPC. Calling `resetCollection()` at line 147 clears that buffer, dropping any submissions received before `collectBlockHashes()` starts. This creates a race condition where early responses become false timeouts.

🔧 Proposed fix — initialize per-collection state without clearing inbox
```diff
 export async function collectBlockHashes(
   shard: Peer[],
   block: Block,
   timeoutMs = 5000,
 ): Promise<CollectionResult> {
-  resetCollection()
+  // Don't clear pendingSubmissions — keep early-arriving responses
+  // Only reset the resolve callback for this collection round
+  collectionResolve = null
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/consensus/petri/coordination/petriSecretary.ts` around lines 142 - 148, Calling resetCollection() at the start of collectBlockHashes wipes the global inbox and drops early-arriving submissions from receiveBlockHashSubmission; instead, stop clearing the shared buffer and scope collection state per-invocation: remove the resetCollection() call from collectBlockHashes and either (a) create a new per-call collection map/identifier that collectBlockHashes uses to gather responses while receiveBlockHashSubmission appends to the shared inbox, or (b) if using a shared inbox data structure, filter/consume only entries matching the current block id within collectBlockHashes rather than clearing everything; update collectBlockHashes to initialize its own local result-tracking (e.g., a local Map or collectionId) and leave resetCollection() for teardown or explicit global resets.
112-118: ⚠️ Potential issue | 🟠 Major

`electSecretary()` fallback to `shard[0]` is problematic.

When the elected secretary is the local node, `found` is `undefined` (since the local pubkey isn't in the shard peer list), and the function returns `shard[0]` — a different peer. This causes `handleSecretaryOffline()` to probe the wrong member. For a single-node shard, this returns `undefined`.

The comment says callers should use `isWeSecretary`, but the function signature promises to return a `Peer`, which is misleading.

🔧 Proposed fix
```diff
-export function electSecretary(shard: Peer[]): Peer {
+export function electSecretary(shard: Peer[]): Peer | null {
   const secretaryId = getSecretaryIdentity(shard)
   const found = shard.find(p => p.identity === secretaryId)
-  // If we are the secretary, return shard[0] as a fallback peer reference
-  // (the caller should use isWeSecretary to decide the code path)
-  return found ?? shard[0]
+  // Returns null if the secretary is the local node (not in shard list)
+  return found ?? null
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/consensus/petri/coordination/petriSecretary.ts` around lines 112 - 118, The electSecretary function currently returns shard[0] as a fallback when getSecretaryIdentity(...) resolves to the local node (found === undefined), which misleads callers and causes probing the wrong peer (see handleSecretaryOffline); change electSecretary to stop returning shard[0] and instead return an optional Peer (e.g., Peer | undefined or null) when the secretary identity is not present in shard, update the function signature and callers (including places using isWeSecretary and handleSecretaryOffline) to handle the undefined case explicitly, and ensure getSecretaryIdentity, electSecretary, and any code assuming a guaranteed Peer are adjusted to check for absence before proceeding.
175-218: ⚠️ Potential issue | 🔴 Critical

Submissions with wrong block number or invalid signatures are re-counted each iteration.

The loop at lines 177-218 only records matching valid submissions in `signatures`. Submissions with:

- Wrong block number (lines 180-186): skipped with `continue`, re-processed next iteration
- Mismatched hash (lines 210-216): increments `mismatchCount` but not marked processed
- Invalid signature (lines 204-208): increments `mismatchCount` but not marked processed

This causes `mismatchCount` to inflate on each poll iteration, potentially exceeding `totalMembers` and making the threshold calculation incorrect.

🔧 Proposed fix — track all processed senders
```diff
   const signatures: Record<string, string> = {
     [ourPubkey]: uint8ArrayToHex(ourSignature.signature),
   }
+  const processed = new Set<string>([ourPubkey])
   let matchCount = 1 // counting ourselves
   let mismatchCount = 0
   // ...
   while (Date.now() < deadline && matchCount < totalMembers) {
     for (const [pubkey, submission] of pendingSubmissions) {
-      if (signatures[pubkey]) continue // Already processed
+      if (processed.has(pubkey)) continue // Already processed
       if (submission.blockNumber !== block.number) {
         log.warn(/* ... */)
+        processed.add(pubkey)
+        mismatchCount++
         continue
       }
       if (submission.blockHash === expectedHash) {
         // Verify signature...
         if (isValid) {
           signatures[pubkey] = submission.signature
+          processed.add(pubkey)
           matchCount++
         } else {
+          processed.add(pubkey)
           mismatchCount++
         }
       } else {
+        processed.add(pubkey)
         mismatchCount++
       }
     }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/consensus/petri/coordination/petriSecretary.ts` around lines 175 - 218, The loop in petriSecretary.ts re-processes the same pubkeys across iterations because only valid matches are recorded in signatures, causing mismatchCount to inflate; fix by marking a submission as processed regardless of outcome: introduce a processedSenders Set (or remove entries from pendingSubmissions) and, inside the pendingSubmissions for-loop, add pubkey to processedSenders (or delete from pendingSubmissions) when you skip for wrong blockNumber, on hash mismatch, or on invalid signature so each sender increments mismatchCount at most once; keep signatures mapping only for valid signatures and use processedSenders when deciding whether to skip already-handled senders.
🧹 Nitpick comments (15)
testing/petri/classifier.test.ts (1)
16-40: Unused `mockTx` helper function.

The `mockTx` function is defined but never used in any of the tests. All tests call `classifyFromEdits` directly with edit arrays, bypassing this helper. Consider removing it to reduce dead code, or add tests that use the full transaction mock.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/classifier.test.ts` around lines 16 - 40, The helper function mockTx is dead code and not used by any tests that call classifyFromEdits; either delete the mockTx function declaration to remove the unused helper, or update tests that currently construct edit arrays directly to instead create a full transaction via mockTx(...) and pass its content.gcr_edits (or equivalent) into classifyFromEdits so tests exercise the full transaction shape; locate the mockTx function and adjust or remove it and update any tests referencing edit arrays to use mockTx when opting to keep the helper.

src/libs/consensus/petri/coordination/petriSecretary.ts (1)
55-56: Module-level mutable state creates test isolation challenges.

`pendingSubmissions` and `collectionResolve` are module-level variables. This works for runtime but makes unit testing difficult — tests must call `resetCollection()` in setup to avoid cross-test pollution. Consider exposing a test-only reset or using a class-based approach.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/consensus/petri/coordination/petriSecretary.ts` around lines 55 - 56, The module-level mutable variables pendingSubmissions and collectionResolve cause test pollution; encapsulate that state so tests don't need global cleanup. Either (A) move pendingSubmissions and collectionResolve into an instance of a PetriSecretary class (make existing exported functions become methods on that class and update callers to use a singleton or new instance), or (B) keep the module functions but add and export a test-only resetCollection() that clears pendingSubmissions (pendingSubmissions.clear()) and sets collectionResolve = null; reference the symbols pendingSubmissions, collectionResolve, and resetCollection in your change so tests can call the reset or callers can use the new class instance.

testing/petri/featureFlagRollback.test.ts (1)
34-62: Dispatch tests validate trivial logic rather than actual routing.

These tests verify that `petriConsensus ? "petri" : "porbft"` works correctly, but don't test the actual `getSharedState.petriConsensus` integration or the real dispatch paths in the codebase. Consider adding integration tests that mock `getSharedState` and verify actual routing behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/featureFlagRollback.test.ts` around lines 34 - 62, Current tests only assert the ternary expression; update them to mock the actual shared state and exercise the real dispatch logic: import and spy/mock getSharedState to return petriConsensus true/false (and sequences), then call the real routing/dispatch function (the module that consumes getSharedState—e.g., the dispatcher or router entrypoint used in production) and assert it chose the "petri" or "porbft" path; also add an integration-style test that toggles the mocked getSharedState values over time and verifies the dispatcher resolves to the current value. Ensure you reference and spy on getSharedState and invoke the actual dispatch function used in your codebase rather than asserting the ternary expression directly.

petri/03-secretary-coordinated-signing.md (1)
13-28: Add language identifiers to fenced code blocks.

Per markdownlint, fenced code blocks should specify a language for proper syntax highlighting:

- Line 13: Add `text` or `plaintext`
- Line 123: Add `typescript` or `text`
- Line 180: Add `text` or `plaintext`

Also applies to: 123-140, 180-192
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@petri/03-secretary-coordinated-signing.md` around lines 13 - 28, The fenced code blocks in the document (the block starting with "All 10 shard members compile block independently (deterministic)" and the other fenced blocks around lines 123-140 and 180-192) lack language identifiers; update each triple-backtick fence to include an appropriate language tag such as text or plaintext (or typescript for the code-like block at ~123 if relevant) so markdownlint passes and syntax highlighting is applied consistently.

testing/petri/happyPath.test.ts (2)
18-37: Extract shared test helpers to a common module.

`mockPeers`, `bftThreshold`, and `isBlockValid` are duplicated across multiple test files (happyPath.test.ts, blockCompiler.test.ts, routing.test.ts, conflictPath.test.ts, liveness.test.ts, byzantineFault.test.ts).

Consider creating a shared test utilities file:

```typescript
// testing/petri/helpers.ts
export function mockPeers(count: number): any[] { ... }
export function bftThreshold(totalMembers: number): number { ... }
export function isBlockValid(pro: number, totalVotes: number): boolean { ... }
```

This would reduce duplication and ensure consistent behavior across all test suites.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/happyPath.test.ts` around lines 18 - 37, Extract the duplicated helpers into a single module and update tests to import them: create testing/petri/helpers.ts that exports mockPeers, bftThreshold, and isBlockValid (copy the existing implementations from happyPath.test.ts), then replace the local declarations in happyPath.test.ts, blockCompiler.test.ts, routing.test.ts, conflictPath.test.ts, liveness.test.ts, and byzantineFault.test.ts with imports from that module (import { mockPeers, bftThreshold, isBlockValid } from "testing/petri/helpers"), ensuring the exported signatures and behavior remain identical to the originals (including mockPeers return shape and bftThreshold logic).
196-256: Duplicate coverage with continuousForge.test.ts.

The "Forge State Machine" tests here overlap significantly with the dedicated `continuousForge.test.ts` suite. Consider either:

- Removing these tests and relying on `continuousForge.test.ts` for forge lifecycle coverage
- Keeping only the happy-path-specific integration aspects here

This would reduce test maintenance burden and clarify test ownership.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/happyPath.test.ts` around lines 196 - 256, These tests duplicate lifecycle coverage from continuousForge.test.ts; either delete the overlapping "Happy Path — Forge State Machine" test block or trim it to only keep higher-level integration assertions (e.g., only the end-to-end happy path flows that exercise ContinuousForge.start/shard interaction and getState transitions) while removing unit-like cases already covered (reset, double-start, and lifecycle sub-assertions). Locate the tests referencing ContinuousForge, mockPeers, and methods start/pause/resume/reset/stop in testing/petri/happyPath.test.ts and remove or reduce them to a single integration test that asserts state transitions for a real shard scenario, leaving detailed lifecycle specs in continuousForge.test.ts.

testing/petri/blockCompiler.test.ts (3)
200-223: Lifecycle state sequence test has limited value.

This test constructs an array of strings and then asserts it equals the same hardcoded array. It validates the expected sequence as documentation but doesn't exercise any real code.

Consider either:

- Importing and testing the actual `runBlockPeriod()` implementation with mocks
- Moving this to inline documentation in the production code
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/blockCompiler.test.ts` around lines 200 - 223, The test currently only asserts a hardcoded array and doesn't exercise behavior; either replace it with an integration/unit test that invokes runBlockPeriod() and verifies the actual pause/compile/finalize/reset/resume lifecycle using mocks/stubs for pause(), compileBlock(), finalizeBlock(), reset(), and resume() (asserting the sequence and side-effects), or remove the test and move the sequence example into inline documentation near runBlockPeriod() as a commented example; update the test named "forge pause/resume pattern for block compilation" in testing/petri/blockCompiler.test.ts accordingly and ensure mocks verify call order on the real functions rather than comparing two identical arrays.
173-195: Consider removing trivial dispatch tests.

These tests only verify that a JavaScript ternary operator works correctly (`flag ? "petri" : "porbft"`). They don't actually test any production code paths or integration with the real dispatch logic.

If the intent is to document the expected dispatch behavior, consider importing and testing the actual dispatch function, or convert these to documentation comments instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/blockCompiler.test.ts` around lines 173 - 195, The tests under "Consensus dispatch switching" are trivially exercising a JavaScript ternary rather than production logic; remove or replace them with real integration/unit tests that call the actual dispatch code (e.g., import and invoke the dispatch function used in mainLoop.ts) instead of local ternary checks in the tests "petriConsensus flag gates dispatch" and "dispatch function selection is deterministic"; if keeping a lightweight check for documentation, convert the two tests into a short comment explaining expected behavior and add one real test that imports the dispatch implementation and asserts that passing petriConsensus=true selects the "petri" handler and false selects the "porbft" handler.
83-169: Result structure tests use inline objects instead of actual types.

These tests construct plain objects matching expected shapes but don't import `CompilationResult`, `ArbitrationResult`, or `FinalizationResult` types from the production code. This means type changes in production won't cause test failures.

Consider importing and using the actual types:

```typescript
import type {
  CompilationResult,
  ArbitrationResult,
  FinalizationResult,
} from "@/libs/consensus/petri/block/types"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/blockCompiler.test.ts` around lines 83 - 169, Import the actual types (CompilationResult, ArbitrationResult, FinalizationResult) and annotate the test objects with those types instead of using plain inline objects: add "import type { CompilationResult, ArbitrationResult, FinalizationResult } from '@/libs/consensus/petri/block/types'" at the top, then change each test's result declaration to use explicit typing (e.g., "const result: CompilationResult = { ... }", "const result: ArbitrationResult = { ... }", "const result: FinalizationResult = { ... }") so the tests will fail if the production types change; update the three describe blocks named "CompilationResult structure", "ArbitrationResult structure", and "FinalizationResult structure" accordingly.

testing/petri/conflictPath.test.ts (1)
273-275: Move helper function definition to top of file.

`isBlockValid` is defined at the end of the file but used earlier (line 242). While function hoisting makes this work, it's cleaner and more readable to define helpers near the top with `bftThreshold`:

♻️ Suggested reordering
```diff
 // ---- Helpers ----
 function bftThreshold(totalMembers: number): number {
   return Math.floor((totalMembers * 2) / 3) + 1
 }
+
+function isBlockValid(pro: number, total: number): boolean {
+  return pro >= bftThreshold(total)
+}

 // ---- Conflict Path: Delta Disagreement → PROBLEMATIC ----
 ...
-
-function isBlockValid(pro: number, total: number): boolean {
-  return pro >= bftThreshold(total)
-}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/conflictPath.test.ts` around lines 273 - 275, Move the helper function isBlockValid so it's declared near the top of the file alongside bftThreshold before any tests that call it; update the file by cutting the isBlockValid(pro: number, total: number): boolean { return pro >= bftThreshold(total) } definition from the end and pasting it above the tests (near bftThreshold) so readers see the helper before it's used.

testing/petri/finality.test.ts (1)
200-215: Simplify classification timestamp logic test.

The loop with nested conditionals is harder to read than necessary. Consider a more direct approach:

♻️ Suggested simplification
```diff
 test("only PRE_APPROVED classification triggers soft_finality_at", () => {
-  const classifications = [
-    "TO_APPROVE",
-    "PROBLEMATIC",
-    "PRE_APPROVED",
-  ]
-
-  for (const cls of classifications) {
-    const shouldSetTimestamp = cls === "PRE_APPROVED"
-    if (cls === "PRE_APPROVED") {
-      expect(shouldSetTimestamp).toBe(true)
-    } else {
-      expect(shouldSetTimestamp).toBe(false)
-    }
-  }
+  const shouldSet = (cls: string) => cls === "PRE_APPROVED"
+
+  expect(shouldSet("TO_APPROVE")).toBe(false)
+  expect(shouldSet("PROBLEMATIC")).toBe(false)
+  expect(shouldSet("PRE_APPROVED")).toBe(true)
 })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/finality.test.ts` around lines 200 - 215, The test "only PRE_APPROVED classification triggers soft_finality_at" currently uses an unnecessary inner if; simplify by asserting the boolean expression directly for each classification. Replace the loop body so that for each cls in classifications you call expect(cls === "PRE_APPROVED").toBe(expectedValue) (or build an expected array like [false, false, true] and assert classifications.map(c => c === "PRE_APPROVED") equals that array). Update the test function (the test case with that name) to use this direct assertion pattern rather than the nested conditional.

testing/petri/routing.test.ts (1)
61-78: Probabilistic test assertion is reasonable but could be more robust.

The test checks that "at least some should differ" with `diffCount > 5` out of 20 pairs. This is reasonable for a PRNG-based selection, but the threshold is somewhat arbitrary.

Consider documenting the expected collision probability or using a higher iteration count for more statistical confidence:

```typescript
// With 10 peers and 2 selections, the probability of two random selections
// being identical is roughly 1/45 (C(10,2)). Over 20 pairs, we expect ~19-20
// to differ. Threshold of 5 gives us very high confidence.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/routing.test.ts` around lines 61 - 78, The probabilistic assertion in the test using selectMembers and mockPeers is weak; increase confidence by running more trials and tightening the threshold or documenting the expected collision probability—e.g., run 100 iterations by looping i from 0..99, compute diffCount as you do now, and assert a higher minimum (such as expect(diffCount).toBeGreaterThan(80)); additionally add a brief comment near the test explaining the expected collision probability given 10 peers and 2 selections so the threshold is understandable.

testing/petri/benchmark.test.ts (1)
24-29: Consider adding proper type for mock peers.

The `any[]` return type loses type safety. Consider defining an interface that matches what the actual peer structure expects to catch type mismatches early.

```diff
+interface MockPeer {
+  identity: string
+  connection: { string: string }
+}
+
-function mockPeers(count: number): any[] {
+function mockPeers(count: number): MockPeer[] {
   return Array.from({ length: count }, (_, i) => ({
     identity: `peer_${String(i).padStart(3, "0")}`,
     connection: { string: `localhost:${3000 + i}` },
   }))
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/petri/benchmark.test.ts` around lines 24 - 29, The mockPeers function currently returns any[] losing type safety; define a proper interface (e.g., Peer or PeerInfo) matching the shape used by tests — at least identity: string and connection: { string: string } — or import the existing peer type from the codebase, then change function mockPeers(count: number): any[] to mockPeers(count: number): Peer[] (or the imported type) and update any references so the tests use the strongly typed peer objects.

petri/01-implementation-plan.md (1)
60-60: Minor: Add period after "etc". In American English, abbreviations like "etc." require a period.
```diff
-| Chain (blocks) | `src/libs/blockchain/chainBlocks.ts` | `insertBlock()`, `getBlockByNumber()` etc |
+| Chain (blocks) | `src/libs/blockchain/chainBlocks.ts` | `insertBlock()`, `getBlockByNumber()`, etc. |
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@petri/01-implementation-plan.md` at line 60, Minor punctuation fix: update the table entry for "Chain (blocks)" in petri/01-implementation-plan.md so the abbreviation "etc" is written as "etc." (add the missing period) — specifically edit the cell showing `src/libs/blockchain/chainBlocks.ts` and the function list (`insertBlock()`, `getBlockByNumber()` etc) to read `insertBlock()`, `getBlockByNumber()`, etc.; ensuring the period follows "etc.".

petri/VADEMECUM.md (1)
32-34: Add language specifiers to fenced code blocks. Several fenced code blocks lack language specifiers. Adding them improves syntax highlighting and satisfies markdown linting rules.
✏️ Suggested fixes
For file path listings (lines 32-34, 149-164, etc.), use `text` or leave empty but consistent:

````diff
-```
+```text
 testing/petri/
````

For command examples (lines 65-73, 82-94), already correctly use `bash`. For status format templates (lines 97-101), use `text`:

````diff
-```
+```text
 [PHASE X] Starting: <phase name>
````

Also applies to: 97-101, 149-164
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@petri/VADEMECUM.md` around lines 32 - 34, Several fenced code blocks (e.g., the file-path listing "testing/petri/" and the status-template "[PHASE X] Starting: <phase name>") are missing language specifiers; update those triple-backtick blocks to include appropriate languages (use "text" for file listings and status templates) while keeping existing "bash" specifiers for command examples, so all fenced blocks have explicit language markers and satisfy markdown linting.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@petri/03-secretary-coordinated-signing.md`:
- Line 54: The doc declares petri_submitBlockHash returns { status: "collected"
| "mismatch" | "error" } but receiveBlockHashSubmission in petriSecretary.ts
currently always returns { status: "collected" }; either implement the missing
branches or align the doc: update receiveBlockHashSubmission to detect and
return "mismatch" when the submitted hash disagrees with the expected/previously
collected hash and return "error" on exceptional conditions (e.g., validation
failure or internal exception), ensuring these branches emit the correct status
string, or instead update petri/03-secretary-coordinated-signing.md to only list
{ status: "collected" } if you choose not to implement mismatch/error; reference
petri_submitBlockHash and receiveBlockHashSubmission when making the change.
In `@testing/petri/benchmark.test.ts`:
- Around line 268-286: The test "forge getCurrentDeltas returns copy (no
reference leak)" currently calls forge.start(...) and then forge.stop() but may
leak timers if an assertion throws; wrap the test body so that the
ContinuousForge instance created from ContinuousForge
(require("@/libs/consensus/petri/forge/continuousForge").ContinuousForge) is
always cleaned up by using a try/finally where forge.start(mockPeers(3)) is
before the try and forge.stop() is invoked in the finally block to ensure
forge.stop() runs even when expect(...) assertions fail.
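The try/finally shape being requested can be illustrated with a stub in place of the real `ContinuousForge` (the `StubForge` class and `runLeakFreeTest` helper below are hypothetical):

```typescript
// Sketch of the suggested cleanup pattern: stop() must run even when an
// assertion in the test body throws, so no timers leak across tests.
class StubForge {
    running = false
    start(_peers: unknown[]): void { this.running = true }
    stop(): void { this.running = false }
    getCurrentDeltas(): Record<string, string> { return {} }
}

function runLeakFreeTest(forge: StubForge): void {
    forge.start([])
    try {
        // Assertions may throw here...
        const deltas = forge.getCurrentDeltas()
        if (typeof deltas !== "object") throw new Error("unexpected shape")
    } finally {
        // ...but stop() always runs, so nothing is left ticking.
        forge.stop()
    }
}
```

The same structure applies verbatim in the real test: `forge.start(mockPeers(3))` before the `try`, assertions inside, `forge.stop()` in the `finally`.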
In `@testing/petri/byzantineFault.test.ts`:
- Around line 29-30: Remove the redundant explicit type annotations on the two
test variables: change the declarations of ttlRounds: number = 5 and
currentRound: number = 1 to rely on type inference (ttlRounds = 5 and
currentRound = 1) so ESLint no longer flags trivially inferred types; update the
declarations where ttlRounds and currentRound are defined in
byzantineFault.test.ts accordingly.
In `@testing/petri/finality.test.ts`:
- Around line 112-125: The test currently only assigns softFinalityAt locally
and doesn't verify idempotency; change it to call the real updateClassification
twice and assert the stored timestamp stays the same: make the test async,
create or use a txHash, call updateClassification(txHash, "PRE_APPROVED"), read
the stored value via getTransactionFinality(txHash) (or the equivalent getter),
call updateClassification(txHash, "PRE_APPROVED") again, read the value a second
time, and assert that first.softFinalityAt === second.softFinalityAt to ensure
the timestamp is not overwritten.
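A minimal in-memory sketch of the idempotency contract under test (the `updateClassification` and `getTransactionFinality` below are simplified stand-ins for the real implementations):

```typescript
// Second PRE_APPROVED call must not overwrite the stored softFinalityAt.
interface FinalityRow { classification: string; softFinalityAt?: number }

const store = new Map<string, FinalityRow>()

function updateClassification(txHash: string, classification: string): void {
    const row = store.get(txHash) ?? { classification }
    row.classification = classification
    if (classification === "PRE_APPROVED" && row.softFinalityAt === undefined) {
        row.softFinalityAt = Date.now() // set once, never overwritten
    }
    store.set(txHash, row)
}

function getTransactionFinality(txHash: string): FinalityRow | undefined {
    return store.get(txHash)
}
```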
In `@testing/petri/happyPath.test.ts`:
- Line 11: Remove the unused import "beforeEach" from the import statement that
currently imports describe, expect, test, mock, beforeEach (e.g., the import
line referencing bun:test) so the file only imports the symbols actually used
(describe, expect, test, mock); update the import to omit beforeEach to avoid
the unused-import lint/error.
In `@testing/petri/speculativeExecutor.test.ts`:
- Around line 12-32: The test helper computeDeltaHash should mirror production
fallback behavior: when mapping gcrEdits to editsForHashing use e.operation ??
"" and e.account ?? "" for operation and account (keep the existing
bigint/string handling for amount), then pass that through canonicalJson and
Hashing.sha256 as before so the test helper matches speculativeExecutor.ts
behavior exactly.
---
Duplicate comments:
In `@src/libs/consensus/petri/coordination/petriSecretary.ts`:
- Around line 142-148: Calling resetCollection() at the start of
collectBlockHashes wipes the global inbox and drops early-arriving submissions
from receiveBlockHashSubmission; instead, stop clearing the shared buffer and
scope collection state per-invocation: remove the resetCollection() call from
collectBlockHashes and either (a) create a new per-call collection
map/identifier that collectBlockHashes uses to gather responses while
receiveBlockHashSubmission appends to the shared inbox, or (b) if using a shared
inbox data structure, filter/consume only entries matching the current block id
within collectBlockHashes rather than clearing everything; update
collectBlockHashes to initialize its own local result-tracking (e.g., a local
Map or collectionId) and leave resetCollection() for teardown or explicit global
resets.
- Around line 112-118: The electSecretary function currently returns shard[0] as
a fallback when getSecretaryIdentity(...) resolves to the local node (found ===
undefined), which misleads callers and causes probing the wrong peer (see
handleSecretaryOffline); change electSecretary to stop returning shard[0] and
instead return an optional Peer (e.g., Peer | undefined or null) when the
secretary identity is not present in shard, update the function signature and
callers (including places using isWeSecretary and handleSecretaryOffline) to
handle the undefined case explicitly, and ensure getSecretaryIdentity,
electSecretary, and any code assuming a guaranteed Peer are adjusted to check
for absence before proceeding.
- Around line 175-218: The loop in petriSecretary.ts re-processes the same
pubkeys across iterations because only valid matches are recorded in signatures,
causing mismatchCount to inflate; fix by marking a submission as processed
regardless of outcome: introduce a processedSenders Set (or remove entries from
pendingSubmissions) and, inside the pendingSubmissions for-loop, add pubkey to
processedSenders (or delete from pendingSubmissions) when you skip for wrong
blockNumber, on hash mismatch, or on invalid signature so each sender increments
mismatchCount at most once; keep signatures mapping only for valid signatures
and use processedSenders when deciding whether to skip already-handled senders.
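The once-per-sender counting fix can be sketched as a single pass (the real loop retries across iterations; the types and names below are illustrative, not the actual petriSecretary.ts shapes):

```typescript
// Every submission is marked processed regardless of outcome, so a sender
// can increment mismatchCount at most once.
interface Submission { blockNumber: number; hash: string; signatureValid: boolean }

function tally(
    pending: Map<string, Submission>,
    expectedBlock: number,
    expectedHash: string,
): { signatures: Map<string, string>; mismatchCount: number } {
    const signatures = new Map<string, string>()
    const processedSenders = new Set<string>()
    let mismatchCount = 0

    for (const [pubkey, sub] of pending) {
        if (processedSenders.has(pubkey)) continue
        processedSenders.add(pubkey) // mark processed on every path
        if (
            sub.blockNumber !== expectedBlock ||
            sub.hash !== expectedHash ||
            !sub.signatureValid
        ) {
            mismatchCount++
            continue
        }
        signatures.set(pubkey, sub.hash) // only valid signatures recorded
    }
    return { signatures, mismatchCount }
}
```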
In `@testing/petri/secretaryCoordination.test.ts`:
- Around line 17-19: The test helper electSecretary in
testing/petri/secretaryCoordination.test.ts is inconsistent with production;
replace the stubbed logic by importing and using the real electSecretary (and
isWeSecretary if needed) from the production module (petriSecretary.ts), or
update the helper to precisely replicate production: include the local pubkey in
the identity list, sort deterministically the full set, then select the first
element as production does; update tests to call the imported
electSecretary/isWeSecretary so test election outcomes match production
behavior.
---
Nitpick comments:
In `@petri/01-implementation-plan.md`:
- Line 60: Minor punctuation fix: update the table entry for "Chain (blocks)" in
petri/01-implementation-plan.md so the abbreviation "etc" is written as "etc."
(add the missing period) — specifically edit the cell showing
`src/libs/blockchain/chainBlocks.ts` and the function list (`insertBlock()`,
`getBlockByNumber()` etc) to read `insertBlock()`, `getBlockByNumber()` etc.;
ensuring the period follows "etc.".
In `@petri/03-secretary-coordinated-signing.md`:
- Around line 13-28: The fenced code blocks in the document (the block starting
with "All 10 shard members compile block independently (deterministic)" and the
other fenced blocks around lines 123-140 and 180-192) lack language identifiers;
update each triple-backtick fence to include an appropriate language tag such as
text or plaintext (or typescript for the code-like block at ~123 if relevant) so
markdownlint passes and syntax highlighting is applied consistently.
In `@petri/VADEMECUM.md`:
- Around line 32-34: Several fenced code blocks (e.g., the file-path listing
"testing/petri/" and the status-template "[PHASE X] Starting: <phase name>") are
missing language specifiers; update those triple-backtick blocks to include
appropriate languages (use "text" for file listings and status templates) while
keeping existing "bash" specifiers for command examples, so all fenced blocks
have explicit language markers and satisfy markdown linting.
In `@src/libs/consensus/petri/coordination/petriSecretary.ts`:
- Around line 55-56: The module-level mutable variables pendingSubmissions and
collectionResolve cause test pollution; encapsulate that state so tests don't
need global cleanup. Either (A) move pendingSubmissions and collectionResolve
into an instance of a PetriSecretary class (make existing exported functions
become methods on that class and update callers to use a singleton or new
instance), or (B) keep the module functions but add and export a test-only
resetCollection() that clears pendingSubmissions (pendingSubmissions.clear())
and sets collectionResolve = null; reference the symbols pendingSubmissions,
collectionResolve, and resetCollection in your change so tests can call the
reset or callers can use the new class instance.
In `@testing/petri/benchmark.test.ts`:
- Around line 24-29: The mockPeers function currently returns any[] losing type
safety; define a proper interface (e.g., Peer or PeerInfo) matching the shape
used by tests — at least identity: string and connection: { string: string } —
or import the existing peer type from the codebase, then change function
mockPeers(count: number): any[] to mockPeers(count: number): Peer[] (or the
imported type) and update any references so the tests use the strongly typed
peer objects.
In `@testing/petri/blockCompiler.test.ts`:
- Around line 200-223: The test currently only asserts a hardcoded array and
doesn't exercise behavior; either replace it with an integration/unit test that
invokes runBlockPeriod() and verifies the actual
pause/compile/finalize/reset/resume lifecycle using mocks/stubs for pause(),
compileBlock(), finalizeBlock(), reset(), and resume() (asserting the sequence
and side-effects), or remove the test and move the sequence example into inline
documentation near runBlockPeriod() as a commented example; update the test
named "forge pause/resume pattern for block compilation" in
testing/petri/blockCompiler.test.ts accordingly and ensure mocks verify call
order on the real functions rather than comparing two identical arrays.
- Around line 173-195: The tests under "Consensus dispatch switching" are
trivially exercising a JavaScript ternary rather than production logic; remove
or replace them with real integration/unit tests that call the actual dispatch
code (e.g., import and invoke the dispatch function used in mainLoop.ts) instead
of local ternary checks in the tests "petriConsensus flag gates dispatch" and
"dispatch function selection is deterministic"; if keeping a lightweight check
for documentation, convert the two tests into a short comment explaining
expected behavior and add one real test that imports the dispatch implementation
and asserts that passing petriConsensus=true selects the "petri" handler and
false selects the "porbft" handler.
- Around line 83-169: Import the actual types (CompilationResult,
ArbitrationResult, FinalizationResult) and annotate the test objects with those
types instead of using plain inline objects: add "import type {
CompilationResult, ArbitrationResult, FinalizationResult } from
'@/libs/consensus/petri/block/types'" at the top, then change each test's result
declaration to use explicit typing (e.g., "const result: CompilationResult = {
... }", "const result: ArbitrationResult = { ... }", "const result:
FinalizationResult = { ... }") so the tests will fail if the production types
change; update the three describe blocks named "CompilationResult structure",
"ArbitrationResult structure", and "FinalizationResult structure" accordingly.
In `@testing/petri/classifier.test.ts`:
- Around line 16-40: The helper function mockTx is dead code and not used by any
tests that call classifyFromEdits; either delete the mockTx function declaration
to remove the unused helper, or update tests that currently construct edit
arrays directly to instead create a full transaction via mockTx(...) and pass
its content.gcr_edits (or equivalent) into classifyFromEdits so tests exercise
the full transaction shape; locate the mockTx function and adjust or remove it
and update any tests referencing edit arrays to use mockTx when opting to keep
the helper.
In `@testing/petri/conflictPath.test.ts`:
- Around line 273-275: Move the helper function isBlockValid so it's declared
near the top of the file alongside bftThreshold before any tests that call it;
update the file by cutting the isBlockValid(pro: number, total: number): boolean
{ return pro >= bftThreshold(total) } definition from the end and pasting it
above the tests (near bftThreshold) so readers see the helper before it's used.
In `@testing/petri/featureFlagRollback.test.ts`:
- Around line 34-62: Current tests only assert the ternary expression; update
them to mock the actual shared state and exercise the real dispatch logic:
import and spy/mock getSharedState to return petriConsensus true/false (and
sequences), then call the real routing/dispatch function (the module that
consumes getSharedState—e.g., the dispatcher or router entrypoint used in
production) and assert it chose the "petri" or "porbft" path; also add an
integration-style test that toggles the mocked getSharedState values over time
and verifies the dispatcher resolves to the current value. Ensure you reference
and spy on getSharedState and invoke the actual dispatch function used in your
codebase rather than asserting the ternary expression directly.
In `@testing/petri/finality.test.ts`:
- Around line 200-215: The test "only PRE_APPROVED classification triggers
soft_finality_at" currently uses an unnecessary inner if; simplify by asserting
the boolean expression directly for each classification. Replace the loop body
so that for each cls in classifications you call expect(cls ===
"PRE_APPROVED").toBe(expectedValue) (or build an expected array like [false,
false, true] and assert classifications.map(c => c === "PRE_APPROVED") equals
that array). Update the test function (the test case with that name) to use this
direct assertion pattern rather than the nested conditional.
In `@testing/petri/happyPath.test.ts`:
- Around line 18-37: Extract the duplicated helpers into a single module and
update tests to import them: create testing/petri/helpers.ts that exports
mockPeers, bftThreshold, and isBlockValid (copy the existing implementations
from happyPath.test.ts), then replace the local declarations in
happyPath.test.ts, blockCompiler.test.ts, routing.test.ts, conflictPath.test.ts,
liveness.test.ts, and byzantineFault.test.ts with imports from that module
(import { mockPeers, bftThreshold, isBlockValid } from "testing/petri/helpers"),
ensuring the exported signatures and behavior remain identical to the originals
(including mockPeers return shape and bftThreshold logic).
- Around line 196-256: These tests duplicate lifecycle coverage from
continuousForge.test.ts; either delete the overlapping "Happy Path — Forge State
Machine" test block or trim it to only keep higher-level integration assertions
(e.g., only the end-to-end happy path flows that exercise
ContinuousForge.start/shard interaction and getState transitions) while removing
unit-like cases already covered (reset, double-start, and lifecycle
sub-assertions). Locate the tests referencing ContinuousForge, mockPeers, and
methods start/pause/resume/reset/stop in testing/petri/happyPath.test.ts and
remove or reduce them to a single integration test that asserts state
transitions for a real shard scenario, leaving detailed lifecycle specs in
continuousForge.test.ts.
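The extracted testing/petri/helpers.ts module could look like the sketch below. The `bftThreshold` shown assumes a greater-than-two-thirds supermajority rule (which yields 7 for a 10-member shard, matching the 7/10 default); the real implementations should be copied verbatim from happyPath.test.ts rather than taken from here:

```typescript
// testing/petri/helpers.ts — shared test helpers (sketch).
export interface MockPeer {
    identity: string
    connection: { string: string }
}

export function mockPeers(count: number): MockPeer[] {
    return Array.from({ length: count }, (_, i) => ({
        identity: `peer_${String(i).padStart(3, "0")}`,
        connection: { string: `localhost:${3000 + i}` },
    }))
}

// Assumed supermajority rule: smallest integer strictly above 2/3 of total.
export function bftThreshold(total: number): number {
    return Math.floor((2 * total) / 3) + 1
}

export function isBlockValid(pro: number, total: number): boolean {
    return pro >= bftThreshold(total)
}
```

Each test file then imports these instead of redeclaring them, keeping return shapes and threshold logic identical across suites.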
In `@testing/petri/routing.test.ts`:
- Around line 61-78: The probabilistic assertion in the test using selectMembers
and mockPeers is weak; increase confidence by running more trials and tightening
the threshold or documenting the expected collision probability—e.g., run 100
iterations by looping i from 0..99, compute diffCount as you do now, and assert
a higher minimum (such as expect(diffCount).toBeGreaterThan(80)); additionally
add a brief comment near the test explaining the expected collision probability
given 10 peers and 2 selections so the threshold is understandable.
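The collision math behind the strengthened routing assertion can be sketched as follows; `pickTwo` is a hypothetical stand-in for the real `selectMembers(peers, 2)`:

```typescript
// With 10 peers there are C(10,2) = 45 unordered pairs, so two independent
// selections collide with probability 1/45 (~2.2%). Over 100 trials the
// expected number of differing pairs is ~97.8, so asserting > 80 leaves a
// very wide margin while still catching a broken (constant) selector.
function pickTwo(n: number): [number, number] {
    const a = Math.floor(Math.random() * n)
    let b = Math.floor(Math.random() * (n - 1))
    if (b >= a) b++ // ensure two distinct indices
    return a < b ? [a, b] : [b, a]
}

function countDiffering(trials: number, peers: number): number {
    let diffCount = 0
    for (let i = 0; i < trials; i++) {
        const [a1, b1] = pickTwo(peers)
        const [a2, b2] = pickTwo(peers)
        if (a1 !== a2 || b1 !== b2) diffCount++
    }
    return diffCount
}
```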
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 01c9238a-e2ec-4f83-9b93-f64e3d8b3e00
📒 Files selected for processing (22)
- package.json
- petri/01-implementation-plan.md
- petri/03-secretary-coordinated-signing.md
- petri/VADEMECUM.md
- petri/architecture-diagram.md
- src/libs/consensus/petri/coordination/petriSecretary.ts
- src/libs/consensus/petri/utils/canonicalJson.ts
- testing/petri/benchmark.test.ts
- testing/petri/blockCompiler.test.ts
- testing/petri/byzantineFault.test.ts
- testing/petri/canonicalJson.test.ts
- testing/petri/classifier.test.ts
- testing/petri/conflictPath.test.ts
- testing/petri/continuousForge.test.ts
- testing/petri/deltaTracker.test.ts
- testing/petri/featureFlagRollback.test.ts
- testing/petri/finality.test.ts
- testing/petri/happyPath.test.ts
- testing/petri/liveness.test.ts
- testing/petri/routing.test.ts
- testing/petri/secretaryCoordination.test.ts
- testing/petri/speculativeExecutor.test.ts
✅ Files skipped from review due to trivial changes (2)
- src/libs/consensus/petri/utils/canonicalJson.ts
- package.json
Add PETRI_AGREEMENT_THRESHOLD, PETRI_PROBLEMATIC_TTL_ROUNDS, and PETRI_SHARD_SIZE with their defaults so operators can discover all tuning knobs. Closes mycelium #148 (Q-6).
The hash is computed over normalized edits ({type, operation, account,
amount}), not the full GCREdit. Closes mycelium #146 (CR-9).
The ok boolean required hardFinalityObserved but not softFinalityObserved, despite the scenario claiming to verify softFinalityAt. Closes mycelium #145 (CR-19).
Reject invalid Petri config values (non-positive intervals, shardSize, agreementThreshold out of range, negative TTL) at startup instead of silently breaking consensus. Closes mycelium #147 (CR-2).
blockNumber=0 (genesis) is a valid reference but was filtered out by truthy check. Change to !== undefined in both getAll() and getByClassification(). Closes mycelium #142 (Q-5).
trackedCount is a getter property, not a function — the typeof check for "function" always failed and petri_tracker_tx_count was never set. Add getTrackerCount() on ContinuousForge and use it in MetricsCollector instead of the unsafe (forge as any).tracker cast. Closes mycelium #143 (Q-4/CR-3).
A resolved TX from arbitration could still be in the mempool, causing duplicates in the block. Use a hash-keyed Map to deduplicate before ordering. Closes mycelium #144 (CR-4).
If getCommonValidatorSeed() or getShard() threw before entering petriConsensusRoutine, startingConsensus stayed true permanently, blocking all future consensus rounds. Wrap in try/finally. Closes mycelium #136 (CR-17).
Mempool cleanup used transactionEntities (all TXs) instead of committedTxHashes (only successfully inserted). Skipped TXs were silently removed from mempool despite not being persisted. Closes mycelium #138 (Q-3).
Unhandled exceptions from getTransactionFinality() propagated without a structured error response, unlike neighboring RPC cases. Closes mycelium #139 (CR-16).
If a round had no TO_APPROVE txs and returned early, stale deltas from the previous round were served via getCurrentDeltas(). Closes mycelium #141 (CR-10).
Actionable comments posted: 8
♻️ Duplicate comments (1)
src/libs/consensus/petri/forge/continuousForge.ts (1)
154-173: ⚠️ Potential issue | 🟠 Major

Reset `currentRoundDeltas` before any early return. If a round finds no TO_APPROVE txs or aborts before Line 203, `getCurrentDeltas()` keeps serving the previous round's hashes. The `petri_exchangeDeltas` handler currently reads that map without a round guard, so peers can record stale deltas against the new round.

🔧 Suggested fix
```diff
 async runForgeRound(): Promise<void> {
     this.state.currentRound++
     this.state.lastRoundStartedAt = Date.now()
     const round = this.state.currentRound
+    this.currentRoundDeltas = {}

     log.debug(`[ContinuousForge] Round ${round} starting`)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/consensus/petri/forge/continuousForge.ts` around lines 154 - 173, The issue is stale per-round deltas persisting when runForgeRound returns early; in runForgeRound (continuousForge.ts) reset/clear the currentRoundDeltas map at the start of the round (or immediately before any early return) so getCurrentDeltas() and the petri_exchangeDeltas handler never serve previous-round hashes; update the method that maintains currentRoundDeltas (referenced as currentRoundDeltas and getCurrentDeltas()) inside runForgeRound to clear it (e.g., new Map() or clear()) before checking TO_APPROVE txs or aborting.
🧹 Nitpick comments (4)
testing/loadgen/src/features/consensus/petri_tx_inclusion.ts (3)
201-206: Latency calculation may be affected by clock skew.
`softFinalityAt` and `hardFinalityAt` are server-side timestamps from the blockchain, while `txSubmittedAt` is a client-side `Date.now()`. Clock drift between client and server could produce misleading or negative latency values. For devnet testing this is likely acceptable, but worth noting if these metrics are used for performance analysis.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts` around lines 201 - 206, Latency fields softFinalityLatencyMs and hardFinalityLatencyMs use client-side txSubmittedAt which can be skewed vs server finality timestamps (finalityResult.finality.softFinalityAt/hardFinalityAt), causing misleading/negative values; update the calculation in the code that builds these fields (refer to softFinalityLatencyMs, hardFinalityLatencyMs, finalityResult, txSubmittedAt) to either (a) compute latency using a server-side submission timestamp if available (e.g., use finalityResult.submittedAt or a serverTimestamp included in finalityResult) or (b) derive a clock offset by comparing a server time value from the same response and adjust txSubmittedAt before subtraction, and as a safeguard clamp negative results to null (or 0) to avoid reporting negative latencies.
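Option (b)'s clamping safeguard could be as small as the function below (field names follow the review comment; the function itself is illustrative):

```typescript
// Report latency only when non-negative; clock-skew artifacts become null
// instead of misleading negative numbers.
function finalityLatencyMs(
    finalityAt: number | undefined,
    txSubmittedAt: number,
): number | null {
    if (finalityAt === undefined) return null
    const latency = finalityAt - txSubmittedAt
    return latency >= 0 ? latency : null // clamp negatives from skewed clocks
}
```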
94-94: Consider extracting helper functions to reduce cognitive complexity. SonarCloud reports cognitive complexity of 23, exceeding the allowed threshold of 15. The function handles setup, transaction submission, and multiple verification phases sequentially.
Consider extracting logical sections into helper functions:
- `submitTransaction(demos, recipientAddress, nonce, amount)`
- `verifyInclusion(rpcUrls, txHash, senderAddress, expectedNonce, timeoutSec, pollMs)`

This would improve readability and testability while reducing complexity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts` at line 94, The runPetriTxInclusion function is too cognitively complex; split its setup, submission, and multi-phase verification logic into clear helpers: implement submitTransaction(demos, recipientAddress, nonce, amount) to encapsulate creating and sending the transaction and returning the txHash and any submission metadata, and implement verifyInclusion(rpcUrls, txHash, senderAddress, expectedNonce, timeoutSec, pollMs) to encapsulate the polling/validation across RPC endpoints until the transaction and nonce are observed; refactor runPetriTxInclusion to call these helpers (and additional small helpers for initial setup or teardown if needed) so each helper has a single responsibility and the main function just orchestrates the steps.
133-137: The `as any` casts are unnecessary: the `confirm` and `broadcast` methods are properly typed in the SDK. The same `Demos` import from `@kynesyslabs/demosdk/websdk` is used in `testing/loadgen/src/transfer_loadgen.ts` without any type casting (lines 280 and 284), which proves these methods are available in the SDK's public type definitions. Remove the casts to improve type safety.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts` around lines 133 - 137, Remove the unnecessary any casts by calling the strongly-typed SDK methods directly: replace (demos as any).confirm(signedTx) with demos.confirm(signedTx) and (demos as any).broadcast(validity) with demos.broadcast(validity); keep the existing variables (signedTx, validity) and existing error check but use demos.confirm and demos.broadcast to restore type safety and rely on the SDK's public definitions.

src/libs/blockchain/mempool_v2.ts (1)
297-300: Preserve the existing mempool tie-breakers in classification queries.
`getMempool()` already orders by `timestamp`, `reference_block`, then `hash`, but `getByClassification()` only sorts by `timestamp`. Reusing the same tie-breakers avoids nondeterministic iteration when two txs share a timestamp, especially for callers like `getPreApproved()` that can feed consensus flows.

♻️ Suggested change
```diff
 return await this.repo.find({
     where,
-    order: { timestamp: "ASC" },
+    order: {
+        timestamp: "ASC",
+        reference_block: "ASC",
+        hash: "ASC",
+    },
 })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/blockchain/mempool_v2.ts` around lines 297 - 300, getByClassification() currently orders only by timestamp which can produce nondeterministic ordering when timestamps tie; update the repository query in getByClassification() to reuse the same tie-breaker ordering used by getMempool() (i.e., order by timestamp, then reference_block, then hash, all ASC) so callers like getPreApproved() see deterministic iteration. Locate getByClassification() in mempool_v2.ts and change its this.repo.find(...) options to include the additional order fields timestamp -> reference_block -> hash matching getMempool().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.env.example:
- Around line 88-93: The PETRI_... keys are triggering dotenv-linter
UnorderedKey warnings because they are not in alphabetical order; reorder the
block so the env keys are sorted lexicographically (e.g.,
PETRI_AGREEMENT_THRESHOLD, PETRI_BLOCK_INTERVAL_MS, PETRI_CONSENSUS,
PETRI_FORGE_INTERVAL_MS, PETRI_PROBLEMATIC_TTL_ROUNDS, PETRI_SHARD_SIZE) and
update the .env.example block containing PETRI_CONSENSUS,
PETRI_FORGE_INTERVAL_MS, PETRI_BLOCK_INTERVAL_MS, PETRI_AGREEMENT_THRESHOLD,
PETRI_PROBLEMATIC_TTL_ROUNDS, PETRI_SHARD_SIZE accordingly to resolve the linter
errors.
- Around line 36-38: Reorder the OMNI variables in .env.example so they follow
the dotenv-linter expected alphabetical order: move OMNI_MODE to come
immediately after OMNI_ENABLED and before OMNI_PORT; specifically adjust the
block containing OMNI_ENABLED, OMNI_MODE, and OMNI_PORT to read OMNI_ENABLED,
OMNI_MODE, OMNI_PORT to resolve the UnorderedKey warning.
In `@src/features/metrics/MetricsCollector.ts`:
- Around line 757-761: collectPetriMetrics() is reading
state.pendingTransactions.size which ContinuousForge never populates so
petri_pending_tx_count stays 0; replace this dead source by querying a live
pending-tx count (either add/expose a method like
ContinuousForge.getPendingTransactionCount() that returns the mempool count or
call the existing mempool/mempoolManager API directly) and use that value for
the "petri_pending_tx_count" gauge, or alternatively update
runForgeRound()/ContinuousForge to maintain state.pendingTransactions correctly
if you prefer keeping the gauge tied to the forge state.
In `@src/libs/blockchain/mempool_v2.ts`:
- Around line 117-125: The current logic leaves transactions as
TransactionClassification.TO_APPROVE when executeSpeculatively(...) returns
success: false, causing them to never reach delta agreement; update the handling
in the block that calls executeSpeculatively (the code that reads
result.classification === TransactionClassification.TO_APPROVE and assigns
deltaHash from specResult.delta) so that when specResult.success is false you do
not persist a stuck TO_APPROVE—either reject the transaction or mark it with a
terminal failure classification (e.g., TransactionClassification.TO_REJECT or a
FAILED_SIMULATION status), clear any delta hash, and persist that updated
classification immediately; ensure this mirrors how
ContinuousForge.runForgeRound() gates recording on success && delta so the
mempool won’t retain non-progressing entries.
In `@src/libs/consensus/petri/forge/continuousForge.ts`:
- Around line 166-200: The code only fetches txs already marked
TransactionClassification.TO_APPROVE via Mempool.getByClassification so
transactions inserted without a persisted classification are never processed;
update the flow to also handle unclassified mempool rows by fetching
TransactionClassification.UNCLASSIFIED (or a combined query) alongside
TO_APPROVE, run classifyTransaction for those entries, persist the derived
classification and any computed delta via Mempool.updateClassification, and then
continue with the existing speculative execution path (symbols to edit:
Mempool.getByClassification, TransactionClassification.TO_APPROVE/UNCLASSIFIED,
classifyTransaction, executeSpeculatively, and Mempool.updateClassification).
- Around line 93-95: pause() currently only sets state.isPaused and doesn't wait
for an active round; ensure in ContinuousForge you track the currently running
round promise (e.g., add a member like currentRoundPromise or
activeRoundPromise) and assign it at the start of runForgeRound() and clear it
on completion. Update pause() to set state.isPaused then await the tracked
currentRoundPromise (if present) so it drains the in-flight round before
returning. Also update runBlockPeriod() to check/await the same tracked promise
before proceeding to arbitration/block compilation to prevent mempool rewrites
during an active round. Ensure runForgeRound() uses finally to clear the promise
so pause() and runBlockPeriod() never await a dangling promise.
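The round-draining pause() pattern described above can be sketched as follows; the class and member names are illustrative stand-ins for ContinuousForge, not its real API:

```typescript
// runForgeRound registers its in-flight promise; pause() awaits it so the
// round fully drains before arbitration/block compilation proceeds.
class ForgeSketch {
    paused = false
    roundFinished = false
    private currentRoundPromise: Promise<void> | null = null

    async runForgeRound(): Promise<void> {
        this.currentRoundPromise = (async () => {
            await new Promise(r => setTimeout(r, 10)) // simulated round work
            this.roundFinished = true
        })()
        try {
            await this.currentRoundPromise
        } finally {
            this.currentRoundPromise = null // never leave a dangling promise
        }
    }

    async pause(): Promise<void> {
        this.paused = true
        if (this.currentRoundPromise) await this.currentRoundPromise
    }
}
```

runBlockPeriod() would await the same tracked promise before compiling, so an in-flight round can never race the mempool rewrite.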
In `@testing/devnet/.env.example`:
- Around line 24-29: Reorder the PETRI_* entries in the devnet .env example to
satisfy dotenv-linter's UnorderedKey rule by sorting them consistently (e.g.,
alphabetically or matching the project's canonical env order); specifically
locate the block containing PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS,
PETRI_BLOCK_INTERVAL_MS, PETRI_AGREEMENT_THRESHOLD,
PETRI_PROBLEMATIC_TTL_ROUNDS, and PETRI_SHARD_SIZE and rearrange those keys into
the agreed ordering (for example: PETRI_AGREEMENT_THRESHOLD,
PETRI_BLOCK_INTERVAL_MS, PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS,
PETRI_PROBLEMATIC_TTL_ROUNDS, PETRI_SHARD_SIZE) so dotenv-linter stops reporting
UnorderedKey.
In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts`:
- Around line 217-222: The thrown error in petri_tx_inclusion omits a reason
when soft finality isn't observed; update the error-reasons collection logic to
push a message when finalityResult?.softFinalityObserved is false (e.g., add
"soft finality not observed" alongside the existing checks for nonceWait,
blockAdvance, txByHash and finalityResult?.hardFinalityObserved) so the Error
thrown from petri_tx_inclusion includes that failure cause.
---
Duplicate comments:
In `@src/libs/consensus/petri/forge/continuousForge.ts`:
- Around line 154-173: The issue is stale per-round deltas persisting when
runForgeRound returns early; in runForgeRound (continuousForge.ts) reset/clear
the currentRoundDeltas map at the start of the round (or immediately before any
early return) so getCurrentDeltas() and the petri_exchangeDeltas handler never
serve previous-round hashes; update the method that maintains currentRoundDeltas
(referenced as currentRoundDeltas and getCurrentDeltas()) inside runForgeRound
to clear it (e.g., new Map() or clear()) before checking TO_APPROVE txs or
aborting.
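Clearing the per-round map up front can be sketched like this, with a hypothetical `DeltaTracker` standing in for ContinuousForge:

```typescript
class DeltaTracker {
  private currentRoundDeltas = new Map<string, string>();

  // Return a copy so callers cannot mutate internal state.
  getCurrentDeltas(): Map<string, string> {
    return new Map(this.currentRoundDeltas);
  }

  runForgeRound(txDeltas: Array<[string, string]>, abortEarly = false): void {
    // Reset at the very top of the round, before any early return, so a
    // delta-exchange handler never serves previous-round hashes.
    this.currentRoundDeltas.clear();
    if (abortEarly) return; // e.g., no TO_APPROVE txs this round
    for (const [hash, delta] of txDeltas) this.currentRoundDeltas.set(hash, delta);
  }
}
```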
---
Nitpick comments:
In `@src/libs/blockchain/mempool_v2.ts`:
- Around line 297-300: getByClassification() currently orders only by timestamp
which can produce nondeterministic ordering when timestamps tie; update the
repository query in getByClassification() to reuse the same tie-breaker ordering
used by getMempool() (i.e., order by timestamp, then reference_block, then hash,
all ASC) so callers like getPreApproved() see deterministic iteration. Locate
getByClassification() in mempool_v2.ts and change its this.repo.find(...)
options to include the additional order fields timestamp -> reference_block ->
hash matching getMempool().
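The tie-breaker ordering can be expressed as a comparator; the `Tx` shape here is a simplified stand-in for MempoolTx:

```typescript
type Tx = { timestamp: number; reference_block: number; hash: string };

// Order by timestamp, then reference_block, then hash, all ascending:
// the hash tiebreak makes iteration deterministic even when the first two
// fields collide across nodes.
function mempoolOrder(a: Tx, b: Tx): number {
  return (
    a.timestamp - b.timestamp ||
    a.reference_block - b.reference_block ||
    a.hash.localeCompare(b.hash)
  );
}
```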
In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts`:
- Around line 201-206: Latency fields softFinalityLatencyMs and
hardFinalityLatencyMs use client-side txSubmittedAt which can be skewed vs
server finality timestamps
(finalityResult.finality.softFinalityAt/hardFinalityAt), causing
misleading/negative values; update the calculation in the code that builds these
fields (refer to softFinalityLatencyMs, hardFinalityLatencyMs, finalityResult,
txSubmittedAt) to either (a) compute latency using a server-side submission
timestamp if available (e.g., use finalityResult.submittedAt or a
serverTimestamp included in finalityResult) or (b) derive a clock offset by
comparing a server time value from the same response and adjust txSubmittedAt
before subtraction, and as a safeguard clamp negative results to null (or 0) to
avoid reporting negative latencies.
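The clamping safeguard might look like this minimal helper (`latencyMs` is a hypothetical name):

```typescript
// Compute a latency from a client-side submission time and a server-side
// finality timestamp; return null when the timestamp is missing or when
// clock skew would produce a negative (meaningless) value.
function latencyMs(submittedAt: number, finalizedAt: number | undefined): number | null {
  if (finalizedAt === undefined) return null;
  const delta = finalizedAt - submittedAt;
  return delta >= 0 ? delta : null;
}
```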
- Line 94: The runPetriTxInclusion function is too cognitively complex; split
its setup, submission, and multi-phase verification logic into clear helpers:
implement submitTransaction(demos, recipientAddress, nonce, amount) to
encapsulate creating and sending the transaction and returning the txHash and
any submission metadata, and implement verifyInclusion(rpcUrls, txHash,
senderAddress, expectedNonce, timeoutSec, pollMs) to encapsulate the
polling/validation across RPC endpoints until the transaction and nonce are
observed; refactor runPetriTxInclusion to call these helpers (and additional
small helpers for initial setup or teardown if needed) so each helper has a
single responsibility and the main function just orchestrates the steps.
- Around line 133-137: Remove the unnecessary any casts by calling the
strongly-typed SDK methods directly: replace (demos as any).confirm(signedTx)
with demos.confirm(signedTx) and (demos as any).broadcast(validity) with
demos.broadcast(validity); keep the existing variables (signedTx, validity) and
existing error check but use demos.confirm and demos.broadcast to restore type
safety and rely on the SDK's public definitions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3167fdf1-44b0-44ff-907e-e01c2c6bb126
📒 Files selected for processing (9)
- .env.example
- src/config/loader.ts
- src/features/metrics/MetricsCollector.ts
- src/libs/blockchain/mempool_v2.ts
- src/libs/consensus/petri/block/petriBlockCompiler.ts
- src/libs/consensus/petri/forge/continuousForge.ts
- src/libs/consensus/petri/types/stateDelta.ts
- testing/devnet/.env.example
- testing/loadgen/src/features/consensus/petri_tx_inclusion.ts
✅ Files skipped from review due to trivial changes (1)
- src/libs/consensus/petri/types/stateDelta.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/libs/consensus/petri/block/petriBlockCompiler.ts
- src/config/loader.ts
…shes
Track processed pubkeys in a dedicated Set instead of checking only the
signatures map. Previously, mismatched or invalid-signature submissions
were never added to signatures{} so they got re-evaluated on every poll
iteration, inflating mismatchCount and potentially preventing the
threshold from being reached even when enough valid signatures existed.
When petriRelay() returns success=false, the TX was silently dropped — the originating node neither relayed it nor added it to its own mempool. Now falls back to Mempool.addTransaction() so the TX is preserved for inclusion in the next block compiled locally.
Actionable comments posted: 6
🧹 Nitpick comments (3)
src/utilities/mainLoop.ts (1)

4-6: Consider aligning import path style. Line 4 uses the `@/` alias while lines 5-6 (and most other imports in this file) use the `src/` prefix. For consistency, consider using the same path style throughout.

Suggested diff:

```diff
-import { petriConsensusRoutine } from "@/libs/consensus/petri"
+import { petriConsensusRoutine } from "src/libs/consensus/petri"
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `@src/utilities/mainLoop.ts` around lines 4-6, the imports mix alias styles — change the import for petriConsensusRoutine to match the prevailing "src/" prefix (or conversely change the other imports if you prefer "@/"). Update the import line referencing petriConsensusRoutine to use the same path style as getCommonValidatorSeed and getShard: replace "@/libs/consensus/petri" with the consistent "src/libs/consensus/petri" (or update the other two to "@/libs/consensus/v2/routines/...") ensuring all imports in this file use the same alias form.

src/libs/network/endpointExecution.ts (2)

33-33: Inconsistent import path alias. This import uses `@/libs/...` while all other local imports in this file use `src/libs/...` (e.g., lines 1, 2, 4, 5, 16, 18, 31). Consider using a consistent path style for maintainability.

💡 Suggested fix for consistency:

```diff
-import { relay as petriRelay } from "@/libs/consensus/petri/routing/petriRouter"
+import { relay as petriRelay } from "src/libs/consensus/petri/routing/petriRouter"
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `@src/libs/network/endpointExecution.ts` at line 33, the import for petriRelay currently uses the alias "@/libs/consensus/petri/routing/petriRouter", which is inconsistent with the other local imports in this file; update the import to use the same path style as the rest (e.g., change to "src/libs/consensus/petri/routing/petriRouter") so the symbol petriRelay is imported via the consistent path convention.

335-337: Response message is misleading after fallback. When `relaySuccess` is `false`, the message says "relay pending", but the transaction has actually been added to the local mempool (not pending relay). Consider updating the message for accuracy.

📝 Suggested message clarification:

```diff
 message: relaySuccess
     ? "Transaction routed to shard members"
-    : "Transaction accepted locally (relay pending)",
+    : "Transaction accepted locally (added to mempool)",
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `@src/libs/network/endpointExecution.ts` around lines 335-337, the response message uses relaySuccess to choose between "Transaction routed to shard members" and "Transaction accepted locally (relay pending)", which is inaccurate when relay fails; update the conditional that sets the message so the false branch accurately reflects local mempool acceptance — e.g., change to "Transaction accepted locally (added to local mempool)" or similar wording to replace "relay pending".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/libs/consensus/petri/forge/continuousForge.ts`:
- Around line 228-240: The loop in continuousForge.ts calling
Mempool.updateClassification for each txHash (setting
TransactionClassification.PRE_APPROVED and PROBLEMATIC) can race with concurrent
forge rounds or block compilation because Mempool.updateClassification uses
repo.update() without transaction isolation; change the implementation to
perform these classification changes in a single atomic operation—either add a
batched repository method (e.g., Mempool.updateClassifications or
repo.updateMany) that wraps all txHash updates in a DB transaction, or implement
optimistic locking/version checks in Mempool.updateClassification
(compare-and-swap on a version/timestamp) so concurrent rounds cannot overwrite
each other; update continuousForge.ts to call the new batched/transactional API
instead of per-tx await calls.
- Around line 127-129: getState currently returns a shallow copy of this.state
which leaks the internal pendingTransactions Map; callers can mutate the Map via
the returned object. Fix by returning a new ForgeState where pendingTransactions
is cloned (e.g., construct a new Map from this.state.pendingTransactions) and
keep other fields copied (use the existing spread of this.state but replace
pendingTransactions), so the returned object contains a fresh Map instance and
does not expose internal mutable state; update getState to return that cloned
Map version.
- Around line 280-284: Validate and sanitize peer responses before trusting the
shape used by recordDelta: when handling response.response in continuousForge
(the block assigning to peerDeltas[peer.identity]), check that response.response
is an object, that it has a deltas property which is an object, and that each
key and value in deltas is a string (or otherwise matches the expected type)
before adding it to peerDeltas; if validation fails, log/warn and skip that
peer's deltas to avoid passing malformed data into recordDelta.
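The shape check could be a small type guard along these lines (`isValidDeltaPayload` is a hypothetical name):

```typescript
// Accept a peer payload only if it is an object whose `deltas` property is a
// plain (non-array) object mapping string keys to string hashes; anything
// else is skipped rather than passed into recordDelta.
function isValidDeltaPayload(resp: unknown): resp is { deltas: Record<string, string> } {
  if (typeof resp !== "object" || resp === null) return false;
  const deltas = (resp as { deltas?: unknown }).deltas;
  if (typeof deltas !== "object" || deltas === null || Array.isArray(deltas)) return false;
  return Object.values(deltas).every(v => typeof v === "string");
}
```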
- Around line 266-294: The delta exchange in continuousForge.ts currently builds
promises from peers (using peer.longCall) and awaits Promise.all, which can hang
a round if any peer call never resolves; wrap each peer.longCall (or the entire
Promise.all) in a Promise.race that rejects/resolves after an absolute timeout
(e.g., 2-3s) so dangling calls won't block the round, ensure you only write into
peerDeltas from the timed/settled responses (the promises array where
peer.longCall is invoked), and keep existing error handling (log.warn) for
timeouts so late responses are ignored and the function returns promptly with
whatever peerDeltas were gathered.
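A minimal timeout wrapper along these lines would bound each peer call (`withTimeout` is a hypothetical helper; resolving to null on timeout lets the caller simply skip that peer):

```typescript
// Race the peer call against an absolute deadline so one dead peer cannot
// stall the round; a late response resolves a promise nobody is awaiting
// anymore and is thereby ignored.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T | null> {
  return Promise.race([
    p,
    new Promise<null>(resolve => setTimeout(() => resolve(null), ms)),
  ]);
}
```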
- Around line 187-201: Remove the unnecessary double-cast of mempoolTx to
Transaction: pass the existing mempoolTx (already typed as MempoolTx which
implements Transaction) directly into classifyTransaction and
executeSpeculatively instead of using "as unknown as Transaction"; keep the
existing logic that checks classResult.classification ===
TransactionClassification.TO_APPROVE, stores specResult.delta.hash into
localDeltas[mempoolTx.hash], and calls
Mempool.updateClassification(mempoolTx.hash,
TransactionClassification.TO_APPROVE, specResult.delta.hash).
In `@src/libs/network/endpointExecution.ts`:
- Around line 326-329: The fallback call to Mempool.addTransaction (using
queriedTx and validatedData.data.reference_block) lacks error handling and can
cause silent loss while the function still returns success: wrap that await in a
try/catch, log the error (including context like the transaction id or
queriedTx), and set the response/result flag to failure (e.g., result.success =
false or equivalent) - mirror the error handling approach used in the default
mempool insertion path (lines that handle errors and set result.success = false)
so failures are propagated correctly.
---
Nitpick comments:
In `@src/libs/network/endpointExecution.ts`:
- Line 33: The import for petriRelay currently uses the alias
"@/libs/consensus/petri/routing/petriRouter" which is inconsistent with the
other local imports in this file; update the import to use the same path style
as the rest (e.g., change to "src/libs/consensus/petri/routing/petriRouter") so
the symbol petriRelay is imported via the consistent path convention.
- Around line 335-337: The response message uses relaySuccess to choose between
"Transaction routed to shard members" and "Transaction accepted locally (relay
pending)" which is inaccurate when relay fails; update the conditional that sets
the message (the message property assigned based on relaySuccess in
endpointExecution.ts) so the false branch accurately reflects local mempool
acceptance — e.g., change to "Transaction accepted locally (added to local
mempool)" or similar wording to replace "relay pending".
In `@src/utilities/mainLoop.ts`:
- Around line 4-6: The imports in mainLoop.ts mix alias styles — change the
import for petriConsensusRoutine to match the prevailing "src/" prefix (or
conversely change other imports if you prefer "@/"), so update the import line
referencing petriConsensusRoutine to use the same path style as
getCommonValidatorSeed and getShard; locate the import for petriConsensusRoutine
and replace "@/libs/consensus/petri" with the consistent
"src/libs/consensus/petri" (or update the other two to
"@/libs/consensus/v2/routines/...") ensuring all imports in this file use the
same alias form.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2066d833-ec81-4566-9067-912f2120ee9f
📒 Files selected for processing (6)
- src/libs/blockchain/chainBlocks.ts
- src/libs/consensus/petri/coordination/petriSecretary.ts
- src/libs/consensus/petri/forge/continuousForge.ts
- src/libs/network/endpointExecution.ts
- src/libs/network/rpcDispatch.ts
- src/utilities/mainLoop.ts
✅ Files skipped from review due to trivial changes (1)
- src/libs/consensus/petri/coordination/petriSecretary.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/libs/network/rpcDispatch.ts
- src/libs/blockchain/chainBlocks.ts
- Remove unused beforeEach import in happyPath.test.ts - Remove redundant type annotations in byzantineFault.test.ts - Reorder OMNI_ keys alphabetically in .env.example - Add try/catch around fallback mempool insertion in endpointExecution.ts
♻️ Duplicate comments (2)
.env.example (1)
88-93: ⚠️ Potential issue | 🟡 Minor — Reorder PETRI keys to satisfy dotenv-linter.

Line 90 and Line 91 still trigger `UnorderedKey`; this block should be lexicographically ordered.

Proposed fix:

```diff
-PETRI_CONSENSUS=true
-PETRI_FORGE_INTERVAL_MS=2000
-PETRI_BLOCK_INTERVAL_MS=10000
-PETRI_AGREEMENT_THRESHOLD=7
+PETRI_AGREEMENT_THRESHOLD=7
+PETRI_BLOCK_INTERVAL_MS=10000
+PETRI_CONSENSUS=true
+PETRI_FORGE_INTERVAL_MS=2000
 PETRI_PROBLEMATIC_TTL_ROUNDS=5
 PETRI_SHARD_SIZE=10
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In @.env.example around lines 88-93, reorder the PETRI environment keys so they are lexicographically sorted to satisfy dotenv-linter: arrange the lines with PETRI_AGREEMENT_THRESHOLD first, then PETRI_BLOCK_INTERVAL_MS, PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS, PETRI_PROBLEMATIC_TTL_ROUNDS, and PETRI_SHARD_SIZE, keeping their values unchanged.

src/libs/network/endpointExecution.ts (1)

326-343: ⚠️ Potential issue | 🟠 Major — Incomplete error handling in fallback path: the transaction can still be lost while `success: true` is returned.

The fallback path has two issues:

1. Return value not checked: `Mempool.addTransaction` returns `{ error }` for cases like "Transaction already executed" or "Transaction already in mempool" (see lines 415-430 for the correct pattern). The current code only catches thrown exceptions.
2. Success always returned: whether the mempool insertion fails via exception or returns an error, the function still returns `success: true` at line 339, misleading the client.

🛡️ Proposed fix to match the default mempool path error handling:

```diff
 if (!relaySuccess) {
     // Fallback: add to local mempool so the TX is not lost
     log.warn(
         `[handleExecuteTransaction] Petri relay failed for ${queriedTx.hash}, adding to local mempool`,
     )
     try {
-        await Mempool.addTransaction({
+        const { error } = await Mempool.addTransaction({
             ...queriedTx,
             reference_block: validatedData.data.reference_block,
         })
+        if (error) {
+            log.error(
+                `[handleExecuteTransaction] Fallback mempool add returned error for ${queriedTx.hash}: ${error}`,
+            )
+            return {
+                success: false,
+                response: { message: "Failed to add transaction to mempool" },
+                extra: { error, routing: "petri" },
+                require_reply: false,
+            }
+        }
     } catch (mempoolError) {
         log.error(
             `[handleExecuteTransaction] Fallback mempool insertion also failed for ${queriedTx.hash}: ${mempoolError instanceof Error ? mempoolError.message : String(mempoolError)}`,
         )
+        return {
+            success: false,
+            response: { message: "Failed to add transaction to mempool" },
+            extra: { error: String(mempoolError), routing: "petri" },
+            require_reply: false,
+        }
     }
 }

 return {
     success: true,
     response: {
         message: relaySuccess
             ? "Transaction routed to shard members"
-            : "Transaction accepted locally (relay pending)",
+            : "Transaction accepted locally (added to mempool)",
     },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/network/endpointExecution.ts` around lines 326 - 343, The fallback mempool insertion currently only catches thrown exceptions and always returns success: true; update the fallback where Mempool.addTransaction is called (the block inside handleExecuteTransaction that logs `[handleExecuteTransaction] Fallback mempool insertion...`) to inspect the returned value from Mempool.addTransaction (not just exceptions), mirror the pattern used in the default mempool path (check for a returned { error } and handle "Transaction already executed" / "Transaction already in mempool" cases), and if an error is returned set success: false and return an appropriate response payload (or propagate the error) instead of unconditionally returning success: true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.env.example:
- Around line 88-93: Reorder the PETRI environment keys in the .env.example so
they are lexicographically sorted to satisfy dotenv-linter: arrange the lines
with PETRI_AGREEMENT_THRESHOLD first, then PETRI_BLOCK_INTERVAL_MS,
PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS, PETRI_PROBLEMATIC_TTL_ROUNDS, and
PETRI_SHARD_SIZE, keeping their values unchanged.
In `@src/libs/network/endpointExecution.ts`:
- Around line 326-343: The fallback mempool insertion currently only catches
thrown exceptions and always returns success: true; update the fallback where
Mempool.addTransaction is called (the block inside handleExecuteTransaction that
logs `[handleExecuteTransaction] Fallback mempool insertion...`) to inspect the
returned value from Mempool.addTransaction (not just exceptions), mirror the
pattern used in the default mempool path (check for a returned { error } and
handle "Transaction already executed" / "Transaction already in mempool" cases),
and if an error is returned set success: false and return an appropriate
response payload (or propagate the error) instead of unconditionally returning
success: true.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5a7cbcfc-b840-407c-ace9-09e54704d172
📒 Files selected for processing (4)
- .env.example
- src/libs/network/endpointExecution.ts
- testing/petri/byzantineFault.test.ts
- testing/petri/happyPath.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- testing/petri/happyPath.test.ts
- testing/petri/byzantineFault.test.ts
Pass forgeStartedAt timestamp to runBlockPeriod and sleep only the remainder of blockIntervalMs after subtracting forge startup overhead. Prevents cumulative drift across rounds.
Replace hard-coded 5s ceiling with petriConfig.blockIntervalMs so the wait scales with the configured block interval.
Replace naive shard[0] stub with production-matching electSecretary() that sorts all identities alphabetically. Update test data to use identities that produce deterministic sorted order.
Wrap Promise.all in Promise.race with forgeIntervalMs timeout to prevent round stalls from slow/dead peers. Also validate that peer response deltas is a non-array object before using it.
Log a specific reason for soft finality failure alongside the existing hard finality reason in petri_tx_inclusion error output.
Previously, if executeSpeculatively() failed, the TX stayed classified as TO_APPROVE with no deltaHash, causing it to be stuck in the mempool forever. Now marks it as FAILED so the forge skips it.
Add drain() method to ContinuousForge that pauses and awaits any currently running round. Track the round promise via currentRoundPromise. Replace forge.pause() with await forge.drain() in runBlockPeriod so block compilation never races with an in-flight forge round.
TXs arriving via mergeMempools from shard peers have no classification. Add Mempool.getUnclassified() (IsNull query) and classify+speculate them at the top of each forge round before querying TO_APPROVE rows. This ensures merged TXs participate in delta agreement.
Satisfies S6544 — avoid truthy evaluation on a Promise type.
This PR modifies files that will be reverted.
Summary by CodeRabbit