feat(peer): provisional peer verification subsystem #143
Open
adequatelimited wants to merge 1 commit into master
Introduces a provisional peer list that holds unverified IP addresses received from network peers. A background thread verifies candidates by attempting handshakes, and only verified peers are promoted to the active recent peer list (Rplist). Includes source reputation tracking with time-windowed decay to mitigate IP flooding attacks while tolerating the stale peer lists common on the existing network.

- New in types.h: PROVPEER struct, configuration defines, status values
- New in peer.h: function prototypes for provisional peer management
- New in peer.c: addprovisional(), harvest_provisional(), source reputation logic, background verification thread, purge
- Modified network.c: scan_quorum() and refresh_ipl() now route received peer IPs through addprovisional() instead of addrecent()
- Modified mochimo.c: thread lifecycle (start/harvest/stop) integrated into server init, main loop, and shutdown
- New test: src/test/peer-provisional.c (make test-peer-provisional)
Here's the placeholder PR for this new feature. Will revisit it after the remaining audit-fixes are complete. @chrisdigity Would love your input on this.
Note: Clearing EXPIRED status items from the provisional list may contradict the reputation management threshold calculation. If they are cleared immediately, they won't be available for us to use when calculating whether a node is a bad actor. Some rework is needed there to determine when someone has a "bad" reputation, but the bulk of the feature is here.
Architectural Overview: Provisional Peer Verification
Background: What We Had Before
The Mochimo node maintains a Recent Peer List (Rplist, 64 entries) used for all network operations -- peer discovery, quorum formation, block propagation, and chain synchronization. Previously, when the node received a peer list from another node via OP_SEND_IPL, those IP addresses were added directly to Rplist via addrecent(), with no verification that they belonged to running Mochimo nodes.
This created two problems:
Stale peer propagation: Nodes that went offline months ago remain in peer lists indefinitely. Every node shares its Rplist with every peer that asks, so stale IPs propagate across the entire network. A significant portion of advertised peers on the current network are unreachable.
IP flooding attack surface: A malicious node could respond to OP_GET_IPL with fabricated IP addresses, filling the requesting node's Rplist with garbage. The node would then waste time trying to contact unreachable IPs during quorum formation and sync operations, and would propagate those garbage IPs to other nodes that ask for its peer list.
What Changed
Peer IPs received from network responses now go through a provisional verification pipeline before being added to Rplist. The pipeline has three stages:
Stage 1 -- Intake (addprovisional): IPs from OP_SEND_IPL responses are placed in a provisional list (4096 entries) instead of Rplist. Each entry records the candidate IP, the source IP that advertised it, and a status field. Before appending, the function deduplicates against existing provisional entries and Rplist, and checks the source's reputation.
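The intake step can be pictured with a minimal sketch like the one below. It is not the PR's addprovisional(): the struct layout, the `prov_intake` name, and the status names are hypothetical, and the source-reputation check is left out to keep the dedup-then-append logic visible. Only the 4096-entry capacity comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROV_MAX 4096   /* provisional capacity stated in the overview */

/* Hypothetical status names; the PR defines its own in types.h. */
enum { PROV_PENDING, PROV_VERIFIED, PROV_EXPIRED };

typedef struct {
   uint32_t ip;       /* candidate address */
   uint32_t source;   /* peer that advertised the candidate */
   int status;
} prov_entry;

typedef struct {
   prov_entry entry[PROV_MAX];
   size_t count;
} prov_list;

/* Queue a candidate for verification.
 * Returns 1 if queued, 0 if rejected (zero IP, list full, or duplicate). */
int prov_intake(prov_list *pl, const uint32_t *rplist, size_t rplen,
   uint32_t ip, uint32_t source)
{
   size_t i;

   assert(pl != NULL);
   if (ip == 0 || pl->count >= PROV_MAX) return 0;
   /* skip addresses that are already in the trusted list */
   for (i = 0; i < rplen; i++) {
      if (rplist[i] == ip) return 0;
   }
   /* skip addresses that are already awaiting verification */
   for (i = 0; i < pl->count; i++) {
      if (pl->entry[i].ip == ip) return 0;
   }
   pl->entry[pl->count].ip = ip;
   pl->entry[pl->count].source = source;
   pl->entry[pl->count].status = PROV_PENDING;
   pl->count++;
   return 1;
}
```

A linear scan is acceptable at this size (at most 4096 + 64 comparisons per candidate); a real implementation would also need to hold the list's write lock around the append.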
Stage 2 -- Verification (background thread): A dedicated thread processes provisional entries in batches of 32. For each pending entry whose retry time has passed, it attempts a callserver() handshake. If the handshake succeeds, the entry is marked VERIFIED. If it fails, the fail counter increments and the next retry is scheduled with exponential backoff. After 5 failures, the entry is marked EXPIRED.
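To make the retry schedule concrete, here is a small sketch of the per-entry state update after one handshake attempt. The 5-failure limit is taken from the description; the 30-second base delay, the doubling factor, and all identifiers are assumptions chosen for illustration, since the PR text does not state the actual backoff constants.

```c
#include <assert.h>
#include <time.h>

#define PROV_MAX_FAILS  5    /* from the overview: EXPIRED after 5 failures */
#define PROV_BASE_DELAY 30   /* hypothetical base delay in seconds */

/* Hypothetical status names; the PR defines its own in types.h. */
enum { PROV_PENDING, PROV_VERIFIED, PROV_EXPIRED };

typedef struct {
   int status;
   int fails;           /* failed handshake attempts so far */
   time_t next_retry;   /* earliest time of the next attempt */
} verify_state;

/* An entry is worth trying only while pending and past its retry time. */
int is_due(const verify_state *vs, time_t now)
{
   assert(vs != NULL);
   return vs->status == PROV_PENDING && now >= vs->next_retry;
}

/* Record the outcome of one handshake attempt made at time `now`. */
void record_attempt(verify_state *vs, int handshake_ok, time_t now)
{
   assert(vs != NULL);
   if (vs->status != PROV_PENDING) return;
   if (handshake_ok) {
      vs->status = PROV_VERIFIED;
      return;
   }
   vs->fails++;
   if (vs->fails >= PROV_MAX_FAILS) {
      vs->status = PROV_EXPIRED;
      return;
   }
   /* the delay doubles with each failure: 30, 60, 120, 240 seconds */
   vs->next_retry = now + (time_t) PROV_BASE_DELAY * (1 << (vs->fails - 1));
}
```

With these assumed constants, an unreachable candidate is tried five times over roughly seven and a half minutes before it expires.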
Stage 3 -- Harvest (harvest_provisional): Called periodically from the main server loop. Scans for VERIFIED entries, promotes them to Rplist via addrecent(), then compacts the list by removing all EXPIRED entries.
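The promote-then-compact pass can be sketched in a single loop, as below. The `recent_list` type is a stand-in for Rplist and `recent_add` a stand-in for addrecent(); both are hypothetical. The sketch also drops promoted entries from the provisional list, which is an assumption: the description only states that EXPIRED entries are removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RECENT_MAX 64   /* Rplist size stated in the overview */

/* Hypothetical status names; the PR defines its own in types.h. */
enum { PROV_PENDING, PROV_VERIFIED, PROV_EXPIRED };

typedef struct { uint32_t ip; int status; } prov_entry;

/* Stand-in for Rplist. */
typedef struct { uint32_t ip[RECENT_MAX]; size_t len; } recent_list;

/* Stand-in for addrecent(): append if there is room. */
static void recent_add(recent_list *rl, uint32_t ip)
{
   if (rl->len < RECENT_MAX) rl->ip[rl->len++] = ip;
}

/* Promote VERIFIED entries, drop EXPIRED ones, and slide the survivors
 * to the front of the array. Returns the new entry count. */
size_t harvest(prov_entry *list, size_t count, recent_list *rl)
{
   size_t i, keep = 0;

   assert(rl != NULL);
   assert(list != NULL || count == 0);
   for (i = 0; i < count; i++) {
      if (list[i].status == PROV_VERIFIED) {
         recent_add(rl, list[i].ip);
         continue;   /* assumption: promoted entries leave the list */
      }
      if (list[i].status == PROV_EXPIRED) continue;
      list[keep++] = list[i];   /* still pending: keep, preserving order */
   }
   return keep;
}
```

Compacting in place keeps the pass O(n) with no allocation. If expired entries must survive for a while so that reputation can be tallied from them, the EXPIRED branch would need an age check instead of an unconditional drop.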
Race Condition Handling
The provisional list is protected by a RWLock (from the extended-c threading library):
The verification thread checks Running and Provrunning flags between every operation and every sleep second, ensuring clean shutdown without deadlock.
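The "check the flags every sleep second" pattern might look roughly like the sketch below. Running and Provrunning are the flag names used in the description, but their types here (C11 atomics), the `sleep_while_running` helper, and the use of POSIX nanosleep() are assumptions; the PR may rely on the extended-c library's own primitives instead.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict ISO modes */
#include <assert.h>
#include <stdatomic.h>
#include <time.h>

/* Flag names follow the overview; the atomic types are an assumption. */
static atomic_int Running = 1;
static atomic_int Provrunning = 1;

static int still_running(void)
{
   return atomic_load(&Running) && atomic_load(&Provrunning);
}

/* Sleep for up to `slices` intervals of `slice_ms` milliseconds each,
 * re-checking the shutdown flags before every interval.
 * Returns 1 if the thread should keep working, 0 if it should exit. */
int sleep_while_running(int slices, long slice_ms)
{
   struct timespec ts;
   int i;

   assert(slices >= 0 && slice_ms >= 0);
   ts.tv_sec = slice_ms / 1000;
   ts.tv_nsec = (slice_ms % 1000) * 1000000L;
   for (i = 0; i < slices; i++) {
      if (!still_running()) return 0;   /* shutdown requested: stop waiting */
      nanosleep(&ts, NULL);
   }
   return still_running();
}
```

A verification loop would call something like `sleep_while_running(10, 1000)` between batches, so a shutdown request is noticed within about one second rather than after the full pause.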
Blocking Situation Analysis
Source Reputation Management
When addprovisional() evaluates whether to accept an IP from a given source, it tallies that source's track record from existing provisional entries:
Time-windowed decay: The reputation check only considers failures from the last hour. This is critical because:
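As a rough illustration of a time-windowed tally, the sketch below counts a source's recent failures against its verified entries. Only the one-hour window comes from the description. The thresholds (at least 8 recent failures, 90% failure ratio), the `stamp` field, and every identifier are hypothetical, and the sketch assumes expired entries are still present in the list when the tally runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define REP_WINDOW    3600   /* from the overview: last hour only */
#define REP_MIN_FAILS 8      /* hypothetical: ignore small samples */
#define REP_BAD_PCT   90     /* hypothetical: failure ratio that blocks */

/* Hypothetical status names; the PR defines its own in types.h. */
enum { PROV_PENDING, PROV_VERIFIED, PROV_EXPIRED };

typedef struct {
   uint32_t source;   /* peer that advertised this candidate */
   int status;
   time_t stamp;      /* hypothetical: when the status was last set */
} prov_entry;

/* Returns 1 if candidates from `source` should still be accepted. */
int source_is_trusted(const prov_entry *list, size_t count,
   uint32_t source, time_t now)
{
   size_t i, fails = 0, good = 0;

   assert(list != NULL || count == 0);
   for (i = 0; i < count; i++) {
      if (list[i].source != source) continue;
      if (list[i].status == PROV_VERIFIED) {
         good++;
      } else if (list[i].status == PROV_EXPIRED
            && now - list[i].stamp <= REP_WINDOW) {
         fails++;   /* failures older than the window have decayed */
      }
   }
   if (fails < REP_MIN_FAILS) return 1;   /* too few samples to judge */
   return fails * 100 < (fails + good) * REP_BAD_PCT;
}
```

Under these assumed thresholds, a source whose bad entries all date from more than an hour ago is accepted again without any explicit reset.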
Tunable Parameters
All defined in types.h alongside existing peer configuration:
Behavior Under Normal Conditions
Behavior Under IP Flooding Attack
A malicious node responds to OP_GET_IPL with 64 fabricated IPs:
Impact on node operation: Zero. Rplist is never polluted.
Behavior With Stale Network Peer Lists
Files Changed
Testing
Unit test (make test-peer-provisional): Tests basic add, deduplication against provisional list and Rplist, capacity limit (4096 entries), source reputation with good sources, purge, multiple sources with cross-source dedup, harvest compaction, rapid add/harvest cycles (100 iterations), thread start/stop lifecycle, and concurrent add + harvest from separate threads. All assertions use the standard _assert.h framework. Passes via make test and is included in make coverage.
Build verification: Clean compile with -Wall -Werror -Wextra -Wpedantic on GCC 13 (Ubuntu x64). All existing tests unaffected.
What This Does NOT Change