DF-23489 Bump chainlink-framework for multinode & metrics#432
Conversation
8035760 to
7cf3425
Compare
This comment was marked as outdated.
This comment was marked as outdated.
7cf3425 to
ea21630
Compare
use new PollSuccessThreshold conf (defaults to same behavior) controls new polling checks on unreachableLoop
ea21630 to
6695b4b
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates the EVM node-pool configuration surface to support the framework’s new “poll success threshold” behavior for bringing previously-unreachable nodes back to “alive”, alongside bumping chainlink-framework/* dependencies.
Changes:
- Bump
github.com/smartcontractkit/chainlink-framework/{capabilities,chains,metrics,multinode}to a newer pseudo-version. - Add
NodePool.PollSuccessThresholdto TOML config structs, defaults, docs, and config interfaces (plus tests). - Minor adjustment to
RPCClient.SendTransactiontiming start placement.
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/config/toml/testdata/config-full.toml | Adds NodePool.PollSuccessThreshold to full TOML test fixture. |
| pkg/config/toml/docs.toml | Documents new PollSuccessThreshold setting in TOML docs template. |
| pkg/config/toml/defaults/fallback.toml | Sets default PollSuccessThreshold = 0 in fallback defaults. |
| pkg/config/toml/defaults/Shibarium_Testnet.toml | Adds PollSuccessThreshold = 0 in chain defaults. |
| pkg/config/toml/defaults/Hashkey_Mainnet.toml | Adds PollSuccessThreshold = 0 in chain defaults. |
| pkg/config/toml/defaults/Ethereum_Holesky.toml | Adds PollSuccessThreshold = 0 in chain defaults. |
| pkg/config/toml/defaults/Blast_Sepolia.toml | Adds PollSuccessThreshold = 0 in chain defaults. |
| pkg/config/toml/defaults/Blast_Mainnet.toml | Adds PollSuccessThreshold = 0 in chain defaults. |
| pkg/config/toml/config_test.go | Updates expected parsed TOML config struct to include PollSuccessThreshold. |
| pkg/config/toml/config.go | Extends TOML NodePool struct + merge logic to include PollSuccessThreshold. |
| pkg/config/config_test.go | Asserts PollSuccessThreshold() value in chain-scoped config test. |
| pkg/config/config.go | Extends config.NodePool interface with PollSuccessThreshold(). |
| pkg/config/chain_scoped_node_pool.go | Implements PollSuccessThreshold() on NodePoolConfig. |
| pkg/client/rpc_client.go | Moves start := time.Now() to avoid timing a branch that returns early. |
| pkg/client/helpers_test.go | Extends test NodePool config stub to implement PollSuccessThreshold(). |
| pkg/client/evm_client_test.go | Updates NewClientConfigs callsites for added parameter. |
| pkg/client/config_builder_test.go | Verifies PollSuccessThreshold is plumbed through config builder. |
| pkg/client/config_builder.go | Adds pollSuccessThreshold parameter and wires it into toml.NodePool. |
| go.sum | Updates sums for bumped chainlink-framework/* module versions. |
| go.mod | Bumps chainlink-framework/* required versions. |
| CONFIG.md | Adds user-facing docs for NodePool.PollSuccessThreshold. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| nodePool := toml.NodePool{ | ||
| SelectionMode: selectionMode, | ||
| LeaseDuration: commonconfig.MustNewDuration(leaseDuration), | ||
| PollFailureThreshold: pollFailureThreshold, | ||
| PollSuccessThreshold: pollSuccessThreshold, | ||
| PollInterval: commonconfig.MustNewDuration(pollInterval), | ||
| SyncThreshold: syncThreshold, |
There was a problem hiding this comment.
NewClientConfigs accepts pollSuccessThreshold as a *uint32 but does not default/validate it before storing it in toml.NodePool. Since NodePoolConfig.PollSuccessThreshold() dereferences this pointer, passing nil here will lead to a runtime panic when the node pool reads the value. Consider either (a) treating nil as the default (e.g., 0 / previous behavior) inside NewClientConfigs, or (b) returning an error if it is nil (and documenting it as required).
| return *n.C.PollFailureThreshold | ||
| } | ||
|
|
||
| func (n *NodePoolConfig) PollSuccessThreshold() uint32 { |
There was a problem hiding this comment.
PollSuccessThreshold() blindly dereferences n.C.PollSuccessThreshold. Because this is a newly introduced optional config field, older in-memory configs (or external constructors) may leave it nil, causing a panic. If the intended default is “previous behavior”, consider returning 0 when the pointer is nil (similar to ExternalRequestMaxResponseSize()), or ensure validation guarantees the pointer is always set before this config is used.
| func (n *NodePoolConfig) PollSuccessThreshold() uint32 { | |
| func (n *NodePoolConfig) PollSuccessThreshold() uint32 { | |
| if n.C.PollSuccessThreshold == nil { | |
| return 0 | |
| } |
| PollFailureThreshold = 5 # Default | ||
| # PollSuccessThreshold indicates how many consecutive polls must succeed in order to mark a node as alive once it has been marked as unreachable. | ||
| # | ||
| # Set to zero to require no successful polls (previous behavior). |
There was a problem hiding this comment.
The description for PollSuccessThreshold is ambiguous: “Set to zero to require no successful polls” can be read as the node becoming alive immediately (without waiting for any successful poll), which likely isn’t the intent. Consider rewording to explicitly state the behavior for 0 (e.g., “mark alive after the first successful poll / disable the success-threshold check”) to avoid misconfiguration.
| # Set to zero to require no successful polls (previous behavior). | |
| # Set to zero to disable the success-threshold check; in this case, the node is marked alive again after the first successful poll (previous behavior). |
| ``` | ||
| PollSuccessThreshold indicates how many consecutive polls must succeed in order to mark a node as alive once it has been marked as unreachable. | ||
|
|
||
| Set to zero to require no successful polls (previous behavior). |
There was a problem hiding this comment.
The PollSuccessThreshold docs here have the same ambiguity as docs.toml: “Set to zero to require no successful polls” can be interpreted as immediately marking a node alive. Consider clarifying what 0 actually does (e.g., “mark alive after the first successful poll / disable the consecutive-success requirement”) so operators don’t accidentally choose an unsafe value.
| Set to zero to require no successful polls (previous behavior). | |
| Set to zero to disable the consecutive-success requirement, so a node marked as unreachable will be marked alive again after the first successful poll (previous behavior). |
chainlink-framework/*multinodenow eventually marks unreachable nodes that sustain error rates above %50 (whereas previously only whenPollFailureThresholdpolls failed in a row);PollSuccessThresholdconfiguration property.unreachableLoopAlivecriteria