chore: improve nullify in registry#2345
Conversation
📝 Walkthrough

This PR implements a v2 nullify instruction format with proof node pairing optimization. It adds maximum cardinality matching for batching state nullify instructions and introduces a new `nullify_2` registry instruction.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Forester as Forester V1<br/>Processor
    participant Builder as Transaction<br/>Builder
    participant Matcher as Pairing<br/>Matcher
    participant Registry as Registry<br/>Program
    Forester->>Builder: fetch_proofs_and_create_instructions()<br/>(prepared instructions)
    Builder->>Builder: count state_nullify instances
    Builder->>Builder: should_attempt_pairing()<br/>(check block height, RPC limits)
    alt Pairing Allowed
        Builder->>Matcher: pair_state_nullify_batches()<br/>(maximum cardinality matching)
        Matcher->>Matcher: build pairing graph<br/>(transaction size feasibility)
        Matcher->>Matcher: compute max matching
        Matcher->>Builder: paired + unpaired batches
        Builder->>Builder: build_instruction_batches()<br/>(AddressUpdate + paired StateNullify)
    else Pairing Disabled
        Builder->>Builder: build_instruction_batches()<br/>(flat instruction sequence)
    end
    Builder->>Registry: create transactions<br/>(nullify_2 instructions)
    Registry->>Registry: validate_nullify_2_inputs()
    Registry->>Registry: extract_proof_nodes_from_remaining_accounts()
    Registry->>Registry: process_nullify()
```
Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The heterogeneous nature of changes across processor logic, transaction batching, program-side validation, and instruction factory patterns, combined with the introduction of graph-based matching algorithms and wrapper type refactoring, requires careful reasoning about integration points, proof handling correctness, and matching algorithm soundness.
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Mechanism: Each nullify has 16 proof accounts (pubkeys).
Algorithm: Build a graph of all state nullifies.
Guards: Pairing only if batch_size >= 2.
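To make the mechanism above concrete, here is a minimal sketch of the pairing step: pick pairs of state-nullify instructions so that instructions sharing the most proof accounts land in the same transaction. The PR computes a true maximum cardinality matching; this greedy maximal matching is only an approximation, and all names (`pair_greedy`, the toy `key` generator) are illustrative, not the PR's API.

```rust
use std::collections::HashSet;

type Pubkey = [u8; 32];

fn key(n: u8) -> Pubkey {
    [n; 32] // toy key generator for the example
}

fn shared_proof_accounts(a: &HashSet<Pubkey>, b: &HashSet<Pubkey>) -> usize {
    a.intersection(b).count()
}

/// Greedily pair instruction indices, preferring edges with the most shared
/// proof accounts; instructions left unmatched are returned as singles.
fn pair_greedy(proofs: &[HashSet<Pubkey>]) -> (Vec<(usize, usize)>, Vec<usize>) {
    let n = proofs.len();
    let mut edges = Vec::new(); // n*(n-1)/2 candidate pairs
    for i in 0..n {
        for j in (i + 1)..n {
            edges.push((shared_proof_accounts(&proofs[i], &proofs[j]), i, j));
        }
    }
    edges.sort_by(|a, b| b.0.cmp(&a.0)); // largest overlap first
    let mut used = vec![false; n];
    let mut pairs = Vec::new();
    for (_, i, j) in edges {
        if !used[i] && !used[j] {
            used[i] = true;
            used[j] = true;
            pairs.push((i, j));
        }
    }
    let singles = (0..n).filter(|&i| !used[i]).collect();
    (pairs, singles)
}

fn main() {
    let proofs = vec![
        [key(1), key(2)].into_iter().collect::<HashSet<_>>(),
        [key(2), key(3)].into_iter().collect(),
        [key(9)].into_iter().collect(),
    ];
    let (pairs, singles) = pair_greedy(&proofs);
    assert_eq!(pairs, vec![(0, 1)]);
    assert_eq!(singles, vec![2]);
}
```

A greedy pass can miss the optimum (matching is why the PR builds a full pairing graph first), but it shows why the guard `batch_size >= 2` exists: with fewer than two instructions there is no edge to consider.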
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
forester/src/processor/v1/helpers.rs (1)
202-205: ⚠️ Potential issue | 🟡 Minor

Retry count in failure text is off by one.

The loop executes MAX_RETRIES + 1 attempts (0..=MAX), but the emitted error strings report only MAX_RETRIES.

🧩 Suggested fix

```diff
-        "Failed to get address proofs for batch {} after {} retries: {}",
+        "Failed to get address proofs for batch {} after {} attempts: {}",
         batch_idx,
-        ADDRESS_PROOF_MAX_RETRIES,
+        ADDRESS_PROOF_MAX_RETRIES + 1,
         e
     ));
 ...
-        "Failed to get state proofs after {} retries: {}",
-        ADDRESS_PROOF_MAX_RETRIES,
+        "Failed to get state proofs after {} attempts: {}",
+        ADDRESS_PROOF_MAX_RETRIES + 1,
         e
     ));
```
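The off-by-one can be seen in isolation: an inclusive range `0..=MAX_RETRIES` performs MAX_RETRIES + 1 attempts, so failure messages should report attempts, not retries. The constant value below is illustrative, not the crate's actual value.

```rust
const MAX_RETRIES: u32 = 3;

fn attempt_count() -> u32 {
    let mut attempts = 0;
    for _try in 0..=MAX_RETRIES {
        attempts += 1; // each iteration is one full attempt
    }
    attempts
}

fn failure_message(err: &str) -> String {
    // Report attempts (MAX_RETRIES + 1), matching what the loop actually did.
    format!("Failed to get proofs after {} attempts: {}", MAX_RETRIES + 1, err)
}

fn main() {
    assert_eq!(attempt_count(), MAX_RETRIES + 1);
    println!("{}", failure_message("timeout"));
}
```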
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@forester/src/processor/v1/helpers.rs` around lines 202-205: the failure log messages that print the retry count use ADDRESS_PROOF_MAX_RETRIES, but the loop does 0..=ADDRESS_PROOF_MAX_RETRIES (i.e., MAX+1 attempts), so update the logged count to reflect the actual number of attempts (e.g., use ADDRESS_PROOF_MAX_RETRIES + 1 or log an explicit attempts variable). Locate the log call that contains the string "Failed to get address proofs for batch {} after {} retries: {}" and replace the second argument with ADDRESS_PROOF_MAX_RETRIES + 1 (or the attempts counter), and make the same change for the other occurrence around lines 332-334 so both messages report the correct attempt count.

forester/src/processor/v1/tx_builder.rs (1)
191-193: ⚠️ Potential issue | 🟠 Major

Avoid swallowing generic "not found" errors.

contains("not found") is too broad and can mask unrelated upstream failures. Restrict this path to explicit missing-proof/indexer signals only.

🔧 Suggested fix

```diff
-        return if err_str.to_lowercase().contains("record not found")
-            || err_str.to_lowercase().contains("not found")
+        let lower = err_str.to_lowercase();
+        return if lower.contains("record not found")
         {
             warn!("Record not found in indexer, skipping batch: {}", e);
             // Return empty transactions but don't propagate the error
             Ok((vec![], last_valid_block_height))
         } else {
```
Verify each finding against the current code and only fix it if needed. In `@forester/src/processor/v1/tx_builder.rs` around lines 191 - 193, The current check uses err_str.contains("not found"), which is too broad; update the error handling in tx_builder (the block using err_str) to match only explicit missing-proof/indexer signals instead of generic substrings: remove the generic contains("not found") clause and either match on concrete error variants/enums returned by your indexer/proof APIs (e.g., ProofNotFound, IndexerError::NotFound) or check for specific, exact error strings like "record not found" or the exact message your proof/indexer layers emit; ensure you use pattern matching on the error type where possible (or exact equals on err_str) so unrelated upstream "not found" messages aren’t swallowed.
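The variant-matching approach the prompt recommends looks roughly like this. `IndexerError` is a hypothetical stand-in for the real indexer error type; the point is that only the explicit missing-record variant is skippable, while an unrelated error whose text happens to contain "not found" still propagates.

```rust
#[derive(Debug)]
enum IndexerError {
    RecordNotFound(String),
    Rpc(String),
}

/// True only for the explicit missing-record signal; all other errors,
/// even ones whose message contains "not found", should propagate.
fn is_skippable(err: &IndexerError) -> bool {
    matches!(err, IndexerError::RecordNotFound(_))
}

fn main() {
    assert!(is_skippable(&IndexerError::RecordNotFound("leaf 7".into())));
    assert!(!is_skippable(&IndexerError::Rpc("endpoint not found".into())));
}
```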
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3d8517da-43e7-4704-8b60-21dd1a233c65
⛔ Files ignored due to path filters (2)

- Cargo.lock is excluded by !**/*.lock and included by none
- program-tests/registry-test/tests/nullify_2_regression.rs is excluded by none and included by none
📒 Files selected for processing (8)

- forester/Cargo.toml
- forester/docs/v1_forester_flows.md
- forester/src/processor/v1/helpers.rs
- forester/src/processor/v1/tx_builder.rs
- programs/registry/src/account_compression_cpi/nullify.rs
- programs/registry/src/account_compression_cpi/sdk.rs
- programs/registry/src/errors.rs
- programs/registry/src/lib.rs
Diff context in forester/docs/v1_forester_flows.md (line 5): the heading `## 1. Transaction Send Flow (Blockhash)` is followed by a bare `` ``` `` fence.
Add language identifiers to fenced code blocks.
These blocks trigger markdownlint MD040.
🛠 Suggested patch

````diff
-```
+```text
````

Also applies to: 59-59, 87-87, 131-131
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 5-5: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@forester/docs/v1_forester_flows.md` at line 5, Several fenced code blocks in
v1_forester_flows.md are missing language identifiers (plain triple-backtick
blocks at the shown diffs and also at the other locations noted), which triggers
markdownlint MD040; update each triple-backtick block so it starts with a
language tag (e.g., ```text or ```yaml or ```json as appropriate) instead of
bare ```, keeping the existing block content unchanged, specifically adjust the
opening fences for the blocks shown in the diff and the additional occurrences
around the other noted locations (lines referenced in the review) so every code
fence includes a language identifier.
Diff context:

```text
LEGACY (proof in ix data)          v2 (proof in remaining_accounts)
───────────────────────            ────────────────────────────────────

create_nullify_instruction()       create_nullify_with_proof_accounts_instruction()
```
Use the actual v2 SDK function name in this flow.
This line references create_nullify_with_proof_accounts_instruction(), but the implementation path uses create_nullify_2_instruction (see forester/src/processor/v1/helpers.rs, Line 391).
As per coding guidelines "Cross-reference any mentioned function names, parameters, return types, and error conditions with the source code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@forester/docs/v1_forester_flows.md` at line 67, The docs reference the wrong
SDK function name: replace create_nullify_with_proof_accounts_instruction() with
the actual v2 function name create_nullify_2_instruction in the flow and ensure
any mentioned parameters/returns match the signature of
create_nullify_2_instruction; cross-check the helper implementation
(create_nullify_2_instruction) and update the documentation line so it
accurately reflects the real function name and its parameter/return names.
Diff context:

```text
│ - pair_candidates = n*(n-1)/2 <= 2000 (MAX_PAIR_CANDIDATES)
│ - state_nullify_count <= 96 (MAX_PAIRING_INSTRUCTIONS)
```
Pairing limits in docs are out of sync with code.
The docs state <= 2000 candidates and <= 96 instructions, while forester/src/processor/v1/tx_builder.rs currently enforces MAX_PAIR_CANDIDATES = 4_950 and MAX_PAIRING_INSTRUCTIONS = 100.
As per coding guidelines "Ensure documentation describes the actual behavior, not outdated or planned behavior".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@forester/docs/v1_forester_flows.md` around lines 105 - 106, The docs values
for pair_candidates and state_nullify_count are stale; update the documentation
to match the actual enforced constants in code by replacing the documented
limits with MAX_PAIR_CANDIDATES = 4_950 and MAX_PAIRING_INSTRUCTIONS = 100 (the
constants named MAX_PAIR_CANDIDATES and MAX_PAIRING_INSTRUCTIONS in
tx_builder.rs), or alternatively change those constants if you intend the docs
to be the source of truth—ensure the numbers in the docs and the values of the
constants stay identical.
```rust
paired_batches.sort_by_key(|(leaf, _)| *leaf);
single_batches.sort_by_key(|(leaf, _)| *leaf);
paired_batches.extend(single_batches);
Ok(paired_batches.into_iter().map(|(_, batch)| batch).collect())
```
Final batch order is not globally leaf-sorted.
You sort paired and single batches separately, then append singles; this can break deterministic global order by leaf index.
🔁 Suggested fix

```diff
-    paired_batches.sort_by_key(|(leaf, _)| *leaf);
-    single_batches.sort_by_key(|(leaf, _)| *leaf);
-    paired_batches.extend(single_batches);
-    Ok(paired_batches.into_iter().map(|(_, batch)| batch).collect())
+    let mut all_batches = Vec::with_capacity(paired_batches.len() + single_batches.len());
+    all_batches.extend(paired_batches);
+    all_batches.extend(single_batches);
+    all_batches.sort_by_key(|(leaf, _)| *leaf);
+    Ok(all_batches.into_iter().map(|(_, batch)| batch).collect())
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@forester/src/processor/v1/tx_builder.rs` around lines 385 - 388, The current
logic sorts paired_batches and single_batches separately then extends
paired_batches with single_batches, which can produce a non-globally-sorted
result; change the code so that after combining the two lists (e.g. after
paired_batches.extend(single_batches)) you perform a single global sort_by_key
on paired_batches by the leaf key (the same key used in
paired_batches.sort_by_key and single_batches.sort_by_key), then map/collect the
batches as before (the paired_batches.into_iter().map(|(_, batch)|
batch).collect()) to guarantee deterministic global leaf ordering.
```rust
pub fn nullify_2<'info>(
    ctx: Context<'_, '_, '_, 'info, NullifyLeaves<'info>>,
    bump: u8,
    change_log_indices: Vec<u64>,
```
Suggested change:

```diff
-    change_log_indices: Vec<u64>,
+    first_change_log_index: u16,
```
The second one should be (first_change_log_index + 1) % changelog_length.
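The suggestion above can be sketched as a one-line derivation: send only the first changelog index and derive the second modulo the changelog length, so the pair never needs to be serialized. `changelog_length` here is illustrative.

```rust
/// Derive the second changelog index from the first, wrapping at the
/// end of the circular changelog buffer.
fn second_changelog_index(first: u16, changelog_length: u16) -> u16 {
    (first + 1) % changelog_length
}

fn main() {
    assert_eq!(second_changelog_index(3, 1000), 4);
    assert_eq!(second_changelog_index(999, 1000), 0); // wraps at the end
}
```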
```rust
    ctx: Context<'_, '_, '_, 'info, NullifyLeaves<'info>>,
    bump: u8,
    change_log_indices: Vec<u64>,
    leaves_queue_indices: Vec<u16>,
```

Suggested change:

```diff
-    leaves_queue_indices: Vec<u16>,
+    leaves_queue_indices: [u16; 2],
```
```rust
    bump: u8,
    change_log_indices: Vec<u64>,
    leaves_queue_indices: Vec<u16>,
    indices: Vec<u64>,
```

Suggested change:

```diff
-    indices: Vec<u64>,
+    indices: [u64; 2],
```
What index?
If it is leaf index we can do u32.
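The size motivation behind the reviewers' `[u16; 2]` / `[u64; 2]` suggestions: with Borsh-style encoding a Vec carries a 4-byte length prefix and an unbounded element count, while a fixed-size array encodes exactly its elements. This sketch hand-encodes both layouts to show the difference; the real program would use a Borsh derive rather than these hypothetical helpers.

```rust
fn encode_vec_u16(v: &[u16]) -> Vec<u8> {
    let mut out = (v.len() as u32).to_le_bytes().to_vec(); // 4-byte length prefix
    for x in v {
        out.extend_from_slice(&x.to_le_bytes());
    }
    out
}

fn encode_array_u16(v: &[u16; 2]) -> Vec<u8> {
    let mut out = Vec::new(); // no length prefix: size is fixed by the type
    for x in v {
        out.extend_from_slice(&x.to_le_bytes());
    }
    out
}

fn main() {
    assert_eq!(encode_vec_u16(&[1, 2]).len(), 8); // 4-byte prefix + 2 * 2 bytes
    assert_eq!(encode_array_u16(&[1, 2]).len(), 4); // just 2 * 2 bytes
}
```

The same reasoning scales to `indices`: `Vec<u64>` costs 4 + 16 bytes for two entries versus 16 for `[u64; 2]`, and a u32 leaf index would halve that again.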
New registry program instruction nullify_2: if a client sends 2 instructions in one txn, Solana will deduplicate any overlapping proof nodes between those two instructions, reducing overall txn size.

The PR also adds pairing logic inside the forester client for v1 state nullification.
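A back-of-envelope model of the dedup claim above: a Solana transaction message stores each account key once, so two nullify_2 instructions that share proof-node accounts pay for the union of their account sets, not the sum. Keys and counts below are illustrative, not taken from the program.

```rust
use std::collections::HashSet;

/// Number of distinct account keys the transaction message must carry
/// for the given per-instruction account lists (keys modeled as u32).
fn message_key_count(instr_accounts: &[Vec<u32>]) -> usize {
    let unique: HashSet<u32> = instr_accounts.iter().flatten().copied().collect();
    unique.len()
}

fn main() {
    // Two instructions with 16 proof accounts each, 10 of them shared.
    let ix_a: Vec<u32> = (0..16).collect();
    let ix_b: Vec<u32> = (6..22).collect(); // overlaps ix_a on 6..16
    assert_eq!(message_key_count(&[ix_a, ix_b]), 22); // 32 references, 22 keys
}
```

This is exactly the quantity the pairing matcher tries to minimize when it checks transaction size feasibility for a candidate pair.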