Skip to content

htlcswitch: add htlc state machine fuzz tests#10773

Open
NishantBansal2003 wants to merge 1 commit intolightningnetwork:masterfrom
NishantBansal2003:fuzz-htlcswitch
Open

htlcswitch: add htlc state machine fuzz tests#10773
NishantBansal2003 wants to merge 1 commit intolightningnetwork:masterfrom
NishantBansal2003:fuzz-htlcswitch

Conversation

@NishantBansal2003
Copy link
Copy Markdown
Contributor

This commit adds fuzz tests for the normal channel operations state machine. The fuzzer provides inputs, and based on those inputs, we simulate various random HTLC switch states. These include sending add_htlc, commit_sig, update_fee, revoke_ack, fulfill_htlc, fail_htlc, malformed_htlc, updating the block height, and restarting the node. Both valid and malformed messages, along with invalid HTLC ordering, are used to help uncover any unexpected behavior caused by malicious nodes.

In these fuzz tests, there are local and remote channel links. The local node is honest, while the remote node can behave either normally or as an attacker and may act arbitrarily. The main property we are testing is that no state should cause a panic in the victim node. Any link failures on the victim node are expected (we do not check link failures on the attacker node, since arbitrary messages sent to the victim may result in unexpected responses being sent back). Additionally, after each state transition, channel invariants are verified, such as total balance consistency, no loss of funds, and ensuring that the commitment height delta between peers is at most 1.

A lot of thorough review has been done in NishantBansal2003#7, and I think it is now a good time to open this upstream officially. After running it for a long time, I haven’t found any major bugs in the htlcswitch codebase. Also, this PR depends on #10622, so for now I have included the changes from that PR, but will rebase once it is merged.

Since these fuzz tests cover extensive state combinations, I believe it would be beneficial to run them continuously. This would serve as a regression check for the existing code and could help uncover bugs introduced by future changes.

cc: @morehouse

Signed-off-by: Nishant Bansal <nishant.bansal.282003@gmail.com>
@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust fuzz testing framework for the HTLC switch state machine in LND. By simulating various network conditions, message sequences, and node lifecycle events, the new tests aim to ensure the stability and correctness of the HTLC state machine under adversarial or malformed input scenarios. Additionally, minor improvements and safety fixes were applied to the channel database and test utilities to support these new testing capabilities.

Highlights

  • HTLC State Machine Fuzzing: Introduced a comprehensive fuzz testing harness for the HTLC switch state machine, allowing for the simulation of complex channel operations, message mutations, and node restarts to uncover potential panics or invariant violations.
  • Test Infrastructure Improvements: Enhanced existing test utilities to support more granular channel configuration, including custom dust limits, fee rates, and HTLC limits, and added resource cleanup to prevent leaks during test execution.
  • Bug Fix in Channel DB: Updated the channel database logic to safely handle cases where unsigned acked updates might be missing, preventing potential nil pointer dereferences during commitment chain advancement.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions github-actions Bot added the severity-critical Requires expert review - security/consensus critical label Apr 27, 2026
@github-actions
Copy link
Copy Markdown

🔴 PR Severity: CRITICAL

First classification | 3 files | 126 lines changed

🔴 Critical (3 files)
  • channeldb/channel.go - channel state persistence (channeldb/*)
  • htlcswitch/mock.go - HTLC switch mock (htlcswitch/*)
  • htlcswitch/test_utils.go - HTLC switch test utilities (htlcswitch/*)
🟢 Low / Excluded (2 files — test files, not counted)
  • htlcswitch/fuzz_test.go - fuzz test (*_test.go, excluded)
  • htlcswitch/link_test.go - link test (*_test.go, excluded)

Analysis

This PR touches two CRITICAL packages: channeldb and htlcswitch. Both are core to Lightning Network safety — channeldb handles channel state persistence and htlcswitch is responsible for HTLC forwarding and the payment routing state machine.

The bulk of the line additions (1,632) come from htlcswitch/fuzz_test.go, a new fuzz test which is excluded from severity counting. The non-test production-relevant changes are modest (~126 lines across 3 files), but their location in critical packages warrants expert review.

No severity bump was triggered: non-test file count (3) is well under 20, and non-test lines changed (~126) are well under 500.


To override, add a severity-override-{critical,high,medium,low} label.
<!-- pr-severity-bot -->

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a comprehensive fuzz testing framework for the HTLC state machine, including a new fuzz_test.go file and updates to related mocks and test utilities. It also refactors AdvanceCommitChainTail in channeldb to handle missing unsigned updates more gracefully. The review feedback highlights that the fuzzer should handle signature parsing errors by skipping them instead of using require.NoError, which causes the test to fail prematurely. Additionally, a function comment requires a minor grammatical fix to adhere to the repository's style guide.

Comment thread htlcswitch/fuzz_test.go
Comment on lines +993 to +995
sig, err := lnwire.NewSigFromWireECDSA(b)
require.NoError(fn.t, err)
out.CommitSig = sig
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using require.NoError here will cause the fuzzer to stop and report a failure whenever the random bytes provided by the fuzzer do not form a valid signature (e.g., due to scalar overflow). Since the fuzzer is intended to explore the state space with random mutations, it should handle such parsing errors gracefully by skipping the mutation instead of failing the test.

				sig, err := lnwire.NewSigFromWireECDSA(b)
				if err != nil {
					return
				}
				out.CommitSig = sig

Comment thread htlcswitch/fuzz_test.go
Comment on lines +1024 to +1026
sig, err := lnwire.NewSigFromWireECDSA(fn.getBytes(64))
require.NoError(fn.t, err)
mutatedHtlcSigs = append(mutatedHtlcSigs, sig)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Similar to the CommitSig mutation, using require.NoError when parsing random bytes into an HTLC signature will cause the fuzzer to crash on validly generated random data that happens to be an invalid scalar. This should be handled by skipping the addition of the signature if parsing fails.

						sig, err := lnwire.NewSigFromWireECDSA(fn.getBytes(64))
						if err != nil {
							continue
						}
						mutatedHtlcSigs = append(mutatedHtlcSigs, sig)

Comment thread htlcswitch/fuzz_test.go
}
}

// maybeReorderMessages conditionally reorder outbound messages from the remote
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The function comment for maybeReorderMessages contains a grammatical error. It should use the third-person singular form "reorders" to be a complete sentence, as required by the style guide.

Suggested change
// maybeReorderMessages conditionally reorder outbound messages from the remote
// maybeReorderMessages conditionally reorders outbound messages from the remote
References
  1. Function comments should be complete sentences and begin with the function name. (link)

Copy link
Copy Markdown
Contributor

@MPins MPins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @NishantBansal2003, as a first step in the review I ran the fuzz test on a dedicated machine (24 cores, 64GB RAM), and as you can see below, the fuzzing process eventually hung.

For context, I’m also working on a similar problem (link), and I ran into the same issue. I was able to resolve it by mocking signature operations (sign/verify), key derivation, and running the database in memory. This increased the execution rate from a few executions per second (similar to what you’re seeing) to hundreds per second.

go test -v -run=^$ -fuzz=FuzzHTLCStates -parallel=16

fuzz: elapsed: 41m6s, execs: 74208 (3/sec), new interesting: 541 (total: 544)
fuzz: elapsed: 41m9s, execs: 74211 (1/sec), new interesting: 542 (total: 545)
fuzz: elapsed: 41m12s, execs: 74214 (1/sec), new interesting: 544 (total: 547)
fuzz: elapsed: 41m15s, execs: 74220 (2/sec), new interesting: 544 (total: 547)
fuzz: elapsed: 41m18s, execs: 74227 (2/sec), new interesting: 544 (total: 547)
fuzz: elapsed: 41m21s, execs: 74235 (3/sec), new interesting: 544 (total: 547)
fuzz: elapsed: 41m24s, execs: 74240 (2/sec), new interesting: 544 (total: 547)
fuzz: elapsed: 41m27s, execs: 74248 (3/sec), new interesting: 546 (total: 549)
fuzz: elapsed: 41m30s, execs: 74254 (2/sec), new interesting: 548 (total: 551)
fuzz: elapsed: 41m33s, execs: 74262 (3/sec), new interesting: 549 (total: 552)
fuzz: elapsed: 41m36s, execs: 74271 (3/sec), new interesting: 550 (total: 553)
fuzz: elapsed: 41m39s, execs: 74292 (7/sec), new interesting: 550 (total: 553)
fuzz: elapsed: 41m40s, execs: 74292 (0/sec), new interesting: 550 (total: 553)
--- FAIL: FuzzHTLCStates (2500.07s)
    fuzzing process hung or terminated unexpectedly while minimizing: EOF
    Failing input written to testdata/fuzz/FuzzHTLCStates/72e690e1fb2396a9
    To re-run:
    go test -run=FuzzHTLCStates/72e690e1fb2396a9
=== NAME  
FAIL
exit status 1
FAIL    github.com/lightningnetwork/lnd/htlcswitch      2500.205s
root@m24314:~/lnd/htlcswitch# go test -run=FuzzHTLCStates/72e690e1fb2396a9
PASS
ok      github.com/lightningnetwork/lnd/htlcswitch      0.447s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

severity-critical Requires expert review - security/consensus critical

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants