
Columns pruned within boundary finalization bug #8583

@barnabasbusa

Description


We seem to be facing a bug where Lighthouse downscores every other Lighthouse peer with a Columns pruned within boundary error message on the minimal preset as soon as the chain finalizes (so around slot 34).

Version

Lighthouse: version: v8.0.1-afa6457

Present Behaviour

Right around slot 34/35, when we start finalizing with the minimal preset, we get the following error messages:

Dec 15 10:34:45.027 DEBUG Sync RPC request error                        id: 4/CustodyBackfill/1/0/0, method: "DataColumnsByRange", error: RpcError(ErrorResponse(ResourceUnavailable, "columns pruned within boundary"))
Dec 15 10:34:45.027 DEBUG Batch download failed                         batch_epoch: 0, error: RpcError(ErrorResponse(ResourceUnavailable, "columns pruned within boundary"))
Dec 15 10:34:45.033 DEBUG RPC Error                                     protocol: data_column_sidecars_by_range, err: RPC response was an error: Resource unavailable with reason: columns pruned within boundary, client: Lighthouse: version: v8.0.1-afa6457, os_version: aarch64-linux, peer_id: 16Uiu2HAm57bpE1Qng6K1Cgv7HMaoyCGbeJko24LYGTqYK1FFCkF8, score: -4.492263661402556, direction: Outgoing
Dec 15 10:34:45.034 DEBUG RPC Error                                     protocol: data_column_sidecars_by_range, err: RPC response was an error: Resource unavailable with reason: columns pruned within boundary, client: Lighthouse: version: v8.0.1-afa6457, os_version: aarch64-linux, peer_id: 16Uiu2HAm57bpE1Qng6K1Cgv7HMaoyCGbeJko24LYGTqYK1FFCkF8, score: -100, direction: Outgoing
Dec 15 10:34:45.034 DEBUG Peer score adjusted                           msg: handle_rpc_error, peer_id: 16Uiu2HAm57bpE1Qng6K1Cgv7HMaoyCGbeJko24LYGTqYK1FFCkF8, score: -100.00

And within 2 slots all LH nodes lose all their peers.
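
For context, here is a minimal sketch of the server-side check that appears to produce this response; the function name and signature are illustrative, not Lighthouse's actual API. Once columns older than the data-availability boundary are pruned at finalization, a DataColumnsByRange request reaching below that boundary gets answered with ResourceUnavailable instead of being served.

// Hypothetical sketch of the serving side (not Lighthouse's actual code).
fn check_data_columns_by_range(
    request_start_epoch: u64,
    // Advances as finalization lets the node prune old column sidecars.
    data_availability_boundary_epoch: u64,
) -> Result<(), &'static str> {
    if request_start_epoch < data_availability_boundary_epoch {
        // The requested columns were already pruned: respond with
        // ResourceUnavailable("columns pruned within boundary").
        return Err("columns pruned within boundary");
    }
    Ok(())
}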

Expected Behaviour

Peers should not be downscored for this response; the current behaviour ends up creating n forked networks.
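
One possible direction, sketched below under the assumption that the downscoring happens in the generic RPC-error path (types and names are illustrative, not Lighthouse's actual API): treat a ResourceUnavailable response to DataColumnsByRange as a benign "peer cannot serve this range" signal and retry elsewhere, rather than applying a score penalty that snowballs into mass disconnects.

#[derive(Debug)]
enum RpcErrorCode {
    ResourceUnavailable,
    InvalidRequest,
    ServerError,
}

#[derive(Debug)]
enum PeerAction {
    NoPenalty,         // leave the peer's score untouched
    MidToleranceError, // apply the usual penalty for faulty responses
}

fn score_data_columns_by_range_error(code: RpcErrorCode) -> PeerAction {
    match code {
        // "columns pruned within boundary" arrives as ResourceUnavailable;
        // the peer answered honestly, so do not downscore it.
        RpcErrorCode::ResourceUnavailable => PeerAction::NoPenalty,
        // Malformed or failing responses still warrant a penalty.
        RpcErrorCode::InvalidRequest | RpcErrorCode::ServerError => {
            PeerAction::MidToleranceError
        }
    }
}

fn main() {
    // The error from the logs above would map to NoPenalty under this scheme.
    println!("{:?}", score_data_columns_by_range_error(RpcErrorCode::ResourceUnavailable));
}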

Steps to resolve

Kurtosis config:

participants_matrix:
  el:
    - el_type: geth
      el_image: ethpandaops/geth:master
    - el_type: erigon
      el_image: ethpandaops/erigon:main
    - el_type: besu
      el_image: ethpandaops/besu:main
    - el_type: nethermind
      el_image: ethpandaops/nethermind:master
    - el_type: reth
      el_image: ethpandaops/reth:main
  cl:
    - cl_type: lighthouse
      cl_image: ethpandaops/lighthouse:unstable

network_params:
  preset: minimal

additional_services:
  - dora
  - spamoor
global_log_level: debug
spamoor_params:
  spammers:
    - scenario: blob-combined
      config:
        throughput: 15
        sidecars: 6
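
For reference, this config can be run against the ethpandaops ethereum-package with something like: kurtosis run github.com/ethpandaops/ethereum-package --args-file network_params.yaml (the file name is illustrative).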

Interestingly, if you add another mnemonic and configure the chain to never finalize, such as:

participants_matrix:
  el:
    - el_type: geth
      el_image: ethpandaops/geth:master
    - el_type: erigon
      el_image: ethpandaops/erigon:main
    - el_type: besu
      el_image: ethpandaops/besu:main
    - el_type: nethermind
      el_image: ethpandaops/nethermind:master
    - el_type: reth
      el_image: ethpandaops/reth:main
    - el_type: ethrex
      el_image: ethpandaops/ethrex:main
  cl:
    - cl_type: lighthouse
      cl_image: ethpandaops/lighthouse:unstable

network_params:
  preset: minimal
  additional_mnemonics:
    - mnemonic: "estate dog switch misery manage room million bleak wrap distance always insane usage busy chicken limit already duck feature unhappy dial emotion expire please"
      count: 600

additional_services:
  - dora
  - spamoor
global_log_level: debug
spamoor_params:
  spammers:
    - scenario: blob-combined
      config:
        throughput: 15
        sidecars: 6

Then the chain goes on and the Lighthouse nodes never unpeer themselves, presumably because column pruning is keyed to finalization, so the boundary never advances and the DataColumnsByRange requests keep succeeding.
