fix: SIGSEGV due to out of bounds access after delete + checkpoint by adsharma · Pull Request #612 · LadybugDB/ladybug

adsharma · 2026-06-23T04:57:27Z

Fixes: #610

Root cause

After relationship deletes followed by a checkpoint, an in-memory
CSR index entry (csrIndex->indices[nodeOffset]) can retain a stale
INVALID_ROW_IDX (UINT64_MAX) sentinel row. Passing that through
getQuotientRemainder(row, CHUNKED_NODE_GROUP_CAPACITY) produces
an out-of-range chunk-group index, which truncates to UINT32_MAX
in getGroup(idx_t) — an out-of-bounds std::vector access guarded
only by a debug DASSERT (a no-op in release), so callers dereference
a null/OOB group.
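
To make the truncation concrete, here is a minimal standalone sketch of the arithmetic. It assumes CHUNKED_NODE_GROUP_CAPACITY is 2048 and that idx_t is a 32-bit unsigned integer; both are assumptions for illustration. quotientRemainder and toGroupIdx are hypothetical stand-ins, not the real helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <utility>

namespace sketch {

// Assumed definitions for illustration; the real ones live in the codebase.
using row_idx_t = uint64_t;
using idx_t = uint32_t; // assumption: group indices are 32-bit
constexpr row_idx_t INVALID_ROW_IDX = std::numeric_limits<uint64_t>::max();
constexpr uint64_t CHUNKED_NODE_GROUP_CAPACITY = 2048; // assumption

// Hypothetical stand-in for getQuotientRemainder: {chunk index, row in chunk}.
constexpr std::pair<uint64_t, uint64_t> quotientRemainder(uint64_t value, uint64_t divisor) {
    return {value / divisor, value % divisor};
}

// Models the implicit narrowing when a 64-bit chunk index is passed
// to a function that takes a 32-bit index.
constexpr idx_t toGroupIdx(uint64_t chunkIdx) {
    return static_cast<idx_t>(chunkIdx);
}

// The sentinel yields a chunk index far beyond any real group count...
static_assert(quotientRemainder(INVALID_ROW_IDX, CHUNKED_NODE_GROUP_CAPACITY).first ==
              (uint64_t{1} << 53) - 1);
// ...and narrowing keeps only the low 32 bits, which are all ones.
static_assert(toGroupIdx(quotientRemainder(INVALID_ROW_IDX, CHUNKED_NODE_GROUP_CAPACITY).first) ==
              std::numeric_limits<uint32_t>::max());

} // namespace sketch
```

Under these assumptions the quotient is 2^53 - 1 and the narrowed value is UINT32_MAX, which matches the index described above.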

The sentinel survives because checkpointInMemAndOnDisk calls
collectLeafRegionsAndCSRLength (which runs
collectInMemRegionChangesAndUpdateHeaderLength on all leaf regions,
writing INVALID_ROW_IDX via setInvalid(i)), but then takes an
early-return path (line 538, regionsToCheckpoint.empty()) without
calling finalizeCheckpoint — so the index is left with stale
sentinels and the chunked groups are not cleared.
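
The control flow described above can be modelled with a toy example. This is not the real checkpoint code: ToyNodeGroup, the simplified checkpoint signature, and the assumption that finalizeCheckpoint drops the in-memory index and chunked groups are all hypothetical, chosen only to show how an early return can leave sentinels behind.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

namespace sketch {

constexpr uint64_t INVALID_ROW_IDX = std::numeric_limits<uint64_t>::max();

// Toy stand-in for the in-memory part of a CSR node group.
struct ToyNodeGroup {
    std::vector<uint64_t> indexRows; // stand-in for the CSR index rows
    std::vector<int> chunkedGroups;  // stand-in for the in-memory chunked groups
};

// Assumed effect of finalizing a checkpoint: in-memory state is dropped.
inline void finalizeCheckpoint(ToyNodeGroup& group) {
    group.indexRows.clear();
    group.chunkedGroups.clear();
}

// Toy checkpoint: the collect step marks deleted slots invalid, then the
// function returns early when there are no regions to write.
inline void checkpoint(ToyNodeGroup& group, const std::vector<std::size_t>& deletedSlots,
                       const std::vector<int>& regionsToCheckpoint) {
    for (std::size_t slot : deletedSlots) {
        group.indexRows[slot] = INVALID_ROW_IDX; // models setInvalid(i)
    }
    if (regionsToCheckpoint.empty()) {
        return; // early return: finalizeCheckpoint is never reached
    }
    finalizeCheckpoint(group);
}

// True if any index slot still holds the sentinel.
inline bool hasStaleSentinel(const ToyNodeGroup& group) {
    for (uint64_t row : group.indexRows) {
        if (row == INVALID_ROW_IDX) {
            return true;
        }
    }
    return false;
}

} // namespace sketch
```

In this model, a checkpoint with one deleted slot and no regions to write leaves the sentinel in indexRows and the chunked groups untouched, which is the state later reads have to tolerate.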

Fix

src/storage/table/csr_node_group.cpp: bounds-check chunkIdx against
chunkedGroups.getNumGroups(lock) before each previously unguarded
getGroup call, treating stale rows as skipped or deleted (consistent
with the two already-guarded sites at lines 851 and 1060). The 6 fixed
sites:

1. scanCommittedInMemSequential → return empty result
2. scanCommittedInMemRandom → skip row
3. update (COMMITTED_IN_MEMORY) → no-op
4. delete_ (COMMITTED_IN_MEMORY) → no-op (return false)
5. collectInMemRegionChangesAndUpdateHeaderLength → treat as deleted, set invalid
6. populateCSRLengthInMemOnly → treat as deleted, decrement length
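
The guard applied at each site can be sketched roughly as follows. This is a simplified standalone model, not the code in csr_node_group.cpp: GroupCollection, findGroupForRow, countLiveRows, and the capacity of 2048 are hypothetical, and the lock argument to getNumGroups is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

namespace sketch {

constexpr uint64_t CAPACITY = 2048; // assumed rows per chunked group

struct ChunkedGroup {
    uint64_t numRows;
};

// Toy collection whose getGroup is unchecked, like a release build
// where the debug assertion compiles away.
class GroupCollection {
public:
    void append(uint64_t numRows) {
        groups.push_back(std::make_unique<ChunkedGroup>(ChunkedGroup{numRows}));
    }
    uint32_t getNumGroups() const { return static_cast<uint32_t>(groups.size()); }
    ChunkedGroup* getGroup(uint32_t idx) const { return groups[idx].get(); }

private:
    std::vector<std::unique_ptr<ChunkedGroup>> groups;
};

// Returns nullptr for rows whose chunk index is out of range (for example a
// stale sentinel) instead of indexing past the end of the vector.
inline ChunkedGroup* findGroupForRow(const GroupCollection& groups, uint64_t row) {
    const uint64_t chunkIdx = row / CAPACITY;
    if (chunkIdx >= groups.getNumGroups()) {
        return nullptr; // stale row: caller treats it as skipped or deleted
    }
    return groups.getGroup(static_cast<uint32_t>(chunkIdx));
}

// Example caller: counts rows that resolve to a group, skipping stale ones.
inline uint64_t countLiveRows(const GroupCollection& groups, const std::vector<uint64_t>& rows) {
    uint64_t live = 0;
    for (uint64_t row : rows) {
        if (findGroupForRow(groups, row) != nullptr) {
            ++live;
        }
    }
    return live;
}

} // namespace sketch
```

The comparison is done on the 64-bit chunk index before any narrowing, so a sentinel row is rejected whether or not the index would later truncate.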

adsharma force-pushed the delete_checkpoint_fix branch from 7900c57 to 5cb87ff on June 23, 2026 04:58

adsharma wants to merge 1 commit into main from delete_checkpoint_fix