Description
Summary
When committing a large transaction, there's a race condition between the transaction commit process and the transport job that can lead to inconsistent state after crash recovery. Specifically, Buffer pages may be transported to disk while the Inventory's in-memory state has not yet been synced, which produces inaccurate verification results after recovery.
Problem Description
During a large transaction commit:
- A backup of the transaction is taken (standard procedure for all transactions)
- The transaction commits to the Buffer using group sync, which does not immediately flush page contents or the Inventory to disk
- The Buffer content is placed in a MappedByteBuffer even when not immediately synced
- The Inventory content remains only in memory until explicitly synced
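The asymmetry between the last two points can be sketched in Java. This is a minimal illustration, not the actual Buffer or Inventory code: the class `GroupSyncSketch`, its fields, and the `append`, `sync`, and `demo` methods are hypothetical names. Only `FileChannel.map` and `MappedByteBuffer.force` are real JDK calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;

class GroupSyncSketch {

    // Hypothetical stand-in for the Inventory: heap-only until synced.
    final Set<Long> inventory = new HashSet<>();

    // Hypothetical stand-in for one Buffer page backed by a mapped file.
    final MappedByteBuffer page;

    GroupSyncSketch(Path file, int pageSize) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            page = channel.map(FileChannel.MapMode.READ_WRITE, 0, pageSize);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Group-sync style append: nothing is forced to disk here.
    void append(long recordId, byte[] data) {
        page.put(data); // lands in the mapping; the OS may write it back later
        inventory.add(recordId); // exists only on the JVM heap
    }

    // The explicit sync point.
    void sync() {
        page.force(); // asks the OS to write the mapped region to storage
        // A real Inventory would also be written to its own file here.
    }

    // Creates a scratch file and returns a sketch holding one unsynced append.
    static GroupSyncSketch demo() {
        try {
            Path file = Files.createTempFile("buffer-page", ".dat");
            file.toFile().deleteOnExit();
            GroupSyncSketch sketch = new GroupSyncSketch(file, 64);
            sketch.append(42L, new byte[] { 1, 2, 3 });
            return sketch;
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A crash cannot be reproduced in-process, so the sketch only shows where each piece of state lives before `sync()` is called: the page bytes sit in a file-backed mapping, while the Inventory entry has no on-disk counterpart at all.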
The issue occurs in this sequence:
- A transaction causes Buffer expansion with new Pages
- The transport job runs concurrently during transaction commit
- Some Buffer pages are transported to the Database
- A crash occurs before commit completion
- Upon restart, the server attempts to replay the transaction
- The system detects partial commit and tries to finish it
- However, the dirty in-memory Inventory content was never synced to disk
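The sequence above can be replayed deterministically with a small in-memory model. Everything here is a hypothetical simplification: the class `CommitRaceSketch`, its collections, and the rule that recovery skips records it already finds in the Buffer or Database are assumptions made for illustration, not the server's actual recovery logic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CommitRaceSketch {

    // State that survives the simulated crash.
    final List<Long> backup = new ArrayList<>();
    final List<Long> bufferPages = new ArrayList<>();
    final Set<Long> database = new HashSet<>();
    final Set<Long> inventoryOnDisk = new HashSet<>();

    // State that is lost on crash.
    Set<Long> inventoryInMemory = new HashSet<>();

    void takeBackup(List<Long> records) {
        backup.addAll(records);
    }

    // Commit step: the record reaches a Buffer page, the Inventory entry
    // is only recorded in memory.
    void appendToBuffer(long record) {
        bufferPages.add(record);
        inventoryInMemory.add(record);
    }

    // Transport job: moves the oldest Buffer page into the Database.
    void transportOnePage() {
        if (!bufferPages.isEmpty()) {
            database.add(bufferPages.remove(0));
        }
    }

    // Crash: the in-memory Inventory reverts to whatever was last synced.
    void crash() {
        inventoryInMemory = new HashSet<>(inventoryOnDisk);
    }

    // Recovery (assumed behavior): finish the partial commit by applying
    // only the records that are not already in the Buffer or Database.
    void recover() {
        for (long record : backup) {
            boolean alreadyApplied = bufferPages.contains(record)
                    || database.contains(record);
            if (!alreadyApplied) {
                appendToBuffer(record);
            }
        }
    }

    static CommitRaceSketch run() {
        CommitRaceSketch s = new CommitRaceSketch();
        s.takeBackup(List.of(1L, 2L, 3L, 4L));
        s.appendToBuffer(1L);
        s.appendToBuffer(2L);
        s.transportOnePage(); // record 1 moves to the Database mid-commit
        s.appendToBuffer(3L);
        s.crash(); // before record 4 is applied and before any sync
        s.recover();
        return s;
    }
}
```

After `run()`, records 1 through 4 are all stored (record 1 in the Database, the rest in the Buffer), but the Inventory only knows about record 4, the one that recovery itself applied.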
This creates a scenario where verifications return incorrect results because records that should be in the Inventory don't appear there. This also explains the influx of unoffset writes we're seeing in iXL production: because verifyFast (which leverages the Inventory) is inaccurate, it permits unoffset writes to be inserted after the transaction commits.
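A rough sketch of how a stale Inventory could let such a write through is shown below. It assumes that an "unoffset write" is a second ADD of data whose earlier ADD was never removed, and that `verifyFast` treats a missing Inventory entry as proof of absence. Both are assumptions, and `VerifyFastSketch` with all of its members is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class VerifyFastSketch {

    // Durable record of ADD writes (removes are omitted for brevity).
    final List<String> storedAdds = new ArrayList<>();

    // In-memory Inventory used as a fast negative check.
    final Set<String> inventory = new HashSet<>();

    // Assumed shape of the fast path: a missing Inventory entry is
    // treated as proof that the data was never written.
    boolean verifyFast(String data) {
        return inventory.contains(data) && storedAdds.contains(data);
    }

    // An ADD is accepted only when the data is not already present.
    boolean add(String data) {
        if (verifyFast(data)) {
            return false;
        }
        storedAdds.add(data);
        inventory.add(data);
        return true;
    }

    // Simulates losing the unsynced Inventory in a crash.
    void loseInventory() {
        inventory.clear();
    }

    long copiesOf(String data) {
        return storedAdds.stream().filter(data::equals).count();
    }
}
```

In this model, calling `add("x")` twice rejects the second call, but calling `loseInventory()` in between lets the second ADD through, leaving two ADDs for the same data with no remove between them.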