@satya-bodapati

https://perconadev.atlassian.net/browse/PS-9683

Problem:

In some customer environments, an external LOB's first page was found to be shared between two records. This should not be possible, but it does happen in rare cases. The root cause is not known yet. Using a table in this state can lead to corruption and assertion failures.

Fix:

We can detect this scenario by scanning all records and the first page of each external LOB. The EXTENDED keyword of CHECK TABLE is currently ignored by InnoDB.

We use it to enable the LOB checks and mark the index as corrupted if an external LOB's first page is shared between two records.

A thread-local blob map is used to identify the earlier user record that references the same external LOB first page.
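
The duplicate detection described above can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the type aliases `page_no_t`, `heap_no_t`, `RecLocation`, and `BlobMap`, and the helper `note_lob_first_page`, are hypothetical names chosen for this example. It assumes C++17.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <utility>

// Hypothetical types for illustration; not the InnoDB definitions.
using page_no_t = std::uint32_t;
using heap_no_t = std::uint16_t;

// Location of a clustered index record: (leaf page number, heap number).
using RecLocation = std::pair<page_no_t, heap_no_t>;

// Maps the first page of an external LOB to the first record seen with it.
using BlobMap = std::map<page_no_t, RecLocation>;

// Registers `lob_first_page` as referenced by the record at `rec`.
// Returns the earlier record's location if the page was already claimed
// (the "shared first page" case), or std::nullopt if this is the first use.
std::optional<RecLocation> note_lob_first_page(BlobMap &seen,
                                               page_no_t lob_first_page,
                                               RecLocation rec) {
  auto [it, inserted] = seen.emplace(lob_first_page, rec);
  if (inserted) {
    return std::nullopt;
  }
  return it->second;
}
```

With the values from the sample log in this description, registering page 347 for the record at (page 3, heap 6) and then again for (page 4, heap 7) would return (3, 6) on the second call, which is the information needed to report both occurrences.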

Usage:

CHECK TABLE t1 EXTENDED

A sample error log when such corruption is detected:

2025-04-11T10:30:28.078607Z 9 [ERROR] [MY-011825] [InnoDB] Invalid record! External LOB first page cannot be shared between two records
2025-04-11T10:30:28.078625Z 9 [ERROR] [MY-011825] [InnoDB] The external LOB first page is [page id: space=6, page number=347]
2025-04-11T10:30:28.078631Z 9 [ERROR] [MY-011825] [InnoDB] The first occurence of the external LOB first page is in record : page_no: 3 with heap_no: 6
2025-04-11T10:30:28.078638Z 9 [ERROR] [MY-011825] [InnoDB] The second occurence of the external LOB first page is in record: page_no: 4 with heap no: 7
2025-04-11T10:30:28.078646Z 9 [ERROR] [MY-012738] [InnoDB] Apparent corruption in space 6 page 4 index `PRIMARY`
2025-04-11T10:30:28.078663Z 9 [ERROR] [MY-013050] [InnoDB] In page 4 of index `PRIMARY` of table `test`.`t1`
2025-04-11T10:30:28.088156Z 9 [Warning] [MY-012382] [InnoDB] Cannot open table test/t1Please refer to http://dev.mysql.com/doc/refman/8.0/en/innodb-troubleshooting.html for how to resolve the issue.

@satya-bodapati requested a review from Copilot on April 14, 2025 10:49

Copilot AI left a comment

Copilot reviewed 4 out of 6 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • mysql-test/suite/innodb/r/percona_extended_check_table_debug.result: Language not supported
  • mysql-test/suite/innodb/t/percona_extended_check_table_debug.test: Language not supported
Comments suppressed due to low confidence (3)

storage/innobase/page/page0page.cc:1780

  • [nitpick] Consider refactoring 'lob::btr_rec_get_field_ref' to accept a const parameter to avoid using const_cast, which can potentially lead to unsafe modifications.
byte *field_ref = const_cast<byte *>(lob::btr_rec_get_field_ref(index, rec, offsets, i));

storage/innobase/page/page0page.cc:1815

  • [nitpick] The use of the magic number 5 in the simulate_lob_corruption block could be replaced with a named constant for improved clarity and maintainability.
if (thread_local_blob_map->size() >= 5) {

storage/innobase/fsp/fsp0fsp.cc:3569

  • Verify that handling a null seg_header by simply skipping the inode retrieval is intended; if seg_header can be null, consider adding explicit handling or logging for this case to avoid potential downstream issues.
if (seg_header != nullptr) {

@github-actions bot left a comment

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

ulint n_fields = rec_offs_n_fields(offsets);

for (ulint i = 0; i < n_fields; i++) {
  if (rec_offs_nth_extern(index, offsets, i)) {

⚠️ readability-implicit-bool-conversion ⚠️
implicit conversion ulint (aka unsigned long) -> bool

Suggested change
if (rec_offs_nth_extern(index, offsets, i)) {
if (rec_offs_nth_extern(index, offsets, i) != 0u) {

if (rec_offs_nth_extern(index, offsets, i)) {
  // We do const_cast to remove constness because lob::ref_t doesn't have a
  // variant that takes const record pointer
  byte *field_ref = const_cast<byte *>(

⚠️ cppcoreguidelines-pro-type-const-cast ⚠️
do not use const_cast

goto func_exit;
}

if (!page_rec_blob_validate(const_cast<byte *>(rec), index, offsets)) {

⚠️ cppcoreguidelines-pro-type-const-cast ⚠️
do not use const_cast

}

if (!page_rec_blob_validate(const_cast<byte *>(rec), index, offsets)) {
goto func_exit;

⚠️ cppcoreguidelines-avoid-goto ⚠️
avoid using goto for flow control

ut_ad(!((page_offset(seg_inode) - FSEG_ARR_OFFSET) % FSEG_INODE_SIZE));
ut_a(seg_inode);
ut_ad(mach_read_from_4(seg_inode + FSEG_MAGIC_N) == FSEG_MAGIC_N_VALUE);
ut_ad(!((page_offset(seg_inode) - FSEG_ARR_OFFSET) % FSEG_INODE_SIZE));

⚠️ bugprone-implicit-widening-of-multiplication-result ⚠️
performing an implicit widening conversion to type unsigned long of a multiplication performed in type unsigned int
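
For readers unfamiliar with this check, the general pattern it flags can be sketched as below. This is an illustration of the warning category only and says nothing about where the multiplication sits in the patched code; the function names are hypothetical, and the comments assume a platform where `unsigned int` is 32 bits wide.

```cpp
#include <cstdint>

// The product is computed in 32 bits and only then widened, so any high
// bits are already lost by the time the result becomes 64-bit.
constexpr std::uint64_t narrow_then_widen(std::uint32_t a, std::uint32_t b) {
  return a * b;
}

// Widening one operand first makes the multiplication itself 64-bit.
constexpr std::uint64_t widen_then_multiply(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint64_t>(a) * b;
}
```

When both operands are small, as with page-level offsets and sizes, the two forms give the same result, which is why this warning is often a matter of making the intent explicit rather than an actual overflow.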

}

/* true if user uses CHECK TABLE t1 EXTENDED */
const bool is_extended = check_opt->flags & T_EXTEND;

⚠️ readability-implicit-bool-conversion ⚠️
implicit conversion unsigned long -> bool

Suggested change
const bool is_extended = check_opt->flags & T_EXTEND;
const bool is_extended = (check_opt->flags & T_EXTEND) != 0u;
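
The suggested form can be tried in isolation with a small sketch like the one below. The constant `kExtendFlag` and the helper `has_extend_flag` are hypothetical stand-ins for this example; the flag value is made up and is not the server's `T_EXTEND`.

```cpp
#include <cstdint>

// Hypothetical flag bit for illustration; not the value of T_EXTEND.
constexpr std::uint64_t kExtendFlag = std::uint64_t{1} << 2;

// Explicit comparison against zero avoids the implicit integer-to-bool
// conversion that readability-implicit-bool-conversion reports.
constexpr bool has_extend_flag(std::uint64_t flags) {
  return (flags & kExtendFlag) != 0u;
}
```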

Comment on lines +18971 to +18952
} else {
  continue;
}

⚠️ llvm-else-after-return ⚠️
do not use else after break

Suggested change
} else {
continue;
}
} continue;

Copilot AI left a comment

Copilot reviewed 4 out of 6 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • mysql-test/suite/innodb/r/percona_extended_check_table_debug.result: Language not supported
  • mysql-test/suite/innodb/t/percona_extended_check_table_debug.test: Language not supported
Comments suppressed due to low confidence (2)

storage/innobase/page/page0page.cc:1836

  • Correct the spelling of 'occurence' to 'occurrence'.
ib::error() << "The first occurence of the external LOB first page is in record : page_no: " << val.first << " with heap_no: " << val.second;

storage/innobase/page/page0page.cc:1839

  • Correct the spelling of 'occurence' to 'occurrence'.
ib::error() << "The second occurence of the external LOB first page is " << "in record: page_no: " << page_get_page_no(page) << " with heap no: " << page_rec_get_heap_no(rec);

@satya-bodapati requested a review from dlenev on April 15, 2025 09:30
@satya-bodapati changed the title from "PS-9683 : Enable CHECK TABLE EXTENDED to detect InnoDB LOB corruptions" to "[8.0] PS-9683 : Enable CHECK TABLE EXTENDED to detect InnoDB LOB corruptions" on Apr 15, 2025
@satya-bodapati self-assigned this on Apr 15, 2025

@dlenev left a comment

Hello Satya!

Here are some comments about your patch.

Note that I do not insist on replacing thread_local with passing a pointer to the map as a parameter, but I think it would make the code more robust.

CHECK TABLE. */
srv_fatal_semaphore_wait_extend.fetch_add(1);

if (is_extended && index->is_clustered()) {

After a bit more thinking.

I would simply allocate the map on the stack and pass a pointer to it as a parameter all the way down to page_rec_blob_validate(). To make the change less intrusive, we can make the parameter optional and pass nullptr by default. IMO the thread_local approach reduces the amount of changes, but OTOH it creates a spooky-action-at-a-distance situation where a flag set at a high level affects the execution of low-level code without being explicitly passed to it. We had quite a few problems due to this in the past in MySQL (it was a common anti-pattern in our code) and have tried to avoid it where possible since then.

So perhaps you can ponder over it once more? OTOH I can live with the current implementation if you insist.
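
A rough sketch of the alternative proposed in this comment, with the map owned by the caller and passed down as an optional pointer, might look like the following. Everything here is hypothetical and simplified for illustration: `Rec`, `rec_blob_validate`, and `check_records` are stand-ins and do not reflect the real InnoDB signatures or call chain.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using page_no_t = std::uint32_t;
using BlobMap = std::map<page_no_t, std::pair<page_no_t, std::uint16_t>>;

// Hypothetical stand-in for a user record with one external LOB.
struct Rec {
  page_no_t page_no;
  std::uint16_t heap_no;
  page_no_t lob_first_page;
};

// Low-level validator. The LOB sharing check runs only when the caller
// supplies a map, so the behaviour is visible in the signature.
bool rec_blob_validate(const Rec &rec, BlobMap *blob_map = nullptr) {
  if (blob_map == nullptr) {
    return true;  // Not in extended mode: nothing to check.
  }
  // emplace() reports false if the LOB first page was already registered.
  return blob_map
      ->emplace(rec.lob_first_page, std::make_pair(rec.page_no, rec.heap_no))
      .second;
}

// High-level driver. The map lives on the stack for the duration of one
// check and is handed down explicitly instead of through thread_local state.
bool check_records(const std::vector<Rec> &recs, bool is_extended) {
  BlobMap blob_map;
  for (const Rec &rec : recs) {
    if (!rec_blob_validate(rec, is_extended ? &blob_map : nullptr)) {
      return false;
    }
  }
  return true;
}
```

The default argument keeps existing call sites unchanged, while any caller that wants the extended check has to opt in by passing a map, which is the explicitness the comment argues for.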

index->name());
continue;

// with extended mode, if clustered index is corrupted, it is marked

Minor thing, but here and in other places, I think it is a good idea to be consistent in how we write comments. If we write them as sentences, then let us start each comment with a capital letter ("With extended...") and end it with a period.
Otherwise it looks somewhat awkward to me.

lob::btr_rec_get_field_ref(index, rec, offsets, i));

lob::ref_t ref(field_ref);
if (!ref.is_owner() || ref.is_null() || ref.is_null_relaxed() ||

Do we really need to check both is_null() and is_null_relaxed() here? AFAIK the latter implies the former.
