[8.0] PS-9683 : Enable CHECK TABLE EXTENDED to detect InnoDB LOB corruptions #5585
Copilot reviewed 4 out of 6 changed files in this pull request and generated no comments.
Files not reviewed (2)
- mysql-test/suite/innodb/r/percona_extended_check_table_debug.result: Language not supported
- mysql-test/suite/innodb/t/percona_extended_check_table_debug.test: Language not supported
Comments suppressed due to low confidence (3)
storage/innobase/page/page0page.cc:1780
- [nitpick] Consider refactoring 'lob::btr_rec_get_field_ref' to accept a const parameter to avoid using const_cast, which can potentially lead to unsafe modifications.
byte *field_ref = const_cast<byte *>(lob::btr_rec_get_field_ref(index, rec, offsets, i));
storage/innobase/page/page0page.cc:1815
- [nitpick] The use of the magic number 5 in the simulate_lob_corruption block could be replaced with a named constant for improved clarity and maintainability.
if (thread_local_blob_map->size() >= 5) {
storage/innobase/fsp/fsp0fsp.cc:3569
- Verify that handling a null seg_header by simply skipping the inode retrieval is intended; if seg_header can be null, consider adding explicit handling or logging for this case to avoid potential downstream issues.
if (seg_header != nullptr) {
Clang-Tidy found issue(s) with the introduced code (1/1)
ulint n_fields = rec_offs_n_fields(offsets);

for (ulint i = 0; i < n_fields; i++) {
  if (rec_offs_nth_extern(index, offsets, i)) {
implicit conversion ulint (aka unsigned long) -> bool
Suggested change:
- if (rec_offs_nth_extern(index, offsets, i)) {
+ if (rec_offs_nth_extern(index, offsets, i) != 0u) {
if (rec_offs_nth_extern(index, offsets, i)) {
  // We do const_cast to remove constness because lob::ref_t doesn't have a
  // variant that takes const record pointer
  byte *field_ref = const_cast<byte *>(
do not use const_cast
  goto func_exit;
}

if (!page_rec_blob_validate(const_cast<byte *>(rec), index, offsets)) {
do not use const_cast
}

if (!page_rec_blob_validate(const_cast<byte *>(rec), index, offsets)) {
  goto func_exit;
avoid using goto for flow control
ut_ad(!((page_offset(seg_inode) - FSEG_ARR_OFFSET) % FSEG_INODE_SIZE));
ut_a(seg_inode);
ut_ad(mach_read_from_4(seg_inode + FSEG_MAGIC_N) == FSEG_MAGIC_N_VALUE);
ut_ad(!((page_offset(seg_inode) - FSEG_ARR_OFFSET) % FSEG_INODE_SIZE));
performing an implicit widening conversion to type unsigned long of a multiplication performed in type unsigned int
}

/* true if user uses CHECK TABLE t1 EXTENDED */
const bool is_extended = check_opt->flags & T_EXTEND;
implicit conversion unsigned long -> bool
Suggested change:
- const bool is_extended = check_opt->flags & T_EXTEND;
+ const bool is_extended = (check_opt->flags & T_EXTEND) != 0u;
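The suggestion above makes the flag test an explicit comparison rather than relying on an integer-to-bool conversion. A minimal standalone sketch of the same idiom follows; `kExtendFlag` and `is_extended_check` are hypothetical names, and the flag value is made up for illustration (the real `T_EXTEND` is defined in the MySQL sources).

```cpp
#include <cassert>

// Hypothetical flag bit for illustration only; not the real T_EXTEND value.
constexpr unsigned long kExtendFlag = 1UL << 2;

// Returns true when the extended flag is set. Comparing the masked value
// against zero states the intent and avoids an implicit
// unsigned long -> bool conversion.
bool is_extended_check(unsigned long flags) {
  return (flags & kExtendFlag) != 0UL;
}

// Small demonstration that can be called from any test or main().
void example_flag_check() {
  assert(is_extended_check(kExtendFlag));
  assert(is_extended_check(kExtendFlag | 1UL));
  assert(!is_extended_check(1UL));
}
```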
} else {
  continue;
}
do not use else after break
Suggested change:
- } else {
-   continue;
- }
+ } continue;
https://perconadev.atlassian.net/browse/PS-9683

Problem:
--------
In some customer environments, an external LOB's first page was found to be shared between two records. This should not be possible, but it happens rarely. The root cause is not yet known. Using a table in this state can lead to corruption and assertion failures.

Fix:
---
We can detect such a scenario by scanning all records and their external LOB first pages. The EXTENDED keyword is currently ignored by InnoDB. We use it to enable the LOB checks and mark the index as corrupted if an external LOB's first page is shared between two records. A thread-local blob map is used to identify the duplicate user record that has the same external LOB page.

Usage:
------
CHECK TABLE t1 EXTENDED

A sample error log when such corruption is detected:
2025-04-11T10:30:28.078607Z 9 [ERROR] [MY-011825] [InnoDB] Invalid record! External LOB first page cannot be shared between two records
2025-04-11T10:30:28.078625Z 9 [ERROR] [MY-011825] [InnoDB] The external LOB first page is [page id: space=6, page number=347]
2025-04-11T10:30:28.078631Z 9 [ERROR] [MY-011825] [InnoDB] The first occurence of the external LOB first page is in record : page_no: 3 with heap_no: 6
2025-04-11T10:30:28.078638Z 9 [ERROR] [MY-011825] [InnoDB] The second occurence of the external LOB first page is in record: page_no: 4 with heap no: 7
2025-04-11T10:30:28.078646Z 9 [ERROR] [MY-012738] [InnoDB] Apparent corruption in space 6 page 4 index `PRIMARY`
2025-04-11T10:30:28.078663Z 9 [ERROR] [MY-013050] [InnoDB] In page 4 of index `PRIMARY` of table `test`.`t1`
2025-04-11T10:30:28.088156Z 9 [Warning] [MY-012382] [InnoDB] Cannot open table test/t1Please refer to http://dev.mysql.com/doc/refman/8.0/en/innodb-troubleshooting.html for how to resolve the issue.
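The duplicate detection described in the commit message can be illustrated with a small standalone sketch. This is not the code from the patch: `RecLocation`, `BlobMap`, and `register_lob_first_page` are hypothetical names, and keying the map by LOB first page number alone assumes all pages belong to one tablespace.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for "which record references this LOB page".
struct RecLocation {
  std::uint32_t page_no;  // index page that holds the record
  std::uint16_t heap_no;  // heap number of the record within that page
};

// Maps an external LOB first page number to the first record seen
// referencing it (assumption: a single tablespace, so the page number
// is a sufficient key).
using BlobMap = std::unordered_map<std::uint32_t, RecLocation>;

// Records that `rec` references `lob_first_page_no`. Returns nullptr if
// this is the first record to reference that page, otherwise a pointer
// to the location of the earlier record (the page is shared).
const RecLocation *register_lob_first_page(BlobMap &seen,
                                           std::uint32_t lob_first_page_no,
                                           const RecLocation &rec) {
  auto result = seen.insert(std::make_pair(lob_first_page_no, rec));
  if (result.second) {
    return nullptr;  // newly inserted: no earlier owner
  }
  return &result.first->second;  // existing entry: earlier owner
}

// Demonstration using the values from the sample log: LOB first page 347
// is referenced by the record at (page 3, heap 6) and then (page 4, heap 7).
void example_shared_lob_page() {
  BlobMap seen;
  assert(register_lob_first_page(seen, 347, RecLocation{3, 6}) == nullptr);
  const RecLocation *first =
      register_lob_first_page(seen, 347, RecLocation{4, 7});
  assert(first != nullptr);
  assert(first->page_no == 3 && first->heap_no == 6);
}
```

When a non-null pointer comes back, the caller has both locations needed for the two "occurence" messages: the stored entry is the first record and the record being validated is the second.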
acdf8e7 to dde8a8b
Copilot reviewed 4 out of 6 changed files in this pull request and generated no comments.
Files not reviewed (2)
- mysql-test/suite/innodb/r/percona_extended_check_table_debug.result: Language not supported
- mysql-test/suite/innodb/t/percona_extended_check_table_debug.test: Language not supported
Comments suppressed due to low confidence (2)
storage/innobase/page/page0page.cc:1836
- Correct the spelling of 'occurence' to 'occurrence'.
ib::error() << "The first occurence of the external LOB first page is in record : page_no: " << val.first << " with heap_no: " << val.second;
storage/innobase/page/page0page.cc:1839
- Correct the spelling of 'occurence' to 'occurrence'.
ib::error() << "The second occurence of the external LOB first page is " << "in record: page_no: " << page_get_page_no(page) << " with heap no: " << page_rec_get_heap_no(rec);
dlenev left a comment
Hello Satya!
Here are some comments about your patch.
Note that I do not insist on replacing thread_local with passing a pointer to the map as a parameter, but I think that it will make the code more robust.
CHECK TABLE. */
srv_fatal_semaphore_wait_extend.fetch_add(1);

if (is_extended && index->is_clustered()) {
After a bit more thinking.
I would simply allocate the map on the stack and pass a pointer to it as a parameter all the way down to page_rec_blob_validate(). To make the change less intrusive, we can make the parameter optional and pass nullptr by default. IMO the thread_local approach reduces the amount of changes, but OTOH it creates a spooky-action-at-a-distance situation where a flag set at a high level affects the execution of low-level code without being explicitly passed to it. We had quite a few problems due to this in the past in MySQL (it was a common anti-pattern in our code) and have tried to avoid it where possible since then.
So perhaps you can ponder over it once more? OTOH I can live with the current implementation if you insist.
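A rough sketch of the shape the reviewer describes, with the state owned on the caller's stack and threaded down through an optional pointer parameter. All names here (`BlobPageSet`, `validate_record`, `validate_page`, `check_table`) are hypothetical simplifications and do not match the real InnoDB signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical simplification: only the LOB first page numbers are tracked.
using BlobPageSet = std::set<std::uint32_t>;

// Low-level check. `seen` is optional: existing callers that do not run
// the extended check keep compiling and simply pass nothing.
bool validate_record(std::uint32_t lob_first_page_no,
                     BlobPageSet *seen = nullptr) {
  if (seen == nullptr) {
    return true;  // extended LOB check not requested
  }
  // insert().second is false when the page was already referenced.
  return seen->insert(lob_first_page_no).second;
}

// Mid-level code forwards the pointer explicitly instead of relying on
// hidden thread-local state.
bool validate_page(const std::uint32_t *lob_pages, std::size_t n,
                   BlobPageSet *seen = nullptr) {
  for (std::size_t i = 0; i < n; ++i) {
    if (!validate_record(lob_pages[i], seen)) {
      return false;
    }
  }
  return true;
}

// High-level entry point: the set lives on the stack for one check and
// is passed down only when the extended mode is requested.
bool check_table(const std::uint32_t *lob_pages, std::size_t n,
                 bool is_extended) {
  BlobPageSet seen;
  return validate_page(lob_pages, n, is_extended ? &seen : nullptr);
}
```

With this shape, whether the low-level function performs the LOB check is visible at every call site, and the set's lifetime ends with the check that created it.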
index->name());
continue;

// with extended mode, if clustered index is corrupted, it is marked
Minor thing, but here and in other places, I think it is a good idea to be consistent in how we write comments. If we write it using sentences then let us start the comment with capital letter "With extended..." and end with dot.
Otherwise it looks somewhat awkward to me.
lob::btr_rec_get_field_ref(index, rec, offsets, i));

lob::ref_t ref(field_ref);
if (!ref.is_owner() || ref.is_null() || ref.is_null_relaxed() ||
Do we really need to check both is_null() and is_null_relaxed() here? AFAIK the latter implies the former.