bound the trailing hex read in my_mb_wc_filename#5187
Conversation
There was a problem hiding this comment.
Code Review
This pull request fixes an out-of-bounds read vulnerability in my_mb_wc_filename by ensuring that the end pointer is checked before reading the 4th hex digit of a truncated escape sequence. It also adds a unit test to verify this behavior. The review comments suggest improving the unit test by returning the actual return value of my_ci_mb_wc and asserting the expected error code (MY_CS_ILSEQ) directly, which makes the test more idiomatic and direct.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| static int test_mb_wc_filename_truncated() | ||
| { | ||
| CHARSET_INFO *cs= &my_charset_filename; | ||
| uchar buf[5]= {'@', '0', '0', '0', '1'}; | ||
| my_wc_t wc= 0; | ||
| int rc= my_ci_mb_wc(cs, &wc, buf, buf + 4); | ||
| return rc != MY_CS_ILSEQ; | ||
| } |
There was a problem hiding this comment.
Returning the actual return value of my_ci_mb_wc rather than a boolean mapping makes the test more direct and easier to understand. We can then assert the expected error code (MY_CS_ILSEQ) directly in the test runner.
static int test_mb_wc_filename_truncated()
{
CHARSET_INFO *cs= &my_charset_filename;
uchar buf[5]= {'@', '0', '0', '0', '1'};
my_wc_t wc= 0;
return my_ci_mb_wc(cs, &wc, buf, buf + 4);
}| ok(test_mb_wc_filename_truncated() == 0, | ||
| "filename decoder does not read past the end pointer"); |
my_mb_wc_filename decodes the 5-byte '@hhhh' escape but the length guard is off by one:
s + 4 > echeck only covers 4 bytes while the hex branch reads s[4]@HHHwhose buffer ends at s+4 over-reads one byte paste, reachable through well_formed_length/charpos on a my_charset_filename string with tight boundss[3] ?guard only stops at a NUL terminator, not at the end pointerbound the s[4] read against
e; added a strings-t case that returns ILSEQ instead of over-reading (ASAN flags the read of size 1 past a 4-byte buffer before the fix).