Skip to content

MDEV-10526: Add binary string support to bitwise operators#5190

Draft
kjarir wants to merge 1 commit into
MariaDB:10.11from
kjarir:feature/MDEV-10526-bitwise-binary
Draft

MDEV-10526: Add binary string support to bitwise operators#5190
kjarir wants to merge 1 commit into
MariaDB:10.11from
kjarir:feature/MDEV-10526-bitwise-binary

Conversation

@kjarir
Copy link
Copy Markdown
Contributor

@kjarir kjarir commented Jun 6, 2026

Draft for review. Implements byte-by-byte binary string mode for all scalar bitwise operators. Aggregate function
support (BIT_AND/BIT_OR/BIT_XOR) to follow.

Tested:

  • All 6 operators on VARBINARY columns
  • INET6 subnet masking (primary use case)
  • NULL handling
  • Mismatched length error
  • Integer mode backward compatibility
  • _binary introducer triggers binary mode correctly

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for binary-to-binary bitwise operations (AND, OR, XOR, NOT, and bitwise shifts) on binary strings by implementing new handler classes and adding corresponding error messages for mismatched operand sizes. The review feedback highlights several critical issues: a potential integer overflow/wrap-around in the left-shift index calculation on 32-bit systems, incorrect ASCII string conversion of numeric arguments in mixed-type bitwise operations, and potential undefined behavior when handling empty binary strings due to passing a null pointer to memset.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread sql/item_func.cc
Comment on lines +2266 to +2268
size_t src_idx= i + byte_shift;
if (src_idx < len)
{
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The calculation size_t src_idx = i + byte_shift is susceptible to integer overflow/wrap-around on 32-bit systems if len is very large (e.g., close to size_t max). If i + byte_shift overflows, src_idx wraps around and can evaluate to a value less than len, leading to incorrect data copying from wrapped-around indices instead of zero-filling. To prevent this, perform the bounds check using subtraction (byte_shift < len - i) before computing src_idx.

        if (byte_shift < len - i)
        {
          size_t src_idx= i + byte_shift;

Comment thread sql/item_func.cc
Comment on lines +7875 to +7876
StringBuffer<128> b_buf;
String *b= item->arguments()[1]->val_str(&b_buf);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In mixed-type bitwise operations (e.g., binary_string & numeric), calling val_str on the numeric argument converts it to its decimal string representation (e.g., 255 becomes the 3-byte ASCII string "255" / 0x323535), rather than its binary representation (e.g., 0xFF). This causes mismatched length errors (e.g., VARBINARY(1) & 255 compares length 1 with length 3) or incorrect bitwise operations if the lengths happen to match. To align with standard bitwise behavior (and MySQL 8.0 compatibility), numeric arguments should be converted to a binary string representation of their integer value, padded or truncated to match the length of the other binary string operand.

Comment thread sql/item_func.cc
Comment on lines +2243 to +2245
size_t len= a->length();

if (to->realloc(len))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When the input binary string a is empty (len == 0), calling to->realloc(0) can return a null pointer or do nothing. Subsequently, passing a null pointer to memset (e.g., memset(out_ptr, 0, len)) is technically undefined behavior in C/C++, even if the length is 0. Adding an early exit for len == 0 avoids this potential undefined behavior and improves efficiency by bypassing unnecessary allocation and loop overhead.

    size_t len= a->length();
    if (len == 0)
    { 
      to->length(0);
      to->set_charset(&my_charset_bin);
      item->null_value= false;
      return to;
    }

    if (to->realloc(len))

Bitwise operators (&, |, ^, ~, <<, >>) previously cast all
arguments to BIGINT, silently truncating values wider than
64 bits. This broke operations on BINARY, VARBINARY, BLOB,
INET6, and UUID columns.

Introduces binary_mode detection in fix_length_and_dec().
When any non-literal argument has STRING_RESULT with binary
charset, operators switch to byte-by-byte processing via
a new Handler_str subclass, returning LONGBLOB of the same
length as the input.

Bare hex literals (x'FF', 0xFF) and bit literals (b'1010')
retain integer mode for backward compatibility.

Existing int/decimal handler classes for Item_func_bit_or
and Item_func_bit_and are moved from item_cmpfunc.cc to
item_func.cc for consistency.

New error codes:
  ER_INVALID_BITWISE_OPERANDS_SIZE
  ER_INVALID_BITWISE_AGGREGATE_OPERANDS_SIZE

Aggregate function support (BIT_AND/BIT_OR/BIT_XOR) to
follow in a subsequent commit.

Closes: MDEV-10526
@kjarir kjarir force-pushed the feature/MDEV-10526-bitwise-binary branch from fa5cd7c to 862a606 Compare June 6, 2026 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant