Skip to content

feat(tesseract): Support boolean and numeric filter values#11135

Merged
ovr merged 4 commits into
masterfrom
feat/tesseract-bool-numeric-fitler-values
Jun 22, 2026
Merged

feat(tesseract): Support boolean and numeric filter values#11135
ovr merged 4 commits into
masterfrom
feat/tesseract-bool-numeric-fitler-values

Conversation

@ovr

@ovr ovr commented Jun 22, 2026

Copy link
Copy Markdown
Member

Tesseract deserialized filter values as Vec<Option<String>>, so a boolean (or numeric) value in a query filter — e.g. {operator: 'equals', values: [true]} — failed with "invalid type: boolean true, expected a string". The legacy JS planner never hit this since JS is untyped.

Introduce a typed FilterValue { Str, Bool, Num, Null } enum and thread it end-to-end (deserialization → TypedFilter → scalar operators → param rendering). Values are normalized to their canonical string form only at the allocate_param boundary, so the param channel stays Vec<String>.

Tesseract deserialized filter `values` as `Vec<Option<String>>`, so a
boolean (or numeric) value in a query filter — e.g.
`{operator: 'equals', values: [true]}` — failed with
"invalid type: boolean `true`, expected a string". The legacy JS planner
never hit this since JS is untyped.

Introduce a typed `FilterValue { Str, Bool, Num, Null }` enum and thread it
end-to-end (deserialization → TypedFilter → scalar operators → param
rendering). Values are normalized to their canonical string form only at
the `allocate_param` boundary, so the param channel stays `Vec<String>`.
@github-actions github-actions Bot added the rust Pull requests that update Rust code label Jun 22, 2026
@claude

claude Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Claude finished @ovr's task in 2m 46s —— View job


Review — feat(tesseract): typed FilterValue for filter values

Overall: clean refactor. Introducing FilterValue { Str, Bool, Num, Null } and normalizing to a parameter string only at the allocate_param boundary is the right shape — the param channel stays Vec<String> and the typed information lives at the planner layer. The end-to-end thread through deserializer → TypedFilter → operators → param rendering is consistent, and the call sites mechanically updated look correct.

Verdict: approve in spirit — 0 blocking, 1 medium, 4 low. The medium one (NULL-safety in build_sql_and_params) is worth resolving before merge.

Details (click to expand)

Tasks

  • Fetch and inspect diff (28 files)
  • Review FilterValue type + serde impls
  • Review planner/physical operator threading
  • Review param-rendering boundary
  • Review test coverage
  • Post inline comments

Findings

# Sev File:Line Concern
1 medium planner/query_tools.rs:201 unwrap_or_default() silently maps FilterValue::Null allocated params to "" instead of preserving SQL NULL
2 low cubenativeutils/.../deserializer.rs:42-50 deserialize_any int/float split affects all native deserialization, not just FilterValue — worth a dedicated unit test
3 low base_query_options.rs:32-43 format_number cutoff < 1e15 is non-standard; Display for f64 won't match JS stringification exactly
4 low filter/filter_debug.rs:12-17 Debug output now wraps booleans/numbers in single quotes, looking like strings
5 low (test coverage) No new tests exercise the actual motivating case (values: [true] / [42]) end-to-end or FilterValue serde round-trips

Other observations

  • FilterValue::to_param_string clones the String for the Str variant on every call — minor perf nit on hot paths (allocate_and_cast_values); Cow<str> would be a free upgrade later but not worth blocking on.
  • Replacing the todo!() in deserialize_enum with a clear error is a nice incidental fix.
  • first_non_null_string correctly delegates to first_non_null_value then to_param_string; the trailing ok_or_else is unreachable (the value was already verified non-null) but harmless.
  • Test fixtures and change_operator use_raw_values tests were updated consistently; no diff drift spotted.
· `feat/tesseract-bool-numeric-fitler-values`

Comment thread rust/cube/cubesqlplanner/cubesqlplanner/src/planner/query_tools.rs Outdated
Comment on lines 42 to +50
} else if let Ok(val) = self.input.to_number() {
visitor.visit_i64(val.value().unwrap() as i64) //We deserialize float value in
//different methods
let num = val.value().unwrap();
// Preserve fractional numbers as floats; only integral values are
// narrowed to i64 (whole-number JS values are the common case, and
// self-describing consumers like FilterValue expect them as ints).
if num.fract() == 0.0 && num.is_finite() {
visitor.visit_i64(num as i64)
} else {
visitor.visit_f64(num)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This narrowing applies to every call to the native deserialize_any (not just FilterValue). Previously every JS number was forwarded as i64 and serde coerced it to whatever the target type expected; now integral values go to visit_i64 and fractional values to visit_f64.

Two latent risks worth a quick check:

  1. Any existing struct field deserialized via this path that was previously receiving truncated integer values from fractional JS numbers will now see the original f64. That's strictly an improvement for correct callers but could surface previously-silent data loss in callers that targeted i64/u64 and relied on the truncation.
  2. f64::NAN is !is_finite(), so it goes to visit_f64(NaN) — most serde targets reject this, which is probably desired but worth noting.

Adding a unit test in deserializer.rs covering the four shapes (int, float, NaN, very large integral) would lock the contract down.

Comment on lines +32 to +43

pub fn as_str(&self) -> Option<&str> {
match self {
FilterValue::Str(s) => Some(s.as_str()),
_ => None,
}
}

/// Canonical string representation bound as a SQL parameter. `Null` yields
/// `None` (the value is dropped / handled as `IS NULL`). Whole numbers are
/// rendered without a trailing `.0` (`42.0` → `"42"`).
pub fn to_param_string(&self) -> Option<String> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small things on format_number:

  • format!("{}", n as i64) for the integral fast path silently saturates if n is outside i64::MIN..=i64::MAX. The n.abs() < 1e15 guard keeps us well inside that range — good — but it also excludes values like 1e15 itself even though they're exactly representable and fit in i64. Using n.abs() <= (1_i64 << 53) as f64 is the conventional cutoff (the largest f64 with unique integer round-trip) and matches what JS Number.isSafeInteger enforces upstream.
  • For the fallback format!("{}", n), the default Display for f64 uses minimum digits for round-tripping. That's fine for most values but won't match how the JS side originally stringified the number (e.g. JS may emit 1e-7 while Rust emits 0.0000001). If anything downstream compares the rendered param against an exact string, this can diverge. Probably out-of-scope for this PR, but worth a comment noting the format is "Rust default, not JS-compatible".

Comment thread rust/cube/cubesqlplanner/cubesqlplanner/src/planner/filter/filter_debug.rs Outdated
@codecov

codecov Bot commented Jun 22, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.12%. Comparing base (155987f) to head (a39a58b).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #11135       +/-   ##
===========================================
+ Coverage   58.74%   79.12%   +20.37%     
===========================================
  Files         217      473      +256     
  Lines       17495    96514    +79019     
  Branches     3558     3558               
===========================================
+ Hits        10278    76364    +66086     
- Misses       6699    19632    +12933     
  Partials      518      518               
Flag Coverage Δ
cube-backend 58.74% <ø> (ø)
cubesql 83.63% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@claude

claude Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Claude finished @ovr's task in 1m 9s —— View job


Re-review — chore: wip commit (0ee60a3)

The follow-up commit addresses the previous medium-severity concern: instead of collapsing allocated params to strings via unwrap_or_default(), the param channel is now Vec<FilterValue> end-to-end through ParamsAllocator. FilterValue::Null now reaches the rendering boundary intact, so it can be inlined as bare NULL (the test fixture inline_params does exactly this) and won't silently become ''. Resolving that thread.

Verdict: 0 blocking. The remaining open threads still apply.

Details (click to expand)

Tasks

  • Inspect new commit (0ee60a3 — 13 files)
  • Verify the NULL-safety fix in build_sql_and_params / ParamsAllocator
  • Re-check open review threads against current diff
  • Resolve threads that are no longer applicable

Changes in 0ee60a3

Area Change
params_allocator.rs params: Vec<FilterValue>; build_sql_and_params returns Vec<FilterValue>
query_tools.rs get_allocated_params returns Vec<FilterValue>; no more unwrap_or_default() collapse
physical_plan/filter/operators/like.rs Uses new allocate_and_cast_str helper — avoids the String → FilterValue::Str → param_string round-trip
physical_plan/filter/operators/filter_sql_context.rs New allocate_and_cast_str(value: &str, …) factored out of allocate_and_cast_value
planner/filter/typed_filter.rs first_non_null_string documents the unreachable fallback; switches to unwrap_or_default() with a comment
test_fixtures/test_utils/test_context.rs inline_params now inlines Null as bare NULL (rather than '') — correct
tests/filter/mod.rs assert_filter renders Null as the literal string "NULL" for comparison
base_query_options.rs Removed unused as_str()

Threads status

Thread Status
1. NULL-safety in build_sql_and_params (medium) Resolved — fix landed in 0ee60a3
2. deserialize_any int/float split (low) Still applies — no unit test added
3. format_number integral cutoff / JS-compat (low) Still applies — unchanged
4. filter_debug.rs quoting bool/num (low) Still applies — unchanged
5. No end-to-end tests for values: [true] / [42] (low) Still applies — no new tests on the motivating case

Notes on the fix shape

  • The new Vec<FilterValue> param channel is the right move — it keeps the type info all the way to the driver boundary, so the driver can choose between IS NULL and bound NULL rendering itself instead of receiving a stringly-typed sentinel.
  • first_non_null_string's unwrap_or_default() is now annotated as unreachable — fair, since first_non_null_value filters out nulls and non-null FilterValue::to_param_string always returns Some. An expect("non-null FilterValue renders to Some") would surface invariant violations more loudly than unwrap_or_default(), but this is style preference, not a defect.
  • The new allocate_and_cast_str is a clean factoring; the previous LikeOp round-trip through FilterValue::Str(v.clone()) only to immediately call to_param_string() is gone.
· `feat/tesseract-bool-numeric-fitler-values`

@ovr ovr force-pushed the feat/tesseract-bool-numeric-fitler-values branch from 0ee60a3 to 5f72289 Compare June 22, 2026 16:01
@claude

claude Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Claude finished @ovr's task in 1m 17s —— View job


Re-review — chore: wip (5f72289)

Force-pushed wip carries the same structural change covered in the previous re-review (params channel now Vec<FilterValue> end-to-end; Null reaches the rendering boundary intact rather than collapsing to ""). Diffing against the prior wip, the one substantive new addition is a #[cfg(test)] mod tests block in params_allocator.rs:

  • allocated_params_enter_channel_as_str — asserts internally-allocated params land as FilterValue::Str and produce $_N_$ placeholders.
  • native_params_keep_natural_type_including_null — asserts the JS-side merge in add_native_allocated_params preserves Null/Num/Bool variants verbatim instead of stringifying them.

The second test is the right test to write for the NULL-safety fix — it locks the regression behavior in at the unit level. Resolving the medium thread was already done in the previous pass; these tests turn the fix into a pinned contract.

Verdict: 0 blocking. Still nothing new to flag.

Details (click to expand)

Tasks

  • Diff 5f72289 vs prior wip
  • Verify new tests in params_allocator.rs actually exercise the NULL-safety boundary
  • Re-check open review threads against current diff
  • Avoid duplicate inline comments — prior threads still cover the open concerns

Open-thread status (unchanged from previous re-review)

Thread Status
1. NULL-safety in build_sql_and_params (medium) Resolved — fix landed; unit-test coverage added in this commit
2. deserialize_any int/float split (low) Still applies — no test added for deserializer.rs
3. format_number integral cutoff / JS-compat (low) Still applies — unchanged
4. filter_debug.rs quoting bool/num (low) Still applies — unchanged
5. No end-to-end tests for values: [true] / [42] (low) Partially addressed at the unit level (param-channel boundary). A higher-level test that builds a FilterItem with a boolean/numeric value and asserts rendered SQL + params would still be valuable for the motivating case

Notes

  • native_params_keep_natural_type_including_null asserts the exact behavior that previously regressed (Null arriving as "") — exactly the right pin.
  • allocate_and_cast_str factoring in filter_sql_context.rs is retained and LikeOp uses it; the prior String → FilterValue::Str → param_string round-trip is gone.
  • first_non_null_string's comment + unwrap_or_default() continues to be a stylistic choice rather than a defect; expect(...) would surface invariant violations more loudly, but this is fine as-is.
· `feat/tesseract-bool-numeric-fitler-values`

@claude

claude Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Claude finished @ovr's task in 41s —— View job


Re-review — chore: review fixes (527a8c9)

This commit addresses thread #4 from the original review. filter_debug.rs no longer wraps every value in single quotes via to_param_string — instead it delegates to a new FilterValue::to_debug_string() that renders Str quoted, Bool/Num bare, and Null as NULL. Two new unit tests in tests/utils/debug.rs pin the rendering for mixed [Num, Bool, Str] and [Null] inputs.

Resolving that thread. The remaining three open threads (deserializer narrowing test, format_number cutoff/JS-compat note, end-to-end FilterItem render test on the motivating case) still apply unchanged.

Verdict: 0 blocking. 3 low-priority threads remain open as-is.

Details
Thread Status
1. NULL-safety in build_sql_and_params (medium) Resolved earlier in 0ee60a3
2. deserialize_any int/float split test (low) Still applies — no test added in deserializer.rs
3. format_number integral cutoff / JS-compat (low) Still applies — unchanged
4. filter_debug.rs quoting bool/num (low) Resolved in 527a8c9to_debug_string + tests
5. End-to-end values: [true] / [42] test (low) Partially addressed at the debug-render level; a FilterItem → SQL+params round-trip test would still be valuable

FilterValue::to_debug_string and to_param_string are close cousins; consider whether the param-rendering path could share the integral/float formatting logic (currently only to_debug_string reuses format_number). Minor — not blocking.

· `feat/tesseract-bool-numeric-fitler-values`

@claude

claude Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Claude finished @ovr's task in 47s —— View job


Re-review — chore: tests (a39a58b)

This commit closes out thread #5 from the original review with both pieces I asked for: serde round-trip tests for FilterValue across all variants (Str, Bool, Num int/float, Null, mixed array), and end-to-end FilterItem → rendered-SQL+params tests covering the exact motivating cases (values: [true], [false], [42], [42.5], [1,2,3], [100]) across equals / notEquals / in / gt. The equals boolean test pins ("visitors".is_active = $_0_$::boolean) with param "true" — exactly the regression the PR fixes. Resolved that thread.

Verdict: 0 blocking. Two low-priority threads remain open and unchanged.

Details
Thread Status
1. NULL-safety in build_sql_and_params (medium) Resolved in 0ee60a3
2. deserialize_any int/float split test in deserializer.rs (low) Still applies — the new tests live at the FilterValue serde layer, not the native deserialize_any path. NaN / very-large-integral coverage in cubenativeutils/src/wrappers/serializer/deserializer.rs would still lock that contract
3. format_number integral cutoff / JS-compat (low) Still applies — unchanged
4. filter_debug.rs quoting bool/num (low) Resolved in 527a8c9
5. End-to-end values: [true] / [42] test (low) Resolved in a39a58b — serde tests in base_query_options.rs + 7 SQL render tests in to_sql.rs

Notes

  • test_not_equals_boolean_native correctly verifies the null-safe <> ... OR ... IS NULL rendering for notEquals on a boolean column — good coverage of an easy-to-miss edge.
  • to_param_string_renders_canonical_form includes the nice contract pin: FilterValue::Num(42.0).to_param_string() == Some("42") (not "42.0") — confirms the format_number integral-fast-path is exercised. Thread ResultSet pivot methods documentation  #3's cutoff concern would now be straightforward to add a unit case for if desired.
  • Tests at both layers (serde + rendered SQL) is the right shape — catches regressions either above or below the typed-filter boundary.
· `feat/tesseract-bool-numeric-fitler-values`

@ovr ovr merged commit 789333d into master Jun 22, 2026
123 of 126 checks passed
@ovr ovr deleted the feat/tesseract-bool-numeric-fitler-values branch June 22, 2026 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants