Replace sscanf datetime parsing with Go-inspired byte-arithmetic #271

Open
schlubbi wants to merge 4 commits into trilogy-libraries:main from schlubbi:schlubbi/fast-datetime-parser

schlubbi commented Mar 18, 2026

Summary

Replace all three sscanf()-based datetime parsing blocks (DATETIME/TIMESTAMP, DATE, TIME) in cast.c with inline byte-arithmetic parsers inspired by Go's go-sql-driver/mysql parseDateTime() implementation, and replace rb_funcall(rb_cTime, :utc, 7, ...) Time object construction with Ruby's C-level rb_time_timespec_new() API.

Motivation

sscanf → byte-arithmetic parsing

sscanf() is a general-purpose format parser with overhead from format string interpretation, locale handling, and variadic argument processing. Go's MySQL driver avoids this entirely by operating directly on the byte buffer with simple arithmetic (byte - '0'), using length-based dispatch to determine the datetime format.

rb_funcall → rb_time_timespec_new (per byroot's suggestion)

rb_funcall(rb_cTime, :utc, 7, ...) goes through Ruby method dispatch, creates 7 INT2NUM VALUE arguments, unpacks them back to C integers in time_arg(), and converts to epoch via timegmw(). rb_time_timespec_new() skips all of that — we compute the epoch in C and hand it directly to the allocator.

UTC epoch computation uses Howard Hinnant's days_from_civil algorithm (the basis of the C++20 std::chrono::sys_days conversion), which handles the full MySQL 1000–9999 year range. POSIX timegm() was considered but fails for tm_year < 0 on macOS; Ruby's own timegm_noleapsecond() handles that case but is a static internal function.
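The civil-date-to-epoch step can be sketched as follows. This is a minimal illustration of Hinnant's published routine (which he names days_from_civil), not the code in cast.c; the signature of civil_to_epoch_utc() here is an assumption, and the sketch performs no range validation on its inputs.

```c
#include <stdint.h>

/* Days since 1970-01-01 for a proleptic Gregorian date.
 * Follows Howard Hinnant's days_from_civil; m is 1..12, d is 1..31. */
static inline int64_t days_from_civil(int64_t y, unsigned m, unsigned d)
{
    y -= (m <= 2);                                  /* treat Jan/Feb as end of previous year */
    const int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = (unsigned)(y - era * 400); /* year of era: [0, 399] */
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1; /* [0, 365] */
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           /* [0, 146096] */
    return era * 146097 + (int64_t)doe - 719468;    /* shift so 1970-01-01 is day 0 */
}

/* Assumed signature: seconds since the Unix epoch for a UTC civil time. */
static inline int64_t civil_to_epoch_utc(int year, int month, int day,
                                         int hour, int min, int sec)
{
    return days_from_civil(year, (unsigned)month, (unsigned)day) * 86400
         + (int64_t)hour * 3600 + (int64_t)min * 60 + sec;
}
```

The routine uses only integer arithmetic and no libc calendar functions, which is why it sidesteps the platform differences in timegm() mentioned above.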

What Changed

Byte-arithmetic parsing

New helpers in cast.c inspired by Go's bToi, parseByte2Digits, parseByteYear, parseByteNanoSec:

| Function | Purpose | Go equivalent |
|---|---|---|
| byte_to_digit(b) | Single ASCII char → int | bToi() |
| parse_2digits(p) | 2-byte parse (month, day, hour, min, sec) | parseByte2Digits() |
| parse_4digits(p) | 4-byte year parse | parseByteYear() |
| parse_microseconds(p, len) | 1-6 fractional digits → microseconds | parseByteNanoSec() |
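
As a rough illustration of what such helpers might look like, here is a sketch. The function names come from the table above, but the bodies, return types, and error conventions are assumptions rather than the code in cast.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Convert one ASCII digit to its value; returns -1 for a non-digit. */
static inline int byte_to_digit(char b)
{
    return (b >= '0' && b <= '9') ? b - '0' : -1;
}

/* Parse exactly two digits at p; false if either byte is not a digit. */
static inline bool parse_2digits(const char *p, int *out)
{
    int hi = byte_to_digit(p[0]);
    int lo = byte_to_digit(p[1]);
    if (hi < 0 || lo < 0) return false;
    *out = hi * 10 + lo;
    return true;
}

/* Parse exactly four digits (a year) as two 2-digit halves. */
static inline bool parse_4digits(const char *p, int *out)
{
    int hi, lo;
    if (!parse_2digits(p, &hi) || !parse_2digits(p + 2, &lo)) return false;
    *out = hi * 100 + lo;
    return true;
}

/* Parse 1-6 fractional digits into microseconds, or return -1 on bad input.
 * A descending multiplier (100000 -> 1) right-pads short fractions
 * arithmetically, so ".5" becomes 500000 without any string padding. */
static inline int parse_microseconds(const char *p, size_t len)
{
    int multiplier = 100000;
    int usec = 0;
    if (len < 1 || len > 6) return -1;
    for (size_t i = 0; i < len; i++) {
        int digit = byte_to_digit(p[i]);
        if (digit < 0) return -1;
        usec += digit * multiplier;
        multiplier /= 10;
    }
    return usec;
}
```

None of these helpers consult the locale or interpret a format string; each is a handful of comparisons and multiplications on bytes at known positions.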

Replaced parsing blocks:

  • DATETIME/TIMESTAMP: Length-based dispatch (10/19/21-26 bytes) with direct byte access at known offsets
  • DATE: Exact 10-byte parse
  • TIME: 8-15 byte parse with fractional seconds
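
The length-based dispatch for DATETIME/TIMESTAMP values could look roughly like the sketch below. The struct, the function names, and the rejection behavior are hypothetical; only the accepted lengths (10, 19, 21-26) and the fixed offsets of "YYYY-MM-DD HH:MM:SS.ffffff" come from the description above. Field ranges (for example month 13) are not validated here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the parsed fields. */
struct datetime_parts { int year, month, day, hour, min, sec, usec; };

/* Parse two ASCII digits at p; false if either byte is not a digit. */
static inline bool two_digits(const char *p, int *out)
{
    if (p[0] < '0' || p[0] > '9' || p[1] < '0' || p[1] > '9') return false;
    *out = (p[0] - '0') * 10 + (p[1] - '0');
    return true;
}

static inline bool parse_datetime(const char *s, size_t len, struct datetime_parts *dt)
{
    static const int scale[] = {100000, 10000, 1000, 100, 10, 1};
    int century, year_in_century;

    *dt = (struct datetime_parts){0};

    /* The length alone decides which fields are present. */
    if (len != 10 && len != 19 && (len < 21 || len > 26)) return false;

    /* Date part at offsets 0..9: "YYYY-MM-DD". */
    if (!two_digits(s, &century) || !two_digits(s + 2, &year_in_century) ||
        s[4] != '-' || !two_digits(s + 5, &dt->month) ||
        s[7] != '-' || !two_digits(s + 8, &dt->day)) return false;
    dt->year = century * 100 + year_in_century;
    if (len == 10) return true;

    /* Time part at offsets 10..18: " HH:MM:SS". */
    if (s[10] != ' ' || !two_digits(s + 11, &dt->hour) ||
        s[13] != ':' || !two_digits(s + 14, &dt->min) ||
        s[16] != ':' || !two_digits(s + 17, &dt->sec)) return false;
    if (len == 19) return true;

    /* Fraction at offsets 19..25: "." followed by 1-6 digits. */
    if (s[19] != '.') return false;
    for (size_t i = 20; i < len; i++) {
        if (s[i] < '0' || s[i] > '9') return false;
        dt->usec += (s[i] - '0') * scale[i - 20];
    }
    return true;
}
```

Because every field sits at a fixed offset, the parser never scans for delimiters; it only checks that the expected separator byte is where it should be.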

C-level Time construction

| Before | After |
|---|---|
| rb_funcall(rb_cTime, id_utc, 7, INT2NUM(y), ...) | rb_time_timespec_new(&ts, INT_MAX-1) |
| Method dispatch + 7 VALUE args + time_arg() + timegmw() | Direct epoch → Time allocation |

  • civil_to_epoch_utc() — Hinnant algorithm for UTC (portable, no timegm)
  • mktime() — standard C for local time with DST handling
  • trilogy_make_time() — unified helper for both DATETIME and TIME paths
  • Removed unused id_local, id_localtime, id_utc
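
For the local-time path, a sketch of how parsed components might be turned into the struct timespec that rb_time_timespec_new() receives is shown below. The helper name local_to_timespec() is hypothetical, and the Ruby call itself is left out so the example compiles without ruby.h.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: convert local civil time plus microseconds into a
 * struct timespec, letting the C library resolve the timezone and DST. */
static inline bool local_to_timespec(int year, int month, int day,
                                     int hour, int min, int sec, int usec,
                                     struct timespec *ts)
{
    struct tm tm = {0};
    tm.tm_year = year - 1900;  /* struct tm counts years from 1900 */
    tm.tm_mon = month - 1;     /* and months from 0 */
    tm.tm_mday = day;
    tm.tm_hour = hour;
    tm.tm_min = min;
    tm.tm_sec = sec;
    tm.tm_isdst = -1;          /* unknown: let mktime() decide whether DST applies */
    tm.tm_wday = -1;           /* sentinel: mktime() overwrites this on success */

    time_t t = mktime(&tm);
    /* (time_t)-1 is also a valid timestamp, so use the sentinel to detect failure. */
    if (t == (time_t)-1 && tm.tm_wday == -1) return false;

    ts->tv_sec = t;
    ts->tv_nsec = (long)usec * 1000L;  /* microseconds to nanoseconds */
    return true;
}
```

Setting tm_isdst to -1 matters for local times: a fixed 0 or 1 would shift results by an hour on one side of every DST transition.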

Tests

20 new test cases covering:

  • Fractional seconds with 1-6 digits of precision (Go-ported)
  • Zero date/datetime handling
  • Specific date and datetime parsing validation
  • TIME column with microsecond precision
  • Epoch edge cases: Unix epoch zero, pre-1970 dates, leap year Feb 29, non-leap century (1900), far future (9999), MySQL minimum year (1000), leap year with local timezone

Profile Results

CPU-time isolated benchmark (per-row incremental casting cost, GC disabled):

Original (sscanf + rb_funcall):    ~208 ns/row
After byte-arithmetic parsing:      ~75 ns/row  (2.7× faster)
After rb_time_timespec_new:          ~3 ns/row  (72× faster)

The casting overhead is now indistinguishable from no-casting in benchmark-ips measurements — all datetime benchmarks report "same-ish: difference falls within error."

Files

| File | Change |
|---|---|
| contrib/ruby/ext/trilogy-ruby/cast.c | Replace sscanf + rb_funcall with byte-arithmetic + C-level Time API |
| contrib/ruby/test/cast_test.rb | Add 20 test cases (Go-ported + epoch edge cases) |

Test Results

45 runs, 164 assertions, 0 failures, 0 errors, 0 skips

byroot (Collaborator) commented Mar 18, 2026

> The remaining casting cost is dominated by rb_funcall(rb_cTime, id_utc, 7, ...) at ~106 ns (Ruby's Time object construction), which is irreducible without changing the API contract.

There are a number of C-level APIs to construct ruby time objects: https://github.com/ruby/ruby/blob/45dbc5a4a24cac30771b8c8353abbbfd35fa86b8/include/ruby/internal/intern/time.h

schlubbi (Author) commented

> The remaining casting cost is dominated by rb_funcall(rb_cTime, id_utc, 7, ...) at ~106 ns (Ruby's Time object construction), which is irreducible without changing the API contract.
>
> There are a number of C-level APIs to construct ruby time objects: https://github.com/ruby/ruby/blob/45dbc5a4a24cac30771b8c8353abbbfd35fa86b8/include/ruby/internal/intern/time.h

Thanks for the pointer! Addressed in a81c3e6.

schlubbi and others added 3 commits March 18, 2026 17:54
Add script/benchmark_datetime with two measurement strategies:

1. benchmark/ips comparison: cast vs no-cast for each column type
   (DATETIME, DATE, TIME) across 8192 rows
2. CPU-time micro-benchmark: uses Process.clock_gettime(CLOCK_PROCESS_CPUTIME_ID)
   with GC disabled to isolate parsing cost from network I/O

The benchmark revealed that sscanf-based datetime parsing costs ~208 ns/row
in CPU time, while Ruby Time.utc construction costs ~1,100 ns total.
Network-bound IPS benchmarks mask the difference (~65ms/query I/O dominates).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace all three sscanf()-based datetime parsing blocks (DATETIME/TIMESTAMP,
DATE, TIME) in cast.c with inline byte-arithmetic parsers inspired by Go's
go-sql-driver/mysql parseDateTime() implementation.

The key insight from Go's approach: sscanf() is a general-purpose format
parser with overhead from format string interpretation, locale handling, and
variadic argument processing. Direct byte arithmetic on the wire buffer
eliminates all of this.

Changes:
- Add 4 static inline helpers: byte_to_digit, parse_2digits, parse_4digits,
  parse_microseconds (inspired by Go's bToi, parseByte2Digits, parseByteYear,
  parseByteNanoSec from utils.go:208-228)
- DATETIME/TIMESTAMP: length-based dispatch (10/19/21-26 bytes) with direct
  byte access at known offsets, replacing sscanf + memcpy + string padding
- DATE: exact 10-byte parse, replacing sscanf + cstr_from_value
- TIME: 8-15 byte parse with fractional seconds, replacing sscanf
- Microsecond padding uses arithmetic (descending multiplier 100000->1)
  instead of string pad loop + atoi
- 13 new test cases ported from Go's TestParseDateTime (utils_test.go:352-520)
  covering fractional 1-6 digits, zero dates, specific date/datetime values,
  and TIME precision

Benchmark results (CPU-time isolated, 4096 rows x 200 iterations, GC disabled):
  Before (sscanf):          ~208 ns/row
  After (byte-arithmetic):   ~75 ns/row
  Improvement:              ~2.7x faster per-row datetime casting

Go reference implementation:
  https://github.com/go-sql-driver/mysql/blob/master/utils.go#L108-L228
  https://github.com/go-sql-driver/mysql/blob/master/utils_test.go#L352-L520

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use Ruby's C-level rb_time_timespec_new() API to construct Time objects
directly from epoch seconds, bypassing rb_funcall(rb_cTime, :utc, 7, ...)
method dispatch entirely.

This eliminates per-row:
- Ruby method lookup and dispatch overhead
- 7x INT2NUM VALUE creation
- Ruby's internal time_arg() argument unpacking
- Ruby's internal timegmw() calendar-to-epoch conversion

For UTC, uses Howard Hinnant's civil_to_days algorithm (the foundation
of C++20 std::chrono) to convert civil date components to epoch seconds.
This handles the full MySQL 1000-9999 year range, unlike POSIX timegm()
which fails for tm_year < 0 on macOS.

For local time, uses standard mktime() for system timezone resolution.

Adds 7 edge-case tests exercising the epoch conversion: Unix epoch zero,
pre-1970 dates, leap year Feb 29, non-leap century (1900), far future
(9999), MySQL minimum year (1000), and leap year with local timezone.

Per-row datetime casting overhead drops from ~75 ns to ~3 ns, making
cast-vs-no-cast indistinguishable in benchmark-ips measurements.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

byroot force-pushed the schlubbi/fast-datetime-parser branch from a81c3e6 to 641c629 on March 18, 2026 16:54

byroot (Collaborator) left a comment

I like that change a lot.

I think we should just remove a bit of boilerplate from the tests, and I'm not convinced it's really worth committing the benchmark. But other than that 👍

@composerinteralia any opinions?

byroot (Collaborator) commented Mar 18, 2026

Ah and I ran the extra tests on main, to make sure the behavior didn't change.

Either way, not having to do a method dispatch is a huge win.

Address review feedback: use existing trilogy_test columns (date_time_test,
date_test) instead of creating per-test tables for tests that don't need
custom column types. Keeps custom table only for DATETIME(6) and TIME(6)
fractional precision tests.

Remove benchmark script from the committed files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

schlubbi force-pushed the schlubbi/fast-datetime-parser branch from ee7ae3e to 6a80704 on March 18, 2026 19:36

schlubbi (Author) commented

> I like that change a lot.
>
> I think we should just remove a bit of boilerplate from the tests, and I'm not convinced it's really worth committing the benchmark. But other than that 👍

🎈
Bumped the precision of the existing datetime/time columns and am reusing them in the tests now. Benchmark script is gone from the PR too.
