fix(mysql): align JSON representation between snapshot and CDC#4535
fix(mysql): align JSON representation between snapshot and CDC#4535dtunikov wants to merge 5 commits into
Conversation
MySQL JSON columns reached PeerDB through two different renderers that disagreed on representation: - Snapshot (text protocol): MySQL server-rendered JSON, type-faithful (DOUBLE 1.0 stays "1.0", JSONB key order preserved) but with a space after ':' and ','. - CDC (binlog JSONB decoder): historically lossy -- 1.0 collapsed to 1, keys reordered by Go map iteration. Enable RenderJSONAsMySQLText on the binlog syncer so CDC emits type-faithful, key-order-preserving, compact JSON, and json.Compact the snapshot text form so both paths are byte-consistent. MariaDB JSON is LONGTEXT-backed (QValueKindString), so it never hits the compaction path and is unaffected. Adds a unit test for the compaction helper and an e2e test asserting the snapshot and CDC representations of the same JSON value match. DBI-823 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| DisableRetrySync: true, | ||
| UseDecimal: true, | ||
| ParseTime: true, | ||
| RenderJSONAsMySQLText: true, |
There was a problem hiding this comment.
this one was added (really weird git diff because of go fmt)
we can also put it behind mirror version gate to ensure that the old mirrors continue working the same way (in case someone relies on default parsing logic..)
❌ 4 Tests Failed:
View the top 3 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
❌ Test FailureAnalysis: The PR's own new test Test_MySQL_JSON_SnapshotCDCConsistency deterministically fails with "Not equal: unexpected JSON representation at row 1" on the mysql-pos config (passing on both mysql-gtid configs), indicating the JSON snapshot/CDC mismatch the PR aims to fix is still present on that path — a real, config-dependent bug, not flakiness. |
❌ Test FailureAnalysis: A deterministic value-equality assertion ("unexpected JSON representation at row 1") fails in the very test (Test_MySQL_JSON_SnapshotCDCConsistency) that validates this PR's own fix for JSON snapshot/CDC alignment, indicating the fix is incomplete rather than a flaky failure. |
❌ Test FailureAnalysis: The PR's own new test Test_MySQL_JSON_SnapshotCDCConsistency deterministically fails a fixed JSON value assertion (expected "1.0", got "1") in both standalone and cluster variants, a real snapshot/CDC representation-mismatch bug the PR aims to fix — not a timeout, race, or network flake. |
MySQL normalizes whole-number JSON doubles (1.0 -> 1) on storage, so the previous exact-literal assertion was wrong. Rewrite the e2e test around the invariant that actually matters for DBI-823: the snapshot and CDC representations of the same JSON value must be byte-identical. Center the variants on object key ordering, which is where the old CDC path (Go-map lexicographic order) diverged from MySQL's (length-then-byte) order. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
❌ Test FailureAnalysis: A deterministic value mismatch in Test_MySQL_JSON_SnapshotCDCConsistency (JSON float serialized as "1.0000005e+06" instead of "1000000.5") that reproduced identically across multiple matrix jobs and directly reflects the unfixed bug this PR targets — not a flaky failure. |
❌ Test FailureAnalysis: A real, deterministic failure: the PR's own new test Test_MySQL_JSON_SnapshotCDCConsistency fails a JSON float-representation equality assertion identically across both MySQL_CH and MySQL_CH_Cluster variants, indicating the PR's snapshot/CDC JSON-alignment fix is incomplete rather than any flakiness. |
Problem (DBI-823)
MySQL
JSONcolumns reached PeerDB through two different renderers that disagreed on representation:DOUBLE 1.0stays1.0, JSONB key order preserved) but with a space after:and,(e.g.{"a": 1.0}).1.0collapsed to1, object keys reordered by Go map iteration.The same logical value therefore landed in the destination with different text depending on whether it arrived via initial load or CDC — a real mismatch when JSON is stored as a
String(the default in ClickHouse).Fix
RenderJSONAsMySQLText: trueon the binlog syncer (startSyncer). The pinned go-mysql fork's decoder then emits type-faithful, key-order-preserving, compact JSON.json.Compactthe snapshot text-protocol form so it matches the compact CDC output byte for byte.json.Compactonly elides structural whitespace — number literals (1.0), string bytes, and key order are preserved; invalid JSON falls back to the raw text.MariaDB is unaffected: its
JSONisLONGTEXT-backed, so values map toQValueKindString/Bytesand never hit the JSON compaction path, andRenderJSONAsMySQLTextonly affects MySQL JSONB decoding.Residual (documented) corners
Full byte-identity still can't hold for a couple of go-mysql-documented cases: float exponent form (
1.5e-05vs MySQL's1.5e-5) and theNEWDECIMALopaque type tag. The numeric values round-trip; only the text differs.Tests
TestCompactMySQLJSON— unit test for the compaction helper (whitespace elision,1.0preservation, key-order preservation, in-string whitespace, invalid-JSON fallback).Test_MySQL_JSON_SnapshotCDCConsistency— e2e test inserting JSON variants during both snapshot and CDC, asserting the destination representations match (and, for MySQL, equal the expected compact type-faithful form).🤖 Generated with Claude Code