clickhouse: fix date overflow for initial load #4534

Draft
dtunikov wants to merge 5 commits into main from fix/dbi-812/mysql-datetime-date-clickhouse-range

Conversation

@dtunikov

@dtunikov dtunikov commented Jul 3, 2026

Collaborator

No description provided.

@dtunikov dtunikov requested a review from a team as a code owner July 3, 2026 14:48
// Timestamps are micros since epoch; dates are days since epoch. Both encoded as Avro varint.
avroVal := t.UnixMicro()
if isDate {
	avroVal = t.Unix() / 86400

Collaborator Author

Had to move this here, out of the process[type] functions, because we clamp the t value for ClickHouse.

@claude

claude Bot commented Jul 3, 2026


Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@dtunikov dtunikov marked this pull request as draft July 3, 2026 15:05
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented Jul 3, 2026


❌ 8 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 2961 | 8 | 2953 | 346 |
View the top 3 failed test(s) by shortest run time
github.com/PeerDB-io/peerdb/flow/e2e::TestApiPg
Stack Traces | 0.01s run time
=== RUN   TestApiPg
=== PAUSE TestApiPg
=== CONT  TestApiPg
--- FAIL: TestApiPg (0.01s)
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteMySQL_CH
Stack Traces | 0.02s run time
=== RUN   TestPeerFlowE2ETestSuiteMySQL_CH
=== PAUSE TestPeerFlowE2ETestSuiteMySQL_CH
=== CONT  TestPeerFlowE2ETestSuiteMySQL_CH
--- FAIL: TestPeerFlowE2ETestSuiteMySQL_CH (0.02s)
2026/07/03 16:09:03 INFO Received AWS credentials from peer for connector: ci x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:09:03 INFO Received AWS credentials from peer for connector: clickhouse x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteMySQL_CH_Cluster
Stack Traces | 0.02s run time
=== RUN   TestPeerFlowE2ETestSuiteMySQL_CH_Cluster
=== PAUSE TestPeerFlowE2ETestSuiteMySQL_CH_Cluster
=== CONT  TestPeerFlowE2ETestSuiteMySQL_CH_Cluster
--- FAIL: TestPeerFlowE2ETestSuiteMySQL_CH_Cluster (0.02s)
2026/07/03 16:10:36 INFO Received AWS credentials from peer for connector: ci x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:10:36 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_machcl_56lliq6k.test_datetime
github.com/PeerDB-io/peerdb/flow/e2e::TestApiPg/TestCreateCDCFlowAttachIdempotentAfterContinueAsNew
Stack Traces | 7.35s run time
=== RUN   TestApiPg/TestCreateCDCFlowAttachIdempotentAfterContinueAsNew
=== PAUSE TestApiPg/TestCreateCDCFlowAttachIdempotentAfterContinueAsNew
=== CONT  TestApiPg/TestCreateCDCFlowAttachIdempotentAfterContinueAsNew
2026/07/03 16:09:22 INFO Received AWS credentials from peer for connector: clickhouse x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
    api_test.go:3558: WaitFor wait for flow to be running 2026-07-03 16:09:29.665184871 +0000 UTC m=+795.135166527
    api_test.go:3569: 
        	Error Trace:	.../flow/e2e/api_test.go:3569
        	            				.../hostedtoolcache/go/1.26.4.../src/runtime/asm_amd64.s:1771
        	Error:      	"1" is not greater than "1"
        	Test:       	TestApiPg/TestCreateCDCFlowAttachIdempotentAfterContinueAsNew
        	Messages:   	Should have multiple executions (continue-as-new happened)
    api_test.go:51: begin tearing down postgres schema api_y44hy1gm
--- FAIL: TestApiPg/TestCreateCDCFlowAttachIdempotentAfterContinueAsNew (7.35s)
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteMariaDB_CH/Test_MySQL_DateTime_ClickHouse_Range
Stack Traces | 20.1s run time
=== RUN   TestPeerFlowE2ETestSuiteMariaDB_CH/Test_MySQL_DateTime_ClickHouse_Range
=== PAUSE TestPeerFlowE2ETestSuiteMariaDB_CH/Test_MySQL_DateTime_ClickHouse_Range
=== CONT  TestPeerFlowE2ETestSuiteMariaDB_CH/Test_MySQL_DateTime_ClickHouse_Range
2026/07/03 16:11:19 INFO Received AWS credentials from peer for connector: ci x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:11:19 INFO Received AWS credentials from peer for connector: clickhouse x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:11:19 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_mach_74o2rt2x.test_schema_as_column
    clickhouse_mysql_test.go:1802: WaitFor waiting on snapshot 2026-07-03 16:11:23.762250537 +0000 UTC m=+854.019837888
    clickhouse_mysql_test.go:1806: WaitFor waiting on cdc 2026-07-03 16:11:23.767628421 +0000 UTC m=+854.025215752
2026/07/03 16:11:24 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_mach_xlou416q.test_mysql_schema_changes
    clickhouse_mysql_test.go:1822: 
        	Error Trace:	.../flow/e2e/clickhouse_mysql_test.go:1822
        	            				.../flow/e2e/clickhouse_mysql_test.go:1837
        	            				.../hostedtoolcache/go/1.26.4.../src/runtime/asm_amd64.s:1771
        	Error:      	Not equal: 
        	            	expected: "2299-12-31 05:30:45"
        	            	actual  : "1715-06-12 05:56:11.290448"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-2299-12-31 05:30:45
        	            	+1715-06-12 05:56:11.290448
        	Test:       	TestPeerFlowE2ETestSuiteMariaDB_CH/Test_MySQL_DateTime_ClickHouse_Range
--- FAIL: TestPeerFlowE2ETestSuiteMariaDB_CH/Test_MySQL_DateTime_ClickHouse_Range (20.15s)
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteMySQL_CH/Test_MySQL_DateTime_ClickHouse_Range
Stack Traces | 20.1s run time
=== RUN   TestPeerFlowE2ETestSuiteMySQL_CH/Test_MySQL_DateTime_ClickHouse_Range
=== PAUSE TestPeerFlowE2ETestSuiteMySQL_CH/Test_MySQL_DateTime_ClickHouse_Range
=== CONT  TestPeerFlowE2ETestSuiteMySQL_CH/Test_MySQL_DateTime_ClickHouse_Range
2026/07/03 16:07:32 INFO Received AWS credentials from peer for connector: ci x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:07:32 INFO Received AWS credentials from peer for connector: clickhouse x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:07:32 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_mych_r4luq3vm.test_float
    clickhouse_mysql_test.go:1802: WaitFor waiting on snapshot 2026-07-03 16:07:36.249993597 +0000 UTC m=+626.507580908
    clickhouse_mysql_test.go:1806: WaitFor waiting on cdc 2026-07-03 16:07:36.25501873 +0000 UTC m=+626.512606051
2026/07/03 16:07:36 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_mych_r4luq3vm.test_float
    clickhouse_mysql_test.go:1822: 
        	Error Trace:	.../flow/e2e/clickhouse_mysql_test.go:1822
        	            				.../flow/e2e/clickhouse_mysql_test.go:1837
        	            				.../hostedtoolcache/go/1.26.4.../src/runtime/asm_amd64.s:1771
        	Error:      	Not equal: 
        	            	expected: "2299-12-31 05:30:45"
        	            	actual  : "1715-06-12 05:56:11.290448"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-2299-12-31 05:30:45
        	            	+1715-06-12 05:56:11.290448
        	Test:       	TestPeerFlowE2ETestSuiteMySQL_CH/Test_MySQL_DateTime_ClickHouse_Range
--- FAIL: TestPeerFlowE2ETestSuiteMySQL_CH/Test_MySQL_DateTime_ClickHouse_Range (20.15s)
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
Stack Traces | 20.3s run time
=== RUN   TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
=== PAUSE TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
=== CONT  TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
2026/07/03 16:10:37 INFO Received AWS credentials from peer for connector: ci x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:10:37 INFO Received AWS credentials from peer for connector: clickhouse x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:10:38 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_machcl_cupyfhmp.test_update_pkey_chunking_initial_load_enabled
    clickhouse_mysql_test.go:1802: WaitFor waiting on snapshot 2026-07-03 16:10:42.907809868 +0000 UTC m=+813.165397199
    clickhouse_mysql_test.go:1806: WaitFor waiting on cdc 2026-07-03 16:10:42.915057578 +0000 UTC m=+813.172644909
2026/07/03 16:10:42 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_machcl_t2swgqqc.test_unsigned
    clickhouse_mysql_test.go:1822: 
        	Error Trace:	.../flow/e2e/clickhouse_mysql_test.go:1822
        	            				.../flow/e2e/clickhouse_mysql_test.go:1837
        	            				.../hostedtoolcache/go/1.26.4.../src/runtime/asm_amd64.s:1771
        	Error:      	Not equal: 
        	            	expected: "2299-12-31 05:30:45"
        	            	actual  : "1715-06-12 05:56:11.290448"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-2299-12-31 05:30:45
        	            	+1715-06-12 05:56:11.290448
        	Test:       	TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
--- FAIL: TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range (20.26s)
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteMySQL_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
Stack Traces | 20.3s run time
=== RUN   TestPeerFlowE2ETestSuiteMySQL_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
=== PAUSE TestPeerFlowE2ETestSuiteMySQL_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
=== CONT  TestPeerFlowE2ETestSuiteMySQL_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
2026/07/03 16:09:26 INFO Received AWS credentials from peer for connector: ci x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
2026/07/03 16:09:26 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_mychcl_jwd2c3fw.test_txn_payload
2026/07/03 16:09:26 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_mychcl_49ep7gm6.test_datetime
2026/07/03 16:09:26 INFO Received AWS credentials from peer for connector: clickhouse x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN}
    clickhouse_mysql_test.go:1802: WaitFor waiting on snapshot 2026-07-03 16:09:31.521235768 +0000 UTC m=+741.778823090
    clickhouse_mysql_test.go:1806: WaitFor waiting on cdc 2026-07-03 16:09:31.529101978 +0000 UTC m=+741.786689309
2026/07/03 16:09:31 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_mychcl_49ep7gm6.test_datetime
    clickhouse_mysql_test.go:1822: 
        	Error Trace:	.../flow/e2e/clickhouse_mysql_test.go:1822
        	            				.../flow/e2e/clickhouse_mysql_test.go:1837
        	            				.../hostedtoolcache/go/1.26.4.../src/runtime/asm_amd64.s:1771
        	Error:      	Not equal: 
        	            	expected: "2299-12-31 05:30:45"
        	            	actual  : "1715-06-12 05:56:11.290448"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-2299-12-31 05:30:45
        	            	+1715-06-12 05:56:11.290448
        	Test:       	TestPeerFlowE2ETestSuiteMySQL_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range
--- FAIL: TestPeerFlowE2ETestSuiteMySQL_CH_Cluster/Test_MySQL_DateTime_ClickHouse_Range (20.26s)
View the full list of 2 ❄️ flaky test(s)
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteMariaDB_CH

Flake rate in main: 14.29% (Passed 24 times, Failed 4 times)

Stack Traces | 0.02s run time
=== RUN   TestPeerFlowE2ETestSuiteMariaDB_CH
=== PAUSE TestPeerFlowE2ETestSuiteMariaDB_CH
=== CONT  TestPeerFlowE2ETestSuiteMariaDB_CH
--- FAIL: TestPeerFlowE2ETestSuiteMariaDB_CH (0.02s)
2026/07/03 16:12:34 INFO fetched schema x-peerdb-additional-metadata={Operation:FLOW_OPERATION_UNKNOWN} table=e2e_test_api_1wzkvytp.t1
github.com/PeerDB-io/peerdb/flow/e2e::TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster

Flake rate in main: 14.29% (Passed 24 times, Failed 4 times)

Stack Traces | 0.02s run time
=== RUN   TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster
=== PAUSE TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster
=== CONT  TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster
--- FAIL: TestPeerFlowE2ETestSuiteMariaDB_CH_Cluster (0.02s)


@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

❌ Test Failure

Analysis: A deterministic assertion failure in Test_MySQL_DateTime_ClickHouse_Range (expected nil but got 1900-01-01) reproduces identically across all MySQL/MariaDB matrix variants, indicating the PR's own date-overflow fix is incomplete rather than a flaky failure.
Confidence: 0.97

⚠️ This appears to be a real bug - manual intervention needed

View workflow run

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

❌ Test Failure

Analysis: Real bug: the PR's own test Test_MySQL_DateTime_ClickHouse_Range deterministically fails across all 4 CH suites and both matrix jobs with a clear value mismatch (out-of-range nullable DATE replicated as 1900-01-01 instead of NULL), with no timeout/race/network signals.
Confidence: 0.97

⚠️ This appears to be a real bug - manual intervention needed

View workflow run

parseDateTime64BestEffortOrNull (used by the CDC normalize path) saturates
out-of-range values to ClickHouse's [1900, 2299] boundary: it clamps the date,
preserves the time-of-day, and never returns NULL for them. Make the
snapshot/Avro path do exactly the same so snapshot and CDC stay consistent for
the same source value, regardless of column nullability.

Verified the Go clamp matches parseDateTime64BestEffortOrNull output exactly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

❌ Test Failure

Analysis: The PR's own new test Test_MySQL_DateTime_ClickHouse_Range fails deterministically with a logic assertion ("expected out-of-range nullable DATE to be NULL") across all 4 ClickHouse suite variants and all 3 matrix jobs, indicating the date-overflow fix is genuinely broken rather than flaky.
Confidence: 0.97

⚠️ This appears to be a real bug - manual intervention needed

View workflow run

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

❌ Test Failure

Analysis: Real bug: Test_MySQL_DateTime_ClickHouse_Range fails deterministically across every matrix leg with an identical out-of-range datetime overflow (expected 2299-12-31 but got wrapped-around 1715-06-12), indicating a MySQL→ClickHouse datetime range-boundary conversion defect, not a flaky/timeout/race failure.
Confidence: 0.95

⚠️ This appears to be a real bug - manual intervention needed

View workflow run
