PG19: port ruleutils into Citus as ruleutils_19.c#8602
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## pg19-support #8602 +/- ##
===============================================
Coverage ? 88.75%
===============================================
Files ? 288
Lines ? 64316
Branches ? 8091
===============================================
Hits ? 57086
Misses ? 4890
Partials ? 2340 🚀 New features to boost your workflow:
|
c1df190 to
ab238fc
Compare
585c6a7 to
23f44f2
Compare
There was a problem hiding this comment.
Pull request overview
Ports PostgreSQL 19’s ruleutils.c into Citus’ deparser tree (ruleutils_19.c) and extends the existing PG-compat layer so Citus can compile and deparse correctly against a PG19 backend (as part of the PG19 support work tracked in #8597).
Changes:
- Add upstream-derived
ruleutils_19.cand exclude it fromcitus-styleformatting rules. - Extend PG19 compatibility shims and conditional compilation across planner/deparser/executor/lock-graph/columnar code paths.
- Update CI/build/test-version matrices and version constants to recognize PG19 (and define PG20 constant for guards).
Reviewed changes
Copilot reviewed 75 out of 77 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/test/regress/citus_tests/common.py | Update oldest-supported Citus version mapping for PG18/PG19. |
| src/include/pg_version_constants.h | Add PG_VERSION_20 constant for version-guarding. |
| src/include/pg_version_compat.h | Add PG19 compatibility shims and central includes. |
| src/include/distributed/version_compat.h | Add missing includes needed under PG19 header/include changes. |
| src/include/distributed/stats/stat_tenants.h | Include PG19-required headers and compat header. |
| src/include/distributed/shardsplit_shared_memory.h | Include shmem/lwlock headers and compat header for PG19. |
| src/include/distributed/resource_lock.h | Bump static assertion upper bound to PG19. |
| src/include/distributed/executor_util.h | Include compat header for PG19-related definitions. |
| src/include/distributed/distributed_planner.h | Add PG19 ExplainState* parameter to planner hook prototype. |
| src/include/distributed/deparser.h | Include compat header to support PG19-related deparser types/macros. |
| src/include/columnar/columnar_version_compat.h | Add PG19 PageSetChecksumInplace rename shim for columnar. |
| src/include/citus_config.h.in | Refresh config header template macros/comments. |
| src/backend/distributed/utils/tuplestore.c | Add explicit tuplestore.h include for PG19 include cleanup. |
| src/backend/distributed/utils/shard_utils.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/utils/resource_lock.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/utils/replication_origin_session_utils.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/utils/reference_table_utils.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/utils/namespace_utils.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/utils/maintenanced.c | Adjust shmem init call site (size argument) for PG19 behavior. |
| src/backend/distributed/utils/log_utils.c | Use PG19 log_min_messages per-backend-type array behavior. |
| src/backend/distributed/utils/foreign_key_relationship.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/utils/enable_ssl.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/utils/directory.c | Add storage/fd.h include for PG19 include cleanup. |
| src/backend/distributed/utils/citus_nodefuncs.c | Handle new PG19 RTE kind (RTE_GRAPH_TABLE). |
| src/backend/distributed/utils/cancel_utils.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/utils/background_jobs.c | Fix empty-parameter prototypes and use pg_fallthrough. |
| src/backend/distributed/transaction/worker_transaction.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/transaction/transaction_management.c | Fix function pointer prototype and empty-parameter prototype ((void)). |
| src/backend/distributed/transaction/relation_access_tracking.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/transaction/lock_graph.c | Adjust PGPROC wait-queue link member name for PG19. |
| src/backend/distributed/transaction/backend_data.c | Mark trancheName unused under PG19 tranche API changes. |
| src/backend/distributed/test/run_from_same_connection.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/test/fake_am.c | Update AM callback signatures for PG19 API changes. |
| src/backend/distributed/stats/stat_tenants.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/shared_library_init.c | Guard removed PG19 hook, refactor shmem request sizing and PG19-specific slack, update enum lookup helper. |
| src/backend/distributed/shardsplit/shardsplit_shared_memory.c | Fix empty-parameter prototypes and mark trancheName unused under PG19 tranche API changes. |
| src/backend/distributed/shardsplit/shardsplit_decoder.c | Provide InvalidRepOriginId fallback define when missing. |
| src/backend/distributed/relay/relay_event_utility.c | Adjust Cluster/RepackStmt relation name field access under PG19. |
| src/backend/distributed/planner/tdigest_extension.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/planner/multi_router_planner.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/planner/multi_physical_planner.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/planner/multi_logical_optimizer.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/planner/multi_explain.c | Use NodeInstrumentation on PG19; adjust memset sizing. |
| src/backend/distributed/planner/intermediate_result_pruning.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/planner/distributed_planner.c | Add PG19 ExplainState* parameter to planner hook implementation signature. |
| src/backend/distributed/operations/worker_node_manager.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/operations/shard_split.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/operations/shard_rebalancer.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/operations/shard_cleaner.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/operations/node_protocol.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/metadata/pg_get_object_address_16_17_18.c | Add PG19 OBJECT_PROPGRAPH handling behind version guard. |
| src/backend/distributed/metadata/node_metadata.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/metadata/metadata_utility.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/metadata/metadata_sync.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/metadata/metadata_cache.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/executor/multi_executor.c | Switch to InstrStart/InstrStop on PG19; other PG19 executor compat adjustments. |
| src/backend/distributed/deparser/deparse_table_stmts.c | Replace comment fallthrough with pg_fallthrough. |
| src/backend/distributed/connection/connection_configuration.c | Fix empty-parameter prototypes ((void)). |
| src/backend/distributed/commands/utility_hook.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/commands/table.c | Replace comment fallthrough with pg_fallthrough. |
| src/backend/distributed/commands/schema.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/commands/multi_copy.c | Replace comment fallthrough with pg_fallthrough. |
| src/backend/distributed/commands/extension.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/commands/dependencies.c | Fix empty-parameter prototype ((void)). |
| src/backend/distributed/commands/cluster.c | Adjust Cluster/RepackStmt relation access under PG19. |
| src/backend/distributed/cdc/cdc_decoder.c | Include compat header; fix empty-parameter prototype ((void)). |
| src/backend/distributed/cdc/cdc_decoder_utils.c | Fix empty-parameter prototype ((void)). |
| src/backend/columnar/columnar.c | Fix empty-parameter prototype ((void)). |
| src/backend/columnar/columnar_tableam.c | PG19 AM/vacuum/analyze/pgstat API adjustments; include fixes; detoast macro fix. |
| src/backend/columnar/columnar_reader.c | Fix empty-parameter prototype ((void)). |
| src/backend/columnar/columnar_metadata.c | Fix empty-parameter prototype ((void)). |
| src/backend/columnar/columnar_customscan.c | Guard removed PG19 get_relation_info_hook; fix empty-parameter prototype ((void)). |
| configure.ac | Allow PG19 in configure-time version check. |
| .github/workflows/build_and_test.yml | Update PG minor versions; add PG19 to build matrix and upgrade version list. |
| .gitattributes | Exclude ruleutils_19.c from citus-style. |
| src/backend/distributed/deparser/ruleutils_19.c | Add PG19-specific ruleutils port for correct PG19 deparse semantics. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (totalTime) | ||
| { | ||
| #if PG_VERSION_NUM >= PG_VERSION_19 | ||
| InstrStart(totalTime); | ||
| #else | ||
| InstrStartNode(totalTime); | ||
| #endif | ||
| } |
There was a problem hiding this comment.
Good catch on the timing field. In PG19beta1 the field is renamed to query_instr, but its type is still Instrumentation * (struct Instrumentation *query_instr; in executor/execdesc.h), so there's no incompatible-pointer-types issue and InstrStart/InstrStop take the same pointer type. Rather than version-split the declaration, I removed the fragile blanket #define totaltime query_instr and introduced a scoped QueryDescTotalTime(qd) accessor (per @colm-mchugh's related suggestion below), and converted the call sites in this function to it. Verified building clean under -Werror on PG17, PG18 and PG19beta1.
| #if PG_VERSION_NUM >= PG_VERSION_19 | ||
| currentProc = (PGPROC *) currentProc->waitLink.next; | ||
| #else | ||
| currentProc = (PGPROC *) currentProc->links.next; | ||
| #endif |
There was a problem hiding this comment.
Removed. The trailing currentProc = (PGPROC *) currentProc->...next; at the end of the dclist_foreach body was dead code — dclist_foreach reassigns the loop variable from the iterator on each iteration, so the manual advance had no effect. Dropped it on both the PG19 and pre-PG19 branches.
| (MaintenanceDaemonControlData *) ShmemInitStruct("Citus Maintenance Daemon", | ||
| MaintenanceDaemonShmemSize(), | ||
| sizeof( | ||
| MaintenanceDaemonControlData) | ||
| , | ||
| &alreadyInitialized); |
There was a problem hiding this comment.
Cleaned up. Extracted Size maintenanceDaemonControlSize = sizeof(MaintenanceDaemonControlData); into a local and passed that to ShmemInitStruct, which removes the awkward wrapped sizeof(...) argument and the dangling-comma layout.
| * relay_event_utility.c. | ||
| */ | ||
| typedef struct RepackStmt ClusterStmt; | ||
| #define T_ClusterStmt T_RepackStmt |
There was a problem hiding this comment.
Does PG19 emit both VACUUM FULL and REPACK as a T_RepackStmt ? If so how will Citus dispatch handling differentiate the CLUSTER variant, given that T_ClusterStmt is now aliased to T_RepackStmt ? The comment suggests that will be handled in Phase2, but just want to check at least that my understanding is correct. And also note that GetDistributeObjectOps() in distribute_object_ops.c is impacted as well.
There was a problem hiding this comment.
Your understanding is correct: on PG19 CLUSTER, REPACK and VACUUM FULL all parse to T_RepackStmt, and because the Phase-1 shim aliases T_ClusterStmt to T_RepackStmt, the current dispatch cannot yet differentiate the CLUSTER variant. And yes — GetDistributeObjectOps() in distribute_object_ops.c is impacted as well.
This PR only carries the source-level type shim so the tree compiles; the actual command discrimination and dispatch (including GetDistributeObjectOps) is tracked as a dedicated follow-up: #8613. I've referenced #8613 from the shim comment.
There was a problem hiding this comment.
Shouldn't this access of clusterStmt->relation also discriminate PG19 version (per lines 54-60) ?
There was a problem hiding this comment.
Line 41 is just a NULL check on the pointer (clusterStmt->relation == NULL), which is valid regardless of whether the field is a RangeVar * (≤PG18) or a VacuumRelation * (PG19) — both are pointers, so no version discrimination is needed there. The dereference that actually depends on the type difference is the RangeVarGetRelid(...) at lines 54-60, which is already version-guarded.
The deeper semantic question — distinguishing CLUSTER vs REPACK vs VACUUM FULL now that they share the T_RepackStmt tag — is tracked in #8613.
There was a problem hiding this comment.
Hm, this may conflict with other declarations - e.g. totaltime in multi_explain.c. Making it an explicit access is probably more future-proof, at the expense of more code:
#define QueryDescTotalTime(qd) ((qd)->query_instr)
Then, change QueryDesc->totaltime uses, such as those in multi_executor.c, to use this macro instead.
There was a problem hiding this comment.
Agreed — done exactly as suggested. Replaced the blanket #define totaltime query_instr (which, as you note, would clash with unrelated identifiers like the local totaltime in multi_explain.c) with #define QueryDescTotalTime(qd) ((qd)->query_instr) on PG19, plus a matching ((qd)->totaltime) on ≤PG18 so call sites are uniform, and converted the QueryDesc->totaltime uses in multi_executor.c to the macro.
There was a problem hiding this comment.
Nit: any reason not to use PG_CACHE_LINE_SIZE macro here ? And can the calculation be based on a derived bound instead of what basically looks like a magic constant ? If not, can we add a comment justifying why 128 KB is a good upper bound for the current allocation count.
There was a problem hiding this comment.
Updated. Replaced the 128 * 1024 magic value with 1024 * PG_CACHE_LINE_SIZE — i.e. one cache line of slack per allocation, for an upper bound of ~1024 ShmemInitStruct/ShmemInitHash allocations across Citus' shared-memory components — with a comment explaining the derivation.
| } | ||
|
|
||
|
|
||
| #define standard_conforming_strings (citus_pg19_standard_conforming_strings()) |
There was a problem hiding this comment.
Hm, I think this means PG19 is doing a hash lookup and strcmp() on every reference of standard_conforming_strings, which before was a variable access. Can we retain that behavior for PG19 ? So something like:
bool standardConformingStrings = citus_pg19_standard_conforming_strings();
There was a problem hiding this comment.
You're right, thanks. This is read once per character inside simple_quote_literal(), so on PG19 the compat shim was doing a GetConfigOption() hash lookup + strcmp for every character of every string literal being deparsed. Fixed by hoisting the read into a local before the loop:
bool standardConformingStrings = standard_conforming_strings;The value cannot change within a single deparse, so the local copy is safe, and PG19 now does a single lookup per literal instead of one per character. (On ≤PG18 this just copies the global bool, so no behavioural change there.)
| * to pass the real tranche name directly to LWLockNewTrancheId(). | ||
| * Tracked in #8609. | ||
| */ | ||
| #define LWLockNewTrancheId() ((LWLockNewTrancheId) ("citus-deferred")) |
There was a problem hiding this comment.
So in phase 2, each caller will pass the actual tranche name in?
There was a problem hiding this comment.
Yes — in the follow-up each caller will pass its real tranche name directly to LWLockNewTrancheId(name) (PG19 internalised tranche registration and dropped the separate LWLockRegisterTranche), so the temporary "citus-deferred" placeholder name in this shim goes away. Tracked in #8609; I've referenced it from the comment.
There was a problem hiding this comment.
pg19 will be added to downstream test jobs in phase 2 ?
There was a problem hiding this comment.
Correct — enabling the PG19 downstream test matrices (test-citus, -failure, -cdc, test-arbitrary-configs, and the pg-upgrade pairs 18→19 / 16→19) is intentionally a separate change that lands after this ruleutils PR, and it also depends on the the-process test images being built for pg19beta1. Tracked in #8615.
DESCRIPTION: Port upstream PG19 ruleutils.c into Citus deparser tree. Mirrors #8010 (PG18) and #7725 (PG17). Replaces the ruleutils_19.c placeholder from the build-foundation PR with a proper port of upstream PG19's src/backend/utils/adt/ruleutils.c, produced by a 3-way merge (git merge-file: ruleutils_18.c as base, upstream PG18 and upstream PG19 as the two sides). All merge conflicts fell in blocks Citus had already stripped from _18.c or in shard-aware deparse variants, and were resolved by keeping the Citus side. The file keeps Citus' existing curation (deparse hooks, shard-aware UPDATE/DELETE branches) while picking up real upstream PG18->PG19 deparse changes. Without this, Citus on a PG19 backend would deparse worker-bound SQL using PG18 semantics -- silent corruption the moment a PG19-only node, clause, or formatting decision appears in the tree. The trailing #endif guard text is corrected to "(PG_VERSION_NUM >= PG_VERSION_19) && (PG_VERSION_NUM < PG_VERSION_20)" (the _17.c/_18.c snapshots carry stale guard comments; not propagated). The generate_function_name fgc_flags fix is absorbed naturally by the merge -- upstream PG19 already passes &fgc_flags at that call site. Excluded from citus-style via .gitattributes (added in the build-foundation PR): the file exceeds the CI uncrustify 10,000-line limit and is upstream-derived. Refs: #8597
- Replace the blanket `#define totaltime query_instr` with a scoped QueryDescTotalTime() accessor (plus a matching pre-PG19 macro) so the rename no longer risks rewriting unrelated identifiers; convert the multi_executor.c call sites accordingly. - Drop the dead currentProc advance at the end of the dclist_foreach loop in AddEdgesForWaitQueue. - Pass a named Size local to ShmemInitStruct in MaintenanceDaemonShmemInit. - Reserve 1024 * PG_CACHE_LINE_SIZE (instead of a magic 128 * 1024) for the PG19 shmem alignment slack, with an explanatory comment. - Point the columnar get_relation_info_hook TODO and the RepackStmt / LWLock tranche comments at their tracking issues (#8614, #8613, #8609).
23f44f2 to
db92c2e
Compare
colm-mchugh
left a comment
There was a problem hiding this comment.
Lgtm, thanks for addressing the comments. Did you run make check-multi with a build using PG19beta1 ? Or at least create a Citus instance and run a few basic commands, just to verify against a PG19 backend that there is no runtime defect (ruleutils produces invalid SQL. incorrect offset to a field...)
Thanks @colm-mchugh. I did not run with this branch. This branch still crashes when we check-multi however the fix for the crash is elsewhere, see #8617 . That shows that ruleutils is health. PRs are broken down this way for ease of reviews. So, #8616 enables tests for pg19 and #8617 fixes the fundamental crash. After that two we have more visibility on further fixes needed for pg19. |
DESCRIPTION: Port upstream PG19 ruleutils.c into Citus deparser tree.
Mirrors #8010 (PG18) and #7725 (PG17). Replaces the ruleutils_19.c
placeholder from the build-foundation PR with a proper port of upstream
PG19's src/backend/utils/adt/ruleutils.c, produced by a 3-way merge
(git merge-file: ruleutils_18.c as base, upstream PG18 and upstream
PG19 as the two sides). All merge conflicts fell in blocks Citus had
already stripped from _18.c or in shard-aware deparse variants, and were
resolved by keeping the Citus side. The file keeps Citus' existing
curation (deparse hooks, shard-aware UPDATE/DELETE branches) while
picking up real upstream PG18->PG19 deparse changes.
Without this, Citus on a PG19 backend would deparse worker-bound SQL
using PG18 semantics -- silent corruption the moment a PG19-only node,
clause, or formatting decision appears in the tree.
The trailing #endif guard text is corrected to
"(PG_VERSION_NUM >= PG_VERSION_19) && (PG_VERSION_NUM < PG_VERSION_20)"
(the _17.c/_18.c snapshots carry stale guard comments; not propagated).
The generate_function_name fgc_flags fix is absorbed naturally by the
merge -- upstream PG19 already passes &fgc_flags at that call site.
Excluded from citus-style via .gitattributes (added in the
build-foundation PR): the file exceeds the CI uncrustify 10,000-line
limit and is upstream-derived.
Refs: #8597
Stacked on #8601 (
pg19-build-foundation).