cf 5556 by gburd · Pull Request #16 · gburd/postgres

gburd · 2025-10-24T18:01:53Z

No description provided.

- Hourly upstream sync from postgres/postgres (24x daily)
- AI-powered PR reviews using AWS Bedrock Claude Sonnet 4.5
- Multi-platform CI via existing Cirrus CI configuration
- Cost tracking and comprehensive documentation

Features:
- Automatic issue creation on sync conflicts
- PostgreSQL-specific code review prompts (C, SQL, docs, build)
- Cost limits: $15/PR, $200/month
- Inline PR comments with security/performance labels
- Skip draft PRs to save costs

Documentation:
- .github/SETUP_SUMMARY.md - Quick setup overview
- .github/QUICKSTART.md - 15-minute setup guide
- .github/PRE_COMMIT_CHECKLIST.md - Verification checklist
- .github/docs/ - Detailed guides for sync, AI review, Bedrock

See .github/README.md for complete overview

Phase 3: Windows Dependency Build System
- Implement full build workflow (OpenSSL, zlib, libxml2)
- Smart caching by version hash (80% cost reduction)
- Dependency bundling with manifest generation
- Weekly auto-refresh + manual triggers
- PowerShell download helper script
- Comprehensive usage documentation

Sync Workflow Fix:
- Allow .github/ commits (CI/CD config) on master
- Detect and reject code commits outside .github/
- Merge upstream while preserving .github/ changes
- Create issues only for actual pristine violations

Documentation:
- Complete Windows build usage guide
- Update all status docs to 100% complete
- Phase 3 completion summary

All three CI/CD phases complete (100%):
✅ Hourly upstream sync with .github/ preservation
✅ AI-powered PR reviews via Bedrock Claude 4.5
✅ Windows dependency builds with smart caching

Cost: $40-60/month total

See .github/PHASE3_COMPLETE.md for details

The sync workflow was failing because the 'dev setup v19' commit modifies files outside .github/. Updated workflows to recognize commits with messages starting with 'dev setup' as allowed on master.

Changes:
- Detect 'dev setup' commits by message pattern (case-insensitive)
- Allow merge if commits are .github/ OR dev setup OR both
- Update merge messages to reflect preserved changes
- Document pristine master policy with examples

This allows personal development environment commits (IDE configs, debugging tools, shell aliases, Nix configs, etc.) on master without violating the pristine mirror policy.

Future dev environment updates should start with 'dev setup' in the commit message to be automatically recognized and preserved.

See .github/docs/pristine-master-policy.md for complete policy
See .github/DEV_SETUP_FIX.md for fix summary
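The acceptance rule described above (a commit is allowed on master if it touches only .github/, or if its message starts with 'dev setup', or both) can be sketched as a small standalone check. This is only an illustration: the real logic lives in the workflow files, which are not shown here, and every function name below is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive prefix test; stops safely if s is shorter than prefix. */
bool
has_prefix_ci(const char *s, const char *prefix)
{
    for (; *prefix != '\0'; s++, prefix++)
    {
        if (tolower((unsigned char) *s) != tolower((unsigned char) *prefix))
            return false;
    }
    return true;
}

/* True if every changed path is under .github/ (vacuously true for none). */
bool
touches_only_github(const char *const *paths, size_t npaths)
{
    for (size_t i = 0; i < npaths; i++)
    {
        if (strncmp(paths[i], ".github/", strlen(".github/")) != 0)
            return false;
    }
    return true;
}

/* Hypothetical policy check: .github/-only OR 'dev setup' message OR both. */
bool
commit_allowed_on_master(const char *message,
                         const char *const *paths, size_t npaths)
{
    assert(message != NULL);
    return has_prefix_ci(message, "dev setup") ||
           touches_only_github(paths, npaths);
}
```

Under this sketch, a commit titled 'Dev Setup v19' that edits files under src/ is accepted, while an ordinary code commit touching the same files is rejected.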

Up until now, the only way for a loadable module to disable the use of a particular index was to use build_simple_rel_hook (or, previous to yesterday's commit, get_relation_info_hook) to remove it from the index list. While that works, it has some disadvantages.

First, the index becomes invisible for all purposes, and can no longer be used for optimizations such as self-join elimination or left join removal, which can severely degrade the resulting plan.

Second, if the module attempts to compel the use of a certain index by removing all other indexes from the index list and disabling other scan types, but the planner is unable to use the chosen index for some reason, it will fall back to a sequential scan, because that is only disabled, whereas the other indexes are, from the planner's point of view, completely gone. While this situation ideally shouldn't occur, it's hard for a loadable module to be completely sure whether the planner will view a certain index as usable for a certain query. If it isn't, it may be better to fall back to a scan using a disabled index rather than falling back to an also-disabled sequential scan.

Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: http://postgr.es/m/CA%2BTgmoYS4ZCVAF2jTce%3DbMP0Oq_db_srocR4cZyO0OBp9oUoGg%40mail.gmail.com

A list of expressions with optional AS-labels is useful in a few different places. Right now, this is available as xml_attribute_list because it was first used in the XMLATTRIBUTES construct, but it is already used elsewhere, and there are other possible future uses. To reduce possible confusion going forward, rename it to labeled_expr_list (like existing expr_list plus ColLabel).

Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org

Commit dae761a added initialization of some BrinBuildState fields in initialize_brin_buildstate(). Later, commit b437571 inadvertently added the same initialization again.

This commit removes that redundant initialization. No behavioral change is intended.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Discussion: https://postgr.es/m/CAEoWx2nmrca6-9SNChDvRYD6+r==fs9qg5J93kahS7vpoq8QVg@mail.gmail.com

There's no need for a StringInfo when all you want is a string being constructed in a single pass.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Yang Yuanzhuo <1197620467@qq.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/CAEudQAq2wyXZRdsh+wVHcOrungPU+_aQeQU12wbcgrmE0bQovA@mail.gmail.com

Previously heap_inplace_update_and_unlock() used an operation order similar to MarkBufferDirty(), to reduce the number of different approaches used for updating buffers. However, in an upcoming patch, MarkBufferDirtyHint() will switch to using the update protocol used by most other places (enabled by hint bits only being set while holding a share-exclusive lock).

Luckily it's pretty easy to adjust heap_inplace_update_and_unlock(). As a comment already foresaw, we can use the normal order, with the slight change of updating the buffer contents after WAL logging.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d

Add cost optimization to Windows dependency builds to avoid expensive builds when only pristine commits are pushed (dev setup commits or .github/ configuration changes).

Changes:
- Add check-changes job to detect pristine-only pushes
- Skip Windows builds when all commits are dev setup or .github/ only
- Add comprehensive cost optimization documentation
- Update README with cost savings (~40% reduction)

Expected savings: ~$3-5/month on Windows builds, ~$40-47/month total through combined optimizations.

Manual dispatch and scheduled builds always run regardless.

This commit introduces test infrastructure for verifying Heap-Only Tuple (HOT) update functionality in PostgreSQL. It provides a baseline for demonstrating and validating HOT update behavior.

Regression tests:
- Basic HOT vs non-HOT update decisions
- All-or-none property for multiple indexes
- Partial indexes and predicate handling
- BRIN (summarizing) indexes allowing HOT updates
- TOAST column handling with HOT
- Unique constraints behavior
- Multi-column indexes
- Partitioned table HOT updates

Isolation tests:
- HOT chain formation and maintenance
- Concurrent HOT update scenarios
- Index scan behavior with HOT chains

ExecGetAllUpdatedCols() misses attributes modified using heap_modify_tuple() that are not explicitly SET in the UPDATE or by triggers. This happens in one test (tsearch.sql) when the tsvector_update_trigger() is invoked and modifies an indexed attribute that isn't referenced in any SQL.

The net effect is that functions like HeapDetermineColumnsInfo() have to scan all indexed attributes for changes rather than being able to first reduce the indexed set by intersecting it with the set of attributes known to be potentially updated. While this isn't so bad, it is an oversight that could matter should someone in the future build some security-related feature using that incomplete result. Fixing it also might save a fraction of the overhead of calculating modified index attributes in heap_update().

This commit adds code to ExecBRUpdateTriggers() that identifies changes to indexed columns not found by ExecGetAllUpdatedCols() and adds those attributes to ri_extraUpdatedCols.

This commit introduces ExecCompareSlotAttrs() as a utility function to identify those attributes that have changed. It compares a subset of attributes between two TupleTableSlots and returns a Bitmapset of attributes that differ.

It would be nice to integrate this into HeapDetermineColumnsInfo(); however, that would be a layering violation given that it is within heap_update().
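To make the idea behind ExecCompareSlotAttrs() concrete, here is a toy model of "compare a subset of attributes between two slots and return the set that differ". It is a sketch under simplifying assumptions, not the patch's code: the slot is a fixed array of ints with null flags, the Bitmapset is a 64-bit mask, and the names ToySlot and toy_compare_slot_attrs are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NATTS 4

/* Toy stand-in for a TupleTableSlot: int-valued attributes plus null flags. */
typedef struct ToySlot
{
    int  values[TOY_NATTS];
    bool isnull[TOY_NATTS];
} ToySlot;

/*
 * Compare only the attributes whose bit is set in attrs_to_check and
 * return a mask of those that differ between the two slots.  A change
 * in null-ness counts as a difference; two NULLs count as equal.
 */
uint64_t
toy_compare_slot_attrs(const ToySlot *old_slot, const ToySlot *new_slot,
                       uint64_t attrs_to_check)
{
    uint64_t changed = 0;

    assert(old_slot != NULL && new_slot != NULL);

    for (int att = 0; att < TOY_NATTS; att++)
    {
        uint64_t bit = UINT64_C(1) << att;

        if ((attrs_to_check & bit) == 0)
            continue;           /* not in the requested subset */

        if (old_slot->isnull[att] != new_slot->isnull[att])
            changed |= bit;
        else if (!old_slot->isnull[att] &&
                 old_slot->values[att] != new_slot->values[att])
            changed |= bit;
    }
    return changed;
}
```

In this model, attributes changed by a trigger but absent from the SET list would show up as bits in the result that are not in the set of columns already known to be updated.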

Refactor executor update logic to determine which indexed columns have actually changed during an UPDATE operation rather than leaving this up to HeapDetermineColumnsInfo() in heap_update(). Finding this set of attributes is not heap-specific, but more general to all table AMs, and having this information in the executor could inform other decisions about when index inserts are required and when they are not, regardless of the table AM's MVCC implementation strategy. The heap-only tuple (HOT) decision in heap functions as it always has, but the determination of the "modified indexed attributes" (modified_idx_attrs, formerly known as modified_attrs) now happens in the executor.

ExecUpdateModifiedIdxAttrs() replaces HeapDetermineColumnsInfo() and is called before table_tuple_update(), crucially without the need for an exclusive buffer lock on the page that holds the tuple being updated. This reduces the time the buffer lock is held later within heapam_tuple_update() and heap_update().

ExecUpdateModifiedIdxAttrs() uses the previously-introduced ExecCompareSlotAttrs() function to identify which attributes have changed and then intersects that with the set of indexed attributes to identify the modified indexed set, the modified_idx_attrs.

Besides identifying the set of modified indexed attributes, HeapDetermineColumnsInfo() was also responsible for part of the logic involved in the decision about what to WAL log for the replica identity key. This logic moved into heap_update() and out of the replacement named HeapUpdateModifiedIdxAttrs(). Doing this allows simple_heap_update() and heapam_tuple_update() to share the same logic, as they both call into heap_update().

Updates stemming from logical replication also use the new ExecUpdateModifiedIdxAttrs() in ExecSimpleRelationUpdate().

This patch introduces a few helper functions to reduce code duplication and increase readability: HeapUpdateHotAllowable(), HeapUpdateDetermineLockmode(). These are used in both heap_update() and simple_heap_update().

The heap_update() function is now called with lockmode pre-determined and a boolean indicating whether the update allows HOT updates or not, both const. If during heap_update() the new tuple will fit on the same page and that boolean is true, the update is HOT. This means that although the functions and timing of the code involved in HOT decisions have changed, none of the logic related to when HOT is allowed has changed.

Development of this feature exposed nondeterministic behavior in three existing tests, which have been adjusted to avoid inconsistent test results due to tuple ordering during heap page scans.
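The two-stage decision described in this commit message can be modelled with plain bitmasks. The sketch below is illustrative only: attribute sets are 64-bit masks rather than Bitmapsets, the function names are hypothetical, and the treatment of summarizing-only (e.g. BRIN) attributes is an assumption based on the regression-test list earlier in this PR rather than on the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Executor side: modified indexed set = changed attrs INTERSECT indexed attrs. */
uint64_t
toy_modified_idx_attrs(uint64_t changed_attrs, uint64_t indexed_attrs)
{
    return changed_attrs & indexed_attrs;
}

/*
 * HOT is allowable when no modified attribute is used by a non-summarizing
 * index.  summarizing_only_attrs marks attributes referenced solely by
 * summarizing indexes (assumption: these do not block HOT).
 */
bool
toy_hot_allowable(uint64_t modified_idx_attrs, uint64_t summarizing_only_attrs)
{
    return (modified_idx_attrs & ~summarizing_only_attrs) == 0;
}

/*
 * Heap side: the update is HOT only if it was allowed up front AND the new
 * tuple fits on the same page.
 */
bool
toy_update_is_hot(bool hot_allowed, size_t new_tuple_size,
                  size_t page_free_space)
{
    assert(new_tuple_size > 0);
    return hot_allowed && new_tuple_size <= page_free_space;
}
```

The split mirrors the message's point: the first two functions need no buffer lock, and only the final fit check depends on the state of the heap page.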

This commit introduces the infrastructure for tracking modifications to sub-attributes (portions of columns used when forming index datum) during UPDATE operations, laying the groundwork for more efficient HOT (Heap-Only Tuple) updates with expression indexes, XML, and more.

Core Infrastructure:
* New catalog columns pg_type.{typidxextract, typidxcompare} to register type-specific subpath extraction and comparison functions.
* New catalog column pg_proc.prosubattrmutator to mark mutation functions that perform incremental tracking via slot_add_modified_idx_attr().
* SubpathTrackingContext: Context passed to mutation functions enabling them to report which sub-attributes they modified.
* execMutation.c: Core tracking functions including slot_add_modified_idx_attr() and HeapCheckSubpathChanges() for fallback comparison.
* idxsubpath.c: Relcache integration to build and cache per-relation subpath metadata for expression indexes.
* ExecUpdateModifiedIdxAttrs(): Executor function to identify which indexed attributes were actually modified, considering both whole-column changes and sub-attribute modifications.

Memory Management:
* TupleTableSlot.tts_modified_idx_attrs: Accumulates modified indexed attributes during expression evaluation.
* ResultRelInfo.ri_InstrumentedIdxAttrs: Tracks which expression indexes have fully instrumented mutation tracking.

Configuration:
* enable_subpath_hot GUC: Controls whether sub-attribute tracking is active. Defaults to on.

No types utilize this infrastructure yet. Subsequent commits will add JSONB and XML implementations that register their type-specific comparison functions and mark their mutation functions as prosubattrmutator.

It is hoped that this approach will enable a dramatic performance improvement for structured types: when only a portion of an attribute changes (a "sub-attribute", such as modifying a single JSONB field), and that portion isn't used when forming index datum, the UPDATE can use HOT even though the column's bytes changed.

Bump catalog version.

This commit enables efficient HOT updates for JSONB columns with expression indexes by implementing sub-attribute modification tracking for the JSONB type.

JSONB Implementation:
* jsonb_idx_extract(): Extracts indexed subpath descriptors from JSONB expression index definitions. Called at relcache build time to identify which JSON paths are indexed.
* jsonb_idx_compare(): Compares old and new JSONB values at specific indexed subpaths, returning true if any indexed path changed. Used as fallback when instrumented tracking is unavailable.
* Instrumented JSONB mutation functions: jsonb_set, jsonb_delete, jsonb_delete_path, jsonb_insert, jsonb_set_lax now call slot_add_modified_idx_attr() when provided a SubpathTrackingContext, enabling the executor to precisely track which indexed subpaths were modified without re-comparing the full JSONB value.

Catalog Changes:
* Register jsonb_idx_extract and jsonb_idx_compare in pg_proc.dat
* Connect them to the jsonb type via typidxextract and typidxcompare in pg_type.dat
* Mark JSONB mutation functions with prosubattrmutator = true

Performance Impact:
For JSONB workloads with expression indexes, this enables dramatic speedups:
- Updating non-indexed JSONB fields: 9-126× faster (avoids index updates)
- Large documents: Greater improvement (avoids full-value comparison)

Example:
  CREATE INDEX idx ON t((data->'status'));
  UPDATE t SET data = jsonb_set(data, '{count}', '42');
  -- Before: Non-HOT (reindexes even though 'status' unchanged)
  -- After: HOT (knows 'status' path wasn't modified)

Tests:
* Comprehensive JSONB HOT update tests covering:
  - Direct jsonb_set usage
  - Multiple expression indexes
  - Nested paths
  - NULL handling
  - Mixed expression + regular indexes
  - Concurrent CREATE INDEX (isolation test)
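One way to picture why the example above can stay HOT: a mutation at path '{count}' cannot affect an index on 'status' because neither path is a prefix of the other. The sketch below shows that prefix test on paths represented as arrays of keys. It is an assumption about how such an overlap check could work, not the logic of jsonb_idx_compare() or the instrumented mutators, and toy_paths_overlap is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Two key paths overlap when one is a prefix of the other: modifying
 * {a} rewrites everything under it (including an indexed {a,b}), and
 * modifying {a,b,c} changes the value an index on {a,b} would see.
 * An empty path (whole document) overlaps with every path.
 */
bool
toy_paths_overlap(const char *const *modified, size_t nmod,
                  const char *const *indexed, size_t nidx)
{
    size_t common = (nmod < nidx) ? nmod : nidx;

    assert(common == 0 || (modified != NULL && indexed != NULL));

    for (size_t i = 0; i < common; i++)
    {
        if (strcmp(modified[i], indexed[i]) != 0)
            return false;       /* paths diverge before either one ends */
    }
    return true;
}
```

With this test, the pair ('{count}', '{status}') does not overlap, so the indexed value is untouched, whereas ('{a}', '{a,b}') does overlap and would require an index update.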

This commit extends sub-attribute modification tracking to the XML type, enabling efficient HOT updates for XML columns with XPath expression indexes.

XML Implementation:
* xml_idx_extract(): Extracts indexed XPath descriptors from XML expression index definitions. Identifies which XPath expressions are indexed on a relation.
* xml_idx_compare(): Compares old and new XML values at specific indexed XPath expressions, returning true if any indexed path changed. Used as fallback when instrumented tracking is unavailable.
* Instrumented XML functions: xpath() now calls slot_add_modified_idx_attr() when provided a SubpathTrackingContext, enabling the executor to precisely track which indexed XPaths were evaluated.

Catalog Changes:
* Register xml_idx_extract and xml_idx_compare in pg_proc.dat
* Connect them to the xml type via typidxextract and typidxcompare in pg_type.dat

Example:
  CREATE INDEX idx ON t((xpath('/doc/status', data)));
  UPDATE t SET data = xpath_set(data, '/doc/count', '42');
  -- Before: Non-HOT (reindexes even though '/doc/status' unchanged)
  -- After: HOT (knows '/doc/status' path wasn't modified)

This implementation follows the same architecture as JSONB, providing both instrumented (fast path) and comparison-based (fallback) tracking for XML expression indexes.

gburd force-pushed the cf-5556 branch 7 times, most recently from e777a6e to 650f621, November 1, 2025 17:22

gburd force-pushed the cf-5556 branch 6 times, most recently from 16e0007 to 331cd76, November 7, 2025 20:56

gburd force-pushed the cf-5556 branch 4 times, most recently from 9558f42 to 05c4e60, November 16, 2025 18:53

gburd force-pushed the cf-5556 branch 4 times, most recently from ae8af13 to 9f584af, November 19, 2025 18:18

gburd force-pushed the cf-5556 branch 2 times, most recently from b142c27 to 94e88c7, December 1, 2025 18:10

gburd force-pushed the cf-5556 branch 2 times, most recently from 0d87b2f to d4f607c, March 10, 2026 18:18

gburd and others added 5 commits March 10, 2026 14:25

MasaoFujii and others added 4 commits March 10, 2026 14:25

gburd force-pushed the master branch from 97df741 to 606c93b, March 10, 2026 18:26

gburd added 7 commits March 10, 2026 14:27

dev setup v25

d3f4e3c

gburd force-pushed the cf-5556 branch from d4f607c to feca6b4, March 10, 2026 18:27

gburd force-pushed the master branch 3 times, most recently from 571be3b to 03facc1, March 10, 2026 18:37

github-actions bot force-pushed the master branch from d20ffd8 to d39c613, March 11, 2026 17:07

gburd force-pushed the master branch 4 times, most recently from eb62dd0 to 05b764b, March 11, 2026 19:11

github-actions bot force-pushed the master branch from 05b764b to 3e5da77, March 11, 2026 19:17

gburd force-pushed the master branch from 3e5da77 to 62bf60c, March 11, 2026 19:24

github-actions bot force-pushed the master branch 3 times, most recently from 9177b8a to 4cefb52, March 12, 2026 01:33

gburd wants to merge 16 commits into master from cf-5556

6 participants
