Implement safety-gap pattern for rollback safety by decoupling image update from schema update #271

@WentingWu666666

Summary

Decouple the DocumentDB extension image update from the schema update (ALTER EXTENSION documentdb UPDATE) to enable safe rollback. The binary (image) version should never be behind the schema version and may run one version ahead of it, following the pgmongo safety-gap pattern.

Details

Background:

  • pgmongo follows the safety-gap pattern: the binary is always one version ahead of the schema. This makes rollback safe because the previous binary can still read the current schema.
  • The current PR #208 ("Add DocumentDB upgrade support with configurable PostgresImage and ImageVolume extensions") couples the image update with the schema update: when the extension image is updated, ALTER EXTENSION documentdb UPDATE runs automatically in the same reconciliation cycle.
  • This is not safe for rollback: if a user rolls back to an older image, the schema may already be at the newer version, causing incompatibility.

Problem with Current Approach:

Upgrade: Image 0.109 → 0.110, then ALTER EXTENSION → Schema 0.110
Rollback: Image 0.110 → 0.109... but Schema is still 0.110!
Result: Old binary cannot read new schema → FAILURE

Safety-Gap Pattern:

Upgrade Phase 1: Image 0.109 → 0.110 (schema stays at 0.109)
  - Binary 0.110 can read schema 0.109 ✓
  - Rollback to 0.109 is safe (schema still 0.109) ✓
  
Upgrade Phase 2: ALTER EXTENSION → Schema 0.110 (user-triggered or delayed)
  - Binary 0.110 reads schema 0.110 ✓
  - Rollback now blocked (schema already updated)

Implementation Approach:

  1. Decouple image update from schema update:

    • Image update happens when spec.documentDBVersion or spec.documentDBImage changes
    • Schema update (ALTER EXTENSION) is triggered separately
  2. Add spec.schemaVersion field (optional):

    spec:
      documentDBVersion: "0.110.0"  # Controls image version
      schemaVersion: "0.110.0"      # Controls schema version (must be <= documentDBVersion)
  3. Alternative: Auto-delay schema update:

    • Update image immediately
    • Delay ALTER EXTENSION by N minutes/hours or until explicit trigger
    • Add status condition showing "schema update pending"
  4. Block rollback when schema is ahead:

    • If status.schemaVersion > requested imageVersion, block the change
    • Require backup or explicit override annotation
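
A minimal sketch of the two checks implied by items 2 and 4, assuming plain dotted numeric versions such as "0.110.0". The function names (`validateSpec`, `checkImageChange`) and the boolean `override` parameter, which stands in for the explicit override annotation, are hypothetical and not taken from the operator.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "0.110.0" (or "0.110") into [0 110 0].
// Missing trailing components are treated as zero.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(v, ".")
	if len(parts) > 3 {
		return out, fmt.Errorf("version %q: too many components", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("version %q: bad component %q", v, p)
		}
		out[i] = n
	}
	return out, nil
}

// compareVersions returns -1, 0, or 1 when a is older than, equal to,
// or newer than b.
func compareVersions(a, b string) (int, error) {
	pa, err := parseVersion(a)
	if err != nil {
		return 0, err
	}
	pb, err := parseVersion(b)
	if err != nil {
		return 0, err
	}
	for i := range pa {
		if pa[i] < pb[i] {
			return -1, nil
		}
		if pa[i] > pb[i] {
			return 1, nil
		}
	}
	return 0, nil
}

// validateSpec enforces spec.schemaVersion <= spec.documentDBVersion.
// An empty schemaVersion means the optional field was not set.
func validateSpec(documentDBVersion, schemaVersion string) error {
	if schemaVersion == "" {
		return nil
	}
	c, err := compareVersions(schemaVersion, documentDBVersion)
	if err != nil {
		return err
	}
	if c > 0 {
		return fmt.Errorf("schemaVersion %s is newer than documentDBVersion %s",
			schemaVersion, documentDBVersion)
	}
	return nil
}

// checkImageChange blocks an image change when the installed schema is
// newer than the requested image, unless the caller passes override.
func checkImageChange(statusSchemaVersion, requestedImageVersion string, override bool) error {
	c, err := compareVersions(statusSchemaVersion, requestedImageVersion)
	if err != nil {
		return err
	}
	if c > 0 && !override {
		return fmt.Errorf("schema %s is ahead of requested image %s; rollback blocked",
			statusSchemaVersion, requestedImageVersion)
	}
	return nil
}

func main() {
	fmt.Println(validateSpec("0.110.0", "0.109.0"))            // valid: schema lags image
	fmt.Println(checkImageChange("0.110.0", "0.109.0", false)) // blocked rollback
}
```

Comparing components numerically rather than as strings matters here: a string comparison would rank "0.9.0" above "0.110.0".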

Status Fields:

status:
  documentDBVersion: "0.110.0"   # Current image version
  schemaVersion: "0.109.0"       # Current schema version (may lag behind)
  schemaUpdatePending: true      # Schema can be updated to match image

Files to Modify:

  • api/preview/documentdb_types.go - Add SchemaVersion to spec and status
  • internal/controller/documentdb_controller.go - Decouple upgrade logic
  • internal/controller/upgrade_logic.go - Separate image vs schema update paths
  • CRD manifests - Add new fields

Acceptance Criteria:

  • Image update and schema update are separate operations
  • Binary version is always >= schema version
  • Rollback is safe when schema hasn't been updated yet
  • Rollback is blocked when schema is ahead of target image version
  • Status shows both image version and schema version
  • User can trigger the schema update explicitly (or it runs automatically after a configurable delay)
  • E2E tests verify rollback safety
