Skip to content

CBG-5436: generate HLV.version from our own HLC for SGW actor writes#8355

Open
gregns1 wants to merge 6 commits into
mainfrom
CBG-5436
Open

CBG-5436: generate HLV.version from our own HLC for SGW actor writes#8355
gregns1 wants to merge 6 commits into
mainfrom
CBG-5436

Conversation

@gregns1

@gregns1 gregns1 commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

CBG-5436

Generate the HLV current version from a new SG-owned Hybrid Logical Clock before the write, so the value is known up front and written as a literal everywhere it's needed (cv, _sync.rev, and each channel removal). This removes the per-channel macro paths entirely, making subdoc path count independent of how many channels are removed. cvCAS stays macro-expanded to the committed CAS.

  • base/hlc.go — new HybridLogicalClock (Now(floor) generates a monotonic, CAS-comparable value: 48-bit ns + 16-bit logical) — one per DatabaseContext.
  • updateHLV / resolveDocMergeHLV — both now generate the version via the HLC; the version is generated early in documentUpdateFunc (before the sync function) to widen the gap to the commit CAS.
  • Removed the now-dead channel-removal macro-expansion machinery (appendRevocationMacroExpansions, the per-channel xattr-path helpers, the revokedChannelsRequiringExpansion plumbing).

cv.ver ≤ cas invariant (goxdcr)

A generated version must not exceed the document CAS (goxdcr rejects cv.ver > cas). The version is generated as early as possible so the later server CAS almost always exceeds it. As a safety net, correctVersionAheadOfCAS detects cv.ver > cas for SG-generated versions, waits for the clock to catch up (bounded at 1s), and re-stamps the CAS while preserving the version — so the version is never moved backwards. Gaps over 1s are logged and left for a later mutation. This will also only run for our sourceID (writes coming in from clients will not have this correction behaviour).

Pre-review checklist

  • Removed debug logging (fmt.Print, log.Print, ...)
  • Logging sensitive data? Make sure it's tagged (e.g. base.UD(docID), base.MD(dbName))
  • Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in docs/api

Dependencies (if applicable)

  • Link upstream PRs
  • Update Go module dependencies when merged

Integration Tests

Copilot AI review requested due to automatic review settings June 11, 2026 16:29

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes how Sync Gateway generates and persists HLV “current version” values for SG-owned actor writes: it introduces an SG-owned Hybrid Logical Clock (HLC) to generate a literal cv.ver up front (instead of relying on CAS macro expansion), removes per-channel revocation macro-expansion paths, and adds a bounded corrective re-stamp for the rare cv.ver > cas case to satisfy XDCR constraints.

Changes:

  • Add base.HybridLogicalClock and wire a per-DatabaseContext clock to generate HLV version values ahead of the write CAS.
  • Update HLV write/merge logic to use generated literal versions (leaving cvCAS macro-expanded) and remove channel-removal macro-expansion plumbing.
  • Expand/adjust unit and integration tests for the new version semantics and add a new internal stat for corrective CAS re-stamps.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
base/hlc.go Introduces the SG-owned Hybrid Logical Clock implementation used to generate CAS-comparable version values.
base/hlc_test.go Adds unit tests validating monotonicity, wall-clock tracking, floor behavior, and concurrency safety of the HLC.
base/stats.go Adds hlv_version_cas_retry_count stat for tracking corrective re-stamps.
base/stats_descriptions.go Adds description text for the new HLV/CAS correction stat.
db/database.go Stores a per-DatabaseContext HLC instance (hlc) initialized at context creation time.
db/crud.go Generates literal HLV versions via HLC, updates merge resolution to use HLC, removes revocation macro expansion, and adds CAS re-stamp correction logic.
db/document.go Simplifies updateChannels signature by removing revoked-channel macro-expansion return values.
db/hybrid_logical_vector.go Adds maxValueForSource helper and removes version macro-expansion behavior from computeMacroExpansions.
db/utilities_hlv_testing.go Updates test helper macro-expansion behavior to model “external peers” explicitly; adjusts AlterHLVForTest; adds a DB helper to fetch HLC values in tests.
db/hybrid_logical_vector_test.go Adds tests for CAS-ahead correction behavior and maxValueForSource.
db/revision_cache_test.go Updates call site for changed updateChannels signature and aligns CV expectations with HLV version.
db/database_test.go Updates resync expectations to compare against HLV version instead of CAS.
db/crud_test.go Updates tests to stop assuming HLV version equals document CAS; removes now-dead subdoc-path tests.
db/changes_test.go Updates CV population assertions to use HLV version semantics.
rest/api_test.go Updates REST API tests to assert cv.ver <= cas and adds coverage for many channel removals.
rest/changes_test.go Updates change feed CV assertions to match HLV version rather than CAS.
rest/blip_api_crud_test.go Adds BLIP test ensuring large channel-removal updates persist correct removal metadata without per-channel macro-expansion paths.
rest/replicatortest/replicator_test.go Updates replicator conflict tests to expect non-CAS HLV version semantics.
rest/replicatortest/replicator_conflict_test.go Updates conflict-setup helpers to generate explicit HLC-backed version values rather than CAS macro placeholders.

Comment thread rest/api_test.go
Comment thread db/crud.go
@adamcfraser adamcfraser changed the title CBG-5436: generate HLV.version form our own hlc for SGW actor writes CBG-5436: generate HLV.version from our own HLC for SGW actor writes Jun 12, 2026
Comment thread db/crud.go Outdated
if err != nil {
if base.IsCasMismatch(err) {
// A concurrent writer beat us to it; their mutation carries a later CAS, so the version is no
// longer ahead, and it's that writer's responsibility to satisfy the invariant.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessarily the case that their mutation with a later CAS is greater than the version, if it occurred quickly enough after our write? Might want to do a comparison and modify the log message accordingly.

May also want to include cas2 in the log message if it's returned on Cas mismatch error?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified the comment and log message, we don't return the cas on failure so haven't included that in the log message

Comment thread db/crud.go
}
return db.dataStore.UpdateXattrs(ctx, key, 0, cas, map[string][]byte{
base.SyncXattrName: syncXattr,
base.VvXattrName: vvXattr,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we populate _mou here, to avoid having eventing functions fire again on the restamped version? Or does this result in undesirable XDCR behaviour?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally thought we didn't want mou populated, but I think after thinking about it more we do. I'm going to add that now

Comment thread db/crud.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants