Skip to content

CBG-5444 Close terminator instead of stopProcess when polling detects StatusNotRunning#8353

Open
adamcfraser wants to merge 2 commits into
mainfrom
CBG-5444
Open

CBG-5444 Close terminator instead of stopProcess when polling detects StatusNotRunning#8353
adamcfraser wants to merge 2 commits into
mainfrom
CBG-5444

Conversation

@adamcfraser

@adamcfraser adamcfraser commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

CBG-5444

Avoids having nodes overwrite completed status with stopped if they are not the first to persist the completed status state.

… StatusNotRunning

Avoids having other nodes overwrite completed status with stopped if they are not the first to update the status to completed.
Copilot AI review requested due to automatic review settings June 11, 2026 04:58

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts multi-node background manager behavior so that when the polling loop detects the cluster status is no longer running, it terminates local execution without calling stopProcess, to avoid persisting a “stopping/stopped” state over a terminal cluster state (e.g., completed). It also adds explanatory notes to the long-running distributed resync dev-time test about cbgt feed/dest behavior when a second node comes online.

Changes:

  • In multi-node status polling, close the terminator directly instead of calling stopProcess when the cluster status indicates the process is not running.
  • Add detailed commentary to TestDistributedResync describing cbgt feed rebalance/destKey behavior and its effect on what the test is actually validating.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
db/background_mgr.go Changes the multi-node polling termination behavior when the cluster status doc indicates a terminal state.
rest/adminapitest/resync_test.go Adds clarifying comments about cbgt behavior impacting distributed resync dev-time testing.

Comment thread db/background_mgr.go
Comment on lines 805 to 809
err := b.updateMultiNodeClusterAwareStatus(ctx, backgroundManagerStatusUpdate)
if err != nil {
if errors.Is(err, errBackgroundManagerStatusNotRunning) {
b.stopProcess(ctx)
b.terminator.Close()
return
Comment thread db/background_mgr.go
Comment on lines 806 to 809
if err != nil {
if errors.Is(err, errBackgroundManagerStatusNotRunning) {
b.stopProcess(ctx)
b.terminator.Close()
return

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is correct - I think what has to happen is that we should transition the local status to match whatever the cluster status was - and then call terminate()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is safe because of this block: https://github.com/couchbase/sync_gateway/blob/main/db/background_mgr.go#L723-L733

In the case of the persistClusterStatusCallback, this is defined as b.UpdateStatusClusterAware which uses UpdateStatusClusterAware which uses backgroundManagerStatusUpdate.

In the case of the transition at the end of the process, b.UpdateStatusClusterAware is also called.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants