fix(sdk): fix race condition when checkpoint completion happens before waitForStatusChange is called #395
Issue #, if available:
Description of changes:
There is a race condition in waitForStatusChange: if the operation completes before waitForStatusChange is called, the waiter never observes the final status and the invocation terminates without updating the data correctly. This results in the error:
Cannot return PENDING status with no pending operations.
To fix this, if the operation status is already terminal when waitForStatusChange or waitForRetryTimer is called, we resolve immediately instead of creating a promise and waiting.
This can happen because waitForStatusChange is called asynchronously, so there can be a gap between when the phase 2 promise starts and when the last checkpoint data was updated. I disabled time skipping in wait-for-callback-serdes.test.ts, which consistently reproduces the race when running locally. In the cloud tests it's harder to reproduce since we poll for the callback ID every second and we don't get the data instantly, but it could still happen in rare cases.
I also added the same safeguard to waitForRetryTimer. The race could occur there if the main thread is blocked for a long time and fails to call waitForRetryTimer in time.
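For reviewers, here is a minimal sketch of the guard described above. It is not the SDK's actual code: the StatusTracker class, the OperationStatus values, the TERMINAL set, and the update and pendingWaiterCount helpers are all hypothetical names chosen for illustration. Only the idea of checking for a terminal status before creating a promise comes from this PR.

```typescript
// Hypothetical status values; the real SDK's status type may differ.
type OperationStatus = "STARTED" | "PENDING" | "SUCCEEDED" | "FAILED";

// Assumption: these are the statuses after which no further change is expected.
const TERMINAL = new Set<OperationStatus>(["SUCCEEDED", "FAILED"]);

type Resolver = (status: OperationStatus) => void;

// Hypothetical stand-in for whatever tracks checkpoint data in the SDK.
class StatusTracker {
  private statuses = new Map<string, OperationStatus>();
  private waiters = new Map<string, Resolver[]>();

  // Record a new status and wake waiters once the operation is terminal.
  update(id: string, status: OperationStatus): void {
    this.statuses.set(id, status);
    if (!TERMINAL.has(status)) return;
    const pending = this.waiters.get(id) ?? [];
    this.waiters.delete(id);
    for (const resolve of pending) resolve(status);
  }

  waitForStatusChange(id: string): Promise<OperationStatus> {
    const current = this.statuses.get(id);
    // The guard: if the operation already finished before we were called,
    // resolve immediately rather than registering a waiter that would
    // never be notified.
    if (current !== undefined && TERMINAL.has(current)) {
      return Promise.resolve(current);
    }
    return new Promise<OperationStatus>((resolve) => {
      const list = this.waiters.get(id) ?? [];
      list.push(resolve);
      this.waiters.set(id, list);
    });
  }

  // Helper for inspection only: how many callers are still waiting.
  pendingWaiterCount(id: string): number {
    return this.waiters.get(id)?.length ?? 0;
  }
}
```

Without the early return, a call that arrives after update(id, "SUCCEEDED") would register a waiter that is never resolved, which is the shape of the race described above. A retry-timer variant would presumably apply the same check before scheduling its wait.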
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.