fix: decouple connection retry backoff from TCP dial timeout #1387
KEDA_HTTP_CONNECT_TIMEOUT was incorrectly used as both the TCP dial timeout and the initial retry backoff duration. As a result, cold start response times scaled linearly with the timeout value, because the interceptor's first connection attempt usually fails during a cold start.
Changes
Before and After
I wrote a small script to simulate the sleep times before and after the change. The new values are much more reasonable.
After 5 failed connection attempts we now retry every second, instead of every 16 seconds with the default timeout or every 1min+ with a timeout of 20 seconds.
Before with KEDA_HTTP_CONNECT_TIMEOUT=500ms (default)
After (independent of KEDA_HTTP_CONNECT_TIMEOUT)
More Results
Before with KEDA_HTTP_CONNECT_TIMEOUT=5s
Before with KEDA_HTTP_CONNECT_TIMEOUT=20s
Code
Checklist
README.md
docs/directory

Fixes #1385