add: use SO_REUSEPORT on platform supporting it#4703
add: use SO_REUSEPORT on platform supporting it#4703mkleczek wants to merge 2 commits intoPostgREST:mainfrom
Conversation
c252511 to
27c16d7
Compare
|
This is failing all tests, I'd suggest to put these type of PRs as draft. |
Hmm... worked before latest push. Will switch to draft and fix. |
@steve-chavez - it looks like the issue is that on my machine PostgREST startup is fast enough so that it loads the schema cache before it accepts any requests. Here in CI the new instance fails with The question: is there any particular reason why we return |
We were aiming to have requests wait instead of 503, this waiting does happen during schema cache reload but not on startup; we discussed this on #4129. Would it be better to not listen on the socket? How would clients behave in this case? |
|
They would get a "connection refused" error, which means nobody there and is imo more confusing that any 5xx error. UX-wise I would prefer some waiting to a presumably hard fail any day. |
I am not convinced, see below. This is a complex topic so let's dig into it a little more. The startup sequence right now is:
The alternatives are:
b - listening on a socket only after schema cache loaded
So from the point of view of the clients (they don't know when Postgrest was started), we have 3 alternatives:
I am not sure what value clients get from the first two options comparing to the third one. Diagnostics and readiness checks should be done using In case of
So the first two options cause disruptions whereas the third one is fully zero-downtime and transparent to the clients. My take on it would be:
This would require splitting binding from listening on the main socket (ie. we need to bind without listening first so that we can pass the socket to the admin server). @steve-chavez @develop7 thoughts? |
|
Dependent on resolving (or having a workaround to) yesodweb/wai#853 |
64bb6ca to
2cb6503
Compare
Agree, sounds much better. |
7d7375a to
bd4c7ec
Compare
9ebfb7f to
3f522cc
Compare
| - Fix login with uppercase and mixed case role names by @taimoorzaeem in #4678 | ||
| - Remove automatic transaction retries on `40001 (serialization_failure)` errors to prevent replication lag by @laurenceisla in #3673 | ||
| - Remove automatic transaction retries on `40001 (serialization_failure)` errors to prevent replication lag by @laurence in #3673 | ||
| - Enable starting multiple PostgREST instances using the same ports on platforms supporting it by @mkleczek in #4703 #4694 |
There was a problem hiding this comment.
Oh, interesting, is that true? Can I start multiple instances and run them in parallel? I thought this was only possible in some kind of seamless transition way while starting one and stopping the other, but this reads as if I could run them in parallel indefinitely?
There was a problem hiding this comment.
Yes, that's correct. SO_REUSEPORT is a poor man's loadbalancer
There was a problem hiding this comment.
I guess we have to keep that in mind for odd test failures or bug reports. Where the user would get a "port already in use" error before, when they accidentally started multiple instances of postgrest, they will now get requests randomly served from different instances. This might be very confusing...
There was a problem hiding this comment.
Yeah… We might guard it with a config flag that’s off by default. But I’m not too keen on adding yet another config property.
Open to suggestions though.
There was a problem hiding this comment.
I’m not too keen on adding yet another config property.
same!
I don't have any better suggestions right now, I think reusing the port is a sane behavior for most actual deployments.
There was a problem hiding this comment.
Yes, I think so. If we ever wanted to reuse the admin server port, we should add admin-server-reuseport as a config option. server-xxx should not affect the admin server.
There was a problem hiding this comment.
@mkleczek Seeing the docs you added here (
"Admin ports are not shared between processes. This keeps readiness checks unambiguous"), would it ever make sense to have reuseport for the admin server?
IMHO - no.
That's a very common problem in clustered environments: it does not make sense to load balance metrics and health endpoints. SO_REUSEPORT is a form of load balancing and the same rule applies.
Wouldn't that invalidate zero-down time upgrades?
I am not sure I understand.
You can have zero downtime upgrades without admin server running at all. Just start the new instance, wait for it to log "schema cache loaded" (or just wait long enough, ie. 1 minute should be safe), then stop the first instance.
There was a problem hiding this comment.
Unresolving this conversation since I see the server-reuseport config hasn't been added yet. As agreed that should only affect the API server not the admin server.
There was a problem hiding this comment.
Implemented server-reuseport configuration property.
a420618 to
f61afb0
Compare
2b8643e to
6c64080
Compare
033bd60 to
127490e
Compare
|
Conflicted in the changelog. |
| mainSocketRef <- newIORef Nothing | ||
| adminSocket <- initAdminServerSocket conf | ||
|
|
||
| let closeSockets = do | ||
| whenJust adminSocket NS.close | ||
| NS.close mainSocket | ||
| readIORef mainSocketRef >>= foldMap NS.close | ||
| Unix.installSignalHandlers observer closeSockets (AppState.schemaCacheLoader appState) (AppState.readInDbConfig False appState) | ||
|
|
||
| Admin.runAdmin appState adminSocket (readIORef mainSocketRef) (serverSettings conf) | ||
|
|
||
| Listener.runListener appState | ||
|
|
||
| Admin.runAdmin appState adminSocket mainSocket (serverSettings conf) | ||
| -- Kick off and wait for the initial SchemaCache load before creating the | ||
| -- main API socket. | ||
| AppState.schemaCacheLoader appState | ||
| AppState.waitForSchemaCacheLoaded appState | ||
|
|
||
| mainSocket <- initServerSocket conf | ||
| atomicWriteIORef mainSocketRef $ Just mainSocket |
There was a problem hiding this comment.
@mkleczek Was the startup sequence change you mentioned on #4703 (comment), done here?
|
|
||
| Zero-Downtime Upgrades | ||
| ====================== | ||
|
|
There was a problem hiding this comment.
| :author: `mkleczek <https://github.com/mkleczek>`_ |
We've been doing this for almost all how-tos:
|
|
||
| The TCP port to bind the web server. Use ``0`` to automatically assign a port. | ||
|
|
||
| On operating systems that support ``SO_REUSEPORT``, you can start multiple |
There was a problem hiding this comment.
Let's put a heading and anchor here so we can link it from other places
| On operating systems that support ``SO_REUSEPORT``, you can start multiple | |
| .. _reuseport: | |
| SO_REUSEPORT | |
| ~~~~~~~~~~~~~ | |
| On operating systems that support ``SO_REUSEPORT``, you can start multiple |
0e7d4aa to
680c6e1
Compare
c12c6fd to
bec258c
Compare
This change ensures PostgREST starts listening on a server socket only after it loaded the schema cache and is ready to handle requests. It is no longer going to return 503 errors during startup until the schema cache is loaded.
DISCLAIMER:
This commit was authored entirely by a human without the assistance of LLMs.
Fixes #4694
Stacked on top of #4702 as it is not enough to start a new instance, it is also necessary not to fail in-flight requests on the old instance.