Skip to content

fix: Flush pool as late as possible during schema cache reloading#4645

Merged
steve-chavez merged 2 commits intoPostgREST:mainfrom
mkleczek:flush-pool-after-schema-cache-load
Apr 14, 2026
Merged

fix: Flush pool as late as possible during schema cache reloading#4645
steve-chavez merged 2 commits intoPostgREST:mainfrom
mkleczek:flush-pool-after-schema-cache-load

Conversation

@mkleczek
Copy link
Copy Markdown
Collaborator

@mkleczek mkleczek commented Feb 11, 2026

DISCLAIMER:
This commit was authored entirely by a human without the assistance of LLMs.

retryingSchemaCacheLoad flushes the pool upon every retry before it starts reloading the schema. This is too early as schema reloading might take some time during which new connections might be acquired. The consequence is that:

  • upon successful schema cache reload we might have some connections created with the old schema cache
  • we close connections upon each retry and under load we will keep closing and re-opening connections until schema cache load succeeds

This change is to make sure we flush the pool only after successful schema cache querying but before loading (so that connections acquired during loading wait for it and do not interfere with timing the loading process).

@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 3 times, most recently from d5a6f71 to 7221241 Compare February 11, 2026 19:50
@steve-chavez
Copy link
Copy Markdown
Member

The change seems reasonable. Is there a way to test the previous behavior and that this is an improvement?

I'm thinking we have the schemaCacheLoads metric but perhaps we should also have one metric for pool flushes? Then we might be able to test that this change reduces the flushes.

data MetricsState =
MetricsState {
poolTimeouts :: Counter,
poolAvailable :: Gauge,
poolWaiting :: Gauge,
poolMaxSize :: Gauge,
schemaCacheLoads :: Vector Label1 Counter,
schemaCacheQueryTime :: Gauge,
jwtCacheRequests :: Counter,
jwtCacheHits :: Counter,
jwtCacheEvictions :: Counter
}


Related, just noted these two functions do the same perhaps we should deduplicate:

-- | Flush the connection pool so that any future use of the pool will
-- use connections freshly established after this call.
flushPool :: AppState -> IO ()
flushPool AppState{..} = SQL.release statePool
-- | Destroy the pool on shutdown.
destroyPool :: AppState -> IO ()
destroyPool AppState{..} = SQL.release statePool

@mkleczek
Copy link
Copy Markdown
Collaborator Author

The change seems reasonable. Is there a way to test the previous behavior and that this is an improvement?

I'm thinking we have the schemaCacheLoads metric but perhaps we should also have one metric for pool flushes? Then we might be able to test that this change reduces the flushes.

Done:

  • Added pgrst_db_pool_flushes_total metric.
  • Added HSpec module dedicated to metrics tests and a test to verify the new metric (I think it is a good idea to have one - we already have JWT cache spec tests that make use of metric state)
  • Added io tests to validate pool flushing behavior (both in normal conditions and with schema loading retries)

Related, just noted these two functions do the same perhaps we should deduplicate:

After latest changes they are no longer duplicate as destroyPool does not emit observations.

@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 2 times, most recently from a5f9ece to f9ea218 Compare February 15, 2026 21:13
@steve-chavez
Copy link
Copy Markdown
Member

steve-chavez commented Feb 17, 2026

Added pgrst_db_pool_flushes_total metric.
Added HSpec module dedicated to metrics tests and a test to verify the new metric (I think it is a good idea to have one - we already have JWT cache spec tests that make use of metric state)
Added io tests to validate pool flushing behavior (both in normal conditions and with schema loading retries)

@mkleczek Could you split those commits into another PR? 🙏 (without the fix here) That would make it easier to review.

@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch from f9ea218 to 24e71b5 Compare February 17, 2026 13:14
@mkleczek
Copy link
Copy Markdown
Collaborator Author

mkleczek commented Feb 17, 2026

Added pgrst_db_pool_flushes_total metric.
Added HSpec module dedicated to metrics tests and a test to verify the new metric (I think it is a good idea to have one - we already have JWT cache spec tests that make use of metric state)
Added io tests to validate pool flushing behavior (both in normal conditions and with schema loading retries)

@mkleczek Could you split those commits into another PR? 🙏 (without the fix here) That would make it easier to review.

@steve-chavez Done. Stacked this PR on top of #4658

@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 3 times, most recently from b428649 to 3c033ab Compare February 17, 2026 20:28
@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 2 times, most recently from e601d59 to 16aedf7 Compare February 25, 2026 05:51
@mkleczek
Copy link
Copy Markdown
Collaborator Author

@mkleczek Could you split those commits into another PR? 🙏 (without the fix here) That would make it easier to review.

@steve-chavez Done. Stacked this PR on top of #4658

@steve-chavez Rebased on #4668

@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch from 16aedf7 to 0a348f7 Compare February 25, 2026 18:54
@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 5 times, most recently from 2e23441 to 3e7308f Compare March 6, 2026 10:30
@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 4 times, most recently from e82492b to ba16275 Compare March 13, 2026 06:49
@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 2 times, most recently from 39e391c to ad7ecd4 Compare March 31, 2026 06:55
@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 8 times, most recently from dc113cf to b24d652 Compare April 9, 2026 12:47
@steve-chavez
Copy link
Copy Markdown
Member

Needs rebasing now

@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch from b24d652 to 9130429 Compare April 9, 2026 16:36
@taimoorzaeem
Copy link
Copy Markdown
Member

Is this a bug fix? If so, then it may also need a CHANGELOG.md entry.

@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch 2 times, most recently from ec7d1b6 to 0367a9c Compare April 14, 2026 07:02
@steve-chavez
Copy link
Copy Markdown
Member

@mkleczek Could you please address the feedback here?

@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch from 0367a9c to a836402 Compare April 14, 2026 20:02
retryingSchemaCacheLoad flushes the pool upon every retry before it starts reloading the schema. This is too early as schema reloading might take some time during which new connections might be acquired. The consequence is that:
* upon successful schema cache reload we might have some connections created with the old schema cache
* we close connections upon each retry and under load we will keep closing and re-opening connections until schema cache load succeeds

This change is to make sure we flush the pool only after successful schema cache querying but before loading (so that connections acquired during loading wait for it and do not interfere with timing the loading process).
@mkleczek mkleczek force-pushed the flush-pool-after-schema-cache-load branch from a836402 to 9271907 Compare April 14, 2026 20:02
@mkleczek
Copy link
Copy Markdown
Collaborator Author

@mkleczek Could you please address the feedback here?

Done

@mkleczek mkleczek requested a review from taimoorzaeem April 14, 2026 20:03
Comment thread src/PostgREST/AppState.hs
Comment thread CHANGELOG.md Outdated


it "Should flush pool multiple times when schema reloading retries" $ do
it "Should flush pool once when schema reloading retries" $ do
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, clearly proves the fix 👍

Co-authored-by: Steve Chavez <stevechavezast@gmail.com>
@steve-chavez steve-chavez merged commit aca58c8 into PostgREST:main Apr 14, 2026
32 checks passed
taimoorzaeem pushed a commit to taimoorzaeem/postgrest that referenced this pull request Apr 15, 2026
…stgREST#4645)

retryingSchemaCacheLoad flushes the pool upon every retry before it starts reloading the schema. This is too early as schema reloading might take some time during which new connections might be acquired. The consequence is that:
* upon successful schema cache reload we might have some connections created with the old schema cache
* we close connections upon each retry and under load we will keep closing and re-opening connections until schema cache load succeeds

This change is to make sure we flush the pool only after successful schema cache querying but before loading (so that connections acquired during loading wait for it and do not interfere with timing the loading process).
@taimoorzaeem
Copy link
Copy Markdown
Member

taimoorzaeem commented Apr 15, 2026

If we decide to backport this, we'd need to first backport 5d4f82d from #4668 (already applied ad90721).

@steve-chavez
Copy link
Copy Markdown
Member

@taimoorzaeem Yes, those should be fine to backport. This one too.

taimoorzaeem pushed a commit to taimoorzaeem/postgrest that referenced this pull request Apr 16, 2026
…stgREST#4645)

retryingSchemaCacheLoad flushes the pool upon every retry before it starts reloading the schema. This is too early as schema reloading might take some time during which new connections might be acquired. The consequence is that:
* upon successful schema cache reload we might have some connections created with the old schema cache
* we close connections upon each retry and under load we will keep closing and re-opening connections until schema cache load succeeds

This change is to make sure we flush the pool only after successful schema cache querying but before loading (so that connections acquired during loading wait for it and do not interfere with timing the loading process).
taimoorzaeem pushed a commit that referenced this pull request Apr 16, 2026
)

retryingSchemaCacheLoad flushes the pool upon every retry before it starts reloading the schema. This is too early as schema reloading might take some time during which new connections might be acquired. The consequence is that:
* upon successful schema cache reload we might have some connections created with the old schema cache
* we close connections upon each retry and under load we will keep closing and re-opening connections until schema cache load succeeds

This change is to make sure we flush the pool only after successful schema cache querying but before loading (so that connections acquired during loading wait for it and do not interfere with timing the loading process).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants