Search: Use staging index + alias swap for content date lookup sync#3098
Search: Use staging index + alias swap for content date lookup sync#3098
Conversation
Replace the delete-then-reindex flow in ContentDateEnrichment with a staging index + atomic alias swap to eliminate the window where the lookup index is empty. If a reindex fails, the previous lookup data remains intact behind the alias. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe Sequence Diagram(s)sequenceDiagram
participant Enricher as ContentDateEnrichment
participant ES as Elasticsearch API
participant Source as Lexical Source Alias
Enricher->>ES: Resolve backing index for lookup alias (GET /_alias/...)
ES-->>Enricher: currentBackingIndex
Enricher->>ES: Create staging backing index (PUT /staging-<ts>)
ES-->>Enricher: created
Enricher->>ES: Reindex from Source alias to staging index (POST /_reindex)
ES-->>Enricher: reindex task complete
Enricher->>ES: Refresh staging index (POST /staging-<ts>/_refresh)
ES-->>Enricher: refreshed
Enricher->>ES: Swap alias to staging index (POST /_aliases add staging, remove old)
ES-->>Enricher: alias swapped
Enricher->>ES: Delete old backing index (DELETE /old-backing-index)
ES-->>Enricher: deleted (or warning on failure)
Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches✨ Simplify code
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs`:
- Around line 57-58: RefreshIndexAsync currently only logs warnings on non-2xx
responses so SwapAliasAsync runs even if the index refresh failed; change the
flow so a failed refresh prevents the alias swap: modify RefreshIndexAsync to
throw (or alternatively return a boolean success) when the refresh response is
not successful, then update all call sites (the calls that currently do await
RefreshIndexAsync(...); await SwapAliasAsync(...); — and the other occurrences
near the SwapAliasAsync/ExecutePolicyAsync sequence) to either await
RefreshIndexAsync and only call SwapAliasAsync on success or let the thrown
exception propagate to stop execution; reference RefreshIndexAsync,
SwapAliasAsync and ExecutePolicyAsync when making the change.
- Around line 68-69: Resolve the alias and bootstrap logic: in
ResolveBackingIndexAsync treat only a 404 from GET /_alias/... as "alias
missing" and for other HTTP/network errors rethrow or propagate them so
transient failures aren't treated as absent alias; if the alias response yields
multiple backing indices, fail deterministically instead of using
FirstOrDefault() so you don't pick nondeterministically. Make
GenerateStagingName collision-resistant by adding a high-entropy suffix (e.g.,
DateTime.UtcNow.Ticks + a short GUID) so concurrent initializers produce unique
staging names. In SyncLookupIndexAsync, change the RefreshIndexAsync call so
that refresh failures throw (or return a non-recoverable error) instead of only
logging warnings, preventing subsequent calls to SwapAliasAsync and
ExecutePolicyAsync when the refresh did not succeed. Ensure references to
_lookupAlias, GenerateStagingName, ResolveBackingIndexAsync,
SyncLookupIndexAsync, RefreshIndexAsync, SwapAliasAsync, and ExecutePolicyAsync
are updated accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 4bf33a51-b531-4bd8-9968-799263886172
📒 Files selected for processing (1)
src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs
src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs
Outdated
Show resolved
Hide resolved
Make RefreshIndexAsync throw on failure so SwapAliasAsync never runs against an unrefreshed staging index. Distinguish 404 from transient errors in ResolveBackingIndexAsync and fail deterministically when the alias points at multiple indices. Add a GUID suffix to staging index names for collision resistance under concurrent runs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs (1)
98-109:⚠️ Potential issue | 🔴 CriticalMake the alias swap single-owner under concurrent runs.
EnsureLookupIndexAsync()bootstraps withSwapAliasAsync(null, ...), which becomes an add-only alias update. Elastic’s alias docs explicitly allow one alias to point to multiple indices, say thataddcreates the alias if it does not exist, and note that multi-action/_aliasesrequests can still returnacknowledged: truewith per-action failures whenremovemisses. Two overlapping initialize/sync calls can therefore leave_lookupAliasattached to more than one backing index while this code logs success, after whichResolveBackingIndexAsync()starts throwing. Parseerrors/action_resultsand make the swap resilient to stale or nulloldIndex(for example, remove the alias from all current backings in the same request, or re-resolve and retry before adding). (elastic.co)Also applies to: 142-165
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs` around lines 98 - 109, EnsureLookupIndexAsync currently calls SwapAliasAsync(null, indexName, ct) which issues an add-only alias update and can leave _lookupAlias attached to multiple indices under concurrent runs; update the bootstrap flow to make the alias swap single-owner by first resolving current backings via ResolveBackingIndexAsync (or listing all indices with the alias), then perform a single multi-action aliases request that removes the alias from all current backing indices and adds it to GenerateStagingName() in the same request (or, if you prefer, loop: re-resolve and retry the remove+add if action_results/errors indicate partial failures), and ensure SwapAliasAsync parses the alias API response (errors/action_results) to detect per-action failures and surface/retry them instead of assuming success; also apply the same change to the other occurrence referenced around SwapAliasAsync usage (lines ~142-165) and to CreateLookupIndexAsync as needed to coordinate the atomic remove+add behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs`:
- Around line 98-109: EnsureLookupIndexAsync currently calls
SwapAliasAsync(null, indexName, ct) which issues an add-only alias update and
can leave _lookupAlias attached to multiple indices under concurrent runs;
update the bootstrap flow to make the alias swap single-owner by first resolving
current backings via ResolveBackingIndexAsync (or listing all indices with the
alias), then perform a single multi-action aliases request that removes the
alias from all current backing indices and adds it to GenerateStagingName() in
the same request (or, if you prefer, loop: re-resolve and retry the remove+add
if action_results/errors indicate partial failures), and ensure SwapAliasAsync
parses the alias API response (errors/action_results) to detect per-action
failures and surface/retry them instead of assuming success; also apply the same
change to the other occurrence referenced around SwapAliasAsync usage (lines
~142-165) and to CreateLookupIndexAsync as needed to coordinate the atomic
remove+add behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 8dbe88f0-5ff2-4587-af03-ff4271faf6ac
📒 Files selected for processing (1)
src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs
…onments The staging index + alias swap pattern introduced in #3098 fails in environments that already have a concrete index with the alias name. This adds a migration path that reindexes data into a timestamped backing index, removes the legacy index, and creates the alias. Also makes enrich policy creation idempotent (delete-before-put) and adds a last_updated fallback in the sitemap reader for documents indexed before the content date pipeline existed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
What
Replace the delete-then-reindex flow in
ContentDateEnrichmentwith a staging index + atomic alias swap so there is never a window where the lookup index is empty.Why
The previous sequence called
DeleteLookupContentsAsync(which emptied the only durable lookup) beforeReindexToLookupAsyncwrote a replacement. If the reindex failed, the lookup was gone.How
_lookupIndexto_lookupAlias— the stable name is now an Elasticsearch alias over timestamped backing indicesSyncLookupIndexAsynccreates a fresh staging index, reindexes into it, refreshes, then atomically swaps the alias viaPOST /_aliasesEnsureLookupIndexAsyncresolves the alias viaGET /_alias/{name}and creates a backing index + alias if none existsDeleteLookupContentsAsync(no longer needed)Test plan
dotnet build— 0 warnings, 0 errors)InitializeAsynccreates a timestamped backing index and aliasSyncLookupIndexAsynccreates a new staging index, swaps the alias, and deletes the old index🤖 Generated with Claude Code