Skip to content

Conversation

@rahulgoswami
Copy link
Contributor

@rahulgoswami rahulgoswami commented Nov 30, 2025

https://issues.apache.org/jira/browse/SOLR-17725

Description

Provide a /admin/cores api (action=UPGRADECOREINDEX) to upgrade older lucene segments to latest version (by targeted reindexing). Calling this REST endpoint on an index created in version X-1 (assuming you are on Solr version X) would reindex documents in any segments of the older version (X-1) so that they are re-written in version X. This would help prepare the index for when Solr is upgraded to X+1 without having to recreate the index from source (as is the requirement today).

Solution

This API helps to upgrade an index where all fields are either stored or docValues enabled or copy field target.

Here's the algorithm:

  1. Set the merge policy as LatestVersionMergePolicy (which is a filter merge policy that delegates to whichever policy is actual defined in solrconfig or the TieredMergePolicy by default). This policy filters out any segments with minVersion=X-1 and only allows the latest version (X) segments to participate in merges. This ensures that the older version segments don't merge and "pollute" the index over the course of the reindexing.

  2. Iterate over the segments and for any segment which is a pure X-1 version segment (result of flush; minVersion=X-1, version=X-1) or a result of merge between version X-1 and X segments (minVersion=X-1, version=X), reindex all live documents in the segment.

  3. commit and validate that all segments in the index have been converted to version X.

  4. Reset the merge policy back to the original merge policy.

This works because after the change in apache/lucene#14607 (and the backport to 10x) opening an index no longer cares about which version the index was originally created in. Lucene just needs all of the segments to be of version X or later in order to be able to open the index in version X+1 (even though the index could have been originally created in version X-1)

[AI Coding Tools Disclosure: Only use of AI coding tools was in the tests (UpgradeCoreIndexActionTest.java). Claude Code and OpenAI Codex were used to generate, review, iterate and fix the tests mentioned in the "Tests" section. The same were also manually reviewed thereafter for correctness. ]

Tests

  1. For an index with below segments:
    seg1 - pure 9x
    seg2 - pure 10x
    seg3 - minVersion 9x, version 10x (result of merge between 9x and 10x segements)

seg1 and seg3 get reindexed into new 10x segment, and get deleted. seg2 remains untouched (as expected).

  1. DocValues-only fields are preserved during reindexing (this is tested specifically since we are reading docs from segments and decorating the SolrInputDocument with docValues thereafter)

  2. older segments get deleted by the end of the process

  3. async mode works fine. Tested by calling /admin/cores?action=UPGRADECOREINDEX&async=request_id. And later calling action=REQUESTSTATUS&requestid=request_id

  4. For an index where all segments belong to the latest major version, calling the UPGRADECOREINDEX API is a no-op, and returns "upgradeStatus":"NO_UPGRADE_NEEDED" in the response.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide
  • I have added a changelog entry for my change

@rahulgoswami rahulgoswami changed the title CoreAdmin API to upgrade an index in-place to facilitate Solr upgrade across major versions SOLR-17725: CoreAdmin API to upgrade an index in-place to facilitate Solr upgrade across major versions Nov 30, 2025
@rahulgoswami rahulgoswami marked this pull request as draft November 30, 2025 06:24
@rahulgoswami
Copy link
Contributor Author

rahulgoswami commented Nov 30, 2025

First (half-baked) draft. WIP.

@github-actions github-actions bot removed the cat:index label Dec 9, 2025
@rahulgoswami
Copy link
Contributor Author

rahulgoswami commented Dec 23, 2025

Looking good. Basic (manual) tests with /admin/cores?action=UPGRADECOREINDEX working fine. Will publish response with segment status before and after index upgrade. Also need to add tests.

@rahulgoswami rahulgoswami marked this pull request as ready for review December 23, 2025 06:50
… has been renamed. It's being used in this feature..
@rahulgoswami
Copy link
Contributor Author

rahulgoswami commented Dec 31, 2025

Sample response in synchronous mode (/admin/cores?action=UPRGADECOREINDEX&core=techproducts)

{
  "responseHeader":{
    "status":0,
    "QTime":10193
  },
  "core":"techproducts",
  "numSegmentsEligibleForUpgrade":2,
  "numSegmentsUpgraded":2,
  "upgradeStatus":"UPGRADE_SUCCESSFUL"
}

Sample response in async mode ("/admin/cores?action=UPRGADECOREINDEX&core=techproducts&async=request_id" and checking status with action=REQUESTSTATUS)

{
  "responseHeader":{
    "status":0,
    "QTime":0
  },
  "STATUS":"completed",
  "response":{
    "core":"techproducts",
    "numSegmentsEligibleForUpgrade":2,
    "numSegmentsUpgraded":2,
    "upgradeStatus":"UPGRADE_SUCCESSFUL"
  }
}

@rahulgoswami
Copy link
Contributor Author

@dsmiley @broustant @janhoy This is ready for review. Would appreciate your inputs. Thanks!

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jan 4, 2026
@github-actions github-actions bot added the tests label Jan 6, 2026
Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review.

Thanks for working on this!

@broustant you may find some of this rather familiar from index encryption

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class UpgradeCoreIndex extends CoreAdminAPIBase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please always add at least a line of javadoc to new public classes.


@Schema(description = "Request ID to track this action which will be processed asynchronously.")
@JsonProperty
public String async;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe @gerlowskija might review the V2 API spect.

But what I notice here is this async --- I'm suspicious as I figure it'd be handled in a more general way without each command needing to redundantly specify it, which is basically all commands.

coreName,
requestBody.async,
"upgrade-index",
() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a stylistic point but this lambda is way too big; please use another technique like define a method, and call it with a one-liner lambda.

() -> {
try (SolrCore core = coreContainer.getCore(coreName)) {

log.info("Received UPGRADECOREINDEX request for core: {}", coreName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging at request level seems like a framework infrastructure level concern (probably already logged), not one unique to this particular command

int numSegmentsEligibleForUpgrade = 0, numSegmentsUpgraded = 0;
try {
iwRef = core.getSolrCoreState().getIndexWriter(core);
if (iwRef != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely this can't be null?!

Comment on lines +198 to +200
if (iwRef != null) {
IndexWriter iw = iwRef.get();
if (iw != null && originalMergePolicy != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again; null checks needless

if (resolvedChain != null) {
return resolvedChain;
} else {
log.error(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should throw an error (and not log)

Comment on lines +242 to +262
// Try to find chain configured in /update handler
String updateChainName = null;
SolrRequestHandler reqHandler = core.getRequestHandler("/update");

NamedList initArgs = ((RequestHandlerBase) reqHandler).getInitArgs();

if (initArgs != null) {
// Check invariants first
Object invariants = initArgs.get("invariants");
if (invariants instanceof NamedList) {
updateChainName = (String) ((NamedList) invariants).get(UpdateParams.UPDATE_CHAIN);
}

// Check defaults if not found in invariants
if (updateChainName == null) {
Object defaults = initArgs.get("defaults");
if (defaults instanceof NamedList) {
updateChainName = (String) ((NamedList) defaults).get(UpdateParams.UPDATE_CHAIN);
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're trying too hard to find the default chain. If it's actually this hard, then this code should live elsewhere and we just call it. But I'm still suspicious. If it wasn't so late I'd go look.


private boolean isIndexUpgraded(SolrCore core) throws IOException {

try (FSDirectory dir = FSDirectory.open(Path.of(core.getIndexDir()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woah woah... lets instead get the Directory from the factory from the core. See SolrIndexSplitter.


private void doCommit(SolrCore core) throws IOException {
try (LocalSolrQueryRequest req = new LocalSolrQueryRequest(core, new ModifiableSolrParams())) {
CommitUpdateCommand cmd = new CommitUpdateCommand(req, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a little EOL comment on what "false" means

@rahulgoswami
Copy link
Contributor Author

@dsmiley Thanks for an initial review. I am working on addressing the review comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cat:api client:solrj documentation Improvements or additions to documentation tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants