Skip to content

Conversation

@leonfin
Copy link
Contributor

@leonfin leonfin commented Nov 23, 2025

IndexTrackingQueryObserver.afterIndexLookup() throws a NullPointerException when executing queries on partitioned regions during application startup. The issue occurs because the method attempts to access a ThreadLocal
  indexMap that was never initialized, causing the application to fail.

  Root Cause

  In org.apache.geode.cache.query.internal.IndexTrackingQueryObserver.java, the afterIndexLookup() method (line 107-110) retrieves a Map from a ThreadLocal without checking for null

  @Override
  public void afterIndexLookup(Collection results) {
      if (results == null)
{           return;       }
      Map indexMap = (Map) indexInfo.get();  // Line 107 - Can be NULL
      Index index = (Index) lastIndexUsed.get();
      if (index != null) {
          IndexInfo indexInfo = (IndexInfo) indexMap.get(...);  // Line 110 - NPE!
          if (indexInfo != null)
{               indexInfo.getResults().put(index.getRegion().getFullPath(), results.size());           }
      }
      // ...
  }

  The ThreadLocal indexMap is initialized in beforeIndexLookup() (lines 49-52), but in certain scenarios involving partitioned region queries, afterIndexLookup() can be called without beforeIndexLookup() having successfully
   completed, leaving the ThreadLocal uninitialized.

 Expected Behavior

  - afterIndexLookup() should handle cases where beforeIndexLookup() did not successfully initialize the ThreadLocal
  - Query execution should complete successfully
  - Application startup should not fail due to query observer issues

 Actual Behavior

  Application fails with:

  java.lang.NullPointerException: Cannot invoke "java.util.Map.get(Object)" because "indexMap" is null
          at org.apache.geode.cache.query.internal.IndexTrackingQueryObserver.afterIndexLookup(IndexTrackingQueryObserver.java:110)

  Full stack trace:
java.lang.NullPointerException: Cannot invoke "java.util.Map.get(Object)" because "indexMap" is null
          at org.apache.geode.cache.query.internal.IndexTrackingQueryObserver.afterIndexLookup(IndexTrackingQueryObserver.java:110)
          at org.apache.geode.cache.query.internal.CompiledComparison.singleBaseCollectionFilterEvaluate(CompiledComparison.java:502)
          at org.apache.geode.cache.query.internal.CompiledComparison.filterEvaluate(CompiledComparison.java:181)
          at org.apache.geode.cache.query.internal.CompiledComparison.filterEvaluate(CompiledComparison.java:227)
          at org.apache.geode.cache.query.internal.CompiledSelect.evaluate(CompiledSelect.java:532)
          at org.apache.geode.cache.query.internal.CompiledSelect.evaluate(CompiledSelect.java:53)
          at org.apache.geode.cache.query.internal.DefaultQuery.executeUsingContext(DefaultQuery.java:357)
          at org.apache.geode.internal.cache.PRQueryProcessor.executeQueryOnBuckets(PRQueryProcessor.java:248)
          at org.apache.geode.internal.cache.PRQueryProcessor.executeSequentially(PRQueryProcessor.java:214)
          at org.apache.geode.internal.cache.PRQueryProcessor.executeQuery(PRQueryProcessor.java:123)
          at org.apache.geode.internal.cache.PartitionedRegionQueryEvaluator.executeQueryOnLocalNode(PartitionedRegionQueryEvaluator.java:986)
          at org.apache.geode.internal.cache.PartitionedRegionQueryEvaluator.executeQueryOnRemoteAndLocalNodes(PartitionedRegionQueryEvaluator.java:376)
          at org.apache.geode.internal.cache.PartitionedRegionQueryEvaluator.queryBuckets(PartitionedRegionQueryEvaluator.java:493)
          at org.apache.geode.internal.cache.PartitionedRegion.doExecuteQuery(PartitionedRegion.java:2134)
          at org.apache.geode.internal.cache.PartitionedRegion.executeQuery(PartitionedRegion.java:2061)
          at org.apache.geode.cache.query.internal.DefaultQuery.execute(DefaultQuery.java:241)

Suggested Fix

  Add a null check in afterIndexLookup() method to handle uninitialized ThreadLocal:

  File: geode-core/src/main/java/org/apache/geode/cache/query/internal/IndexTrackingQueryObserver.java

  Change at lines 97-120:

@Override
  public void afterIndexLookup(Collection results) {
      if (results == null) {           // according to javadocs in QueryObserver, can be null if there           // is an exception           return;       }
      // append the size of the lookup results (and bucket id if its an Index on bucket)
      // to IndexInfo results Map.
      Map indexMap = (Map) indexInfo.get();
      // Guard against uninitialized ThreadLocal in partitioned queries
      if (indexMap == null) {           // beforeIndexLookup was not called or did not complete successfully           // This can occur in partitioned region query execution across buckets           return;       }
      Index index = (Index) lastIndexUsed.get();
      if (index != null) {
          IndexInfo indexInfo = (IndexInfo) indexMap.get(getIndexName(index, lastKeyUsed.get()));
          if (indexInfo != null) {               indexInfo.getResults().put(index.getRegion().getFullPath(), results.size());           }
      }
      lastIndexUsed.set(null);
      lastKeyUsed.set(null);
      if (th != null)
{           th.hook(3);       }
  }

  Rationale for Fix

  1. Defensive Coding: Other methods like getUsedIndexes() (line 150-154) already check for null before using the ThreadLocal
  2. Consistency: The pattern if (results == null) return; is already used at the method start
  3. Minimal Impact: The fix is a simple guard clause that doesn't change behavior when ThreadLocal is properly initialized
  4. Safe Degradation: When the ThreadLocal is null, gracefully skip tracking instead of crashing

 Workaround

  Until this bug is fixed, users can work around it by disabling query verbose mode:

  Option 1: JVM System Property
  -Dgemfire.Query.VERBOSE=false

Setting gemfire.Query.VERBOSE=false completely disables IndexTrackingQueryObserver creation and uses QueryObserverAdapter (with no-op methods) instead, avoiding the bug entirely.

 Additional Notes

  1. Related Code Locations:
    - IndexTrackingQueryObserver.java: Lines 42 (ThreadLocal declaration), 107-110 (bug location)
    - CompiledComparison.java: Lines 383-387 (beforeIndexLookup call), 500-504 (afterIndexLookup call in finally)
    - DefaultQuery.java: Line 781 (IndexTrackingQueryObserver instantiation)
  2. Similar Patterns: The getUsedIndexes(String fullPath) method already implements the null check pattern at line 209-212, which should be mirrored in afterIndexLookup()
 

For all changes, please confirm:

  • [X ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? GEODE-10526
  • [ X] Has your PR been rebased against the latest commit within the target branch (typically develop)? Yes
  • [X ] Is your initial contribution a single, squashed commit? Yes
  • [X ] Does gradlew build run cleanly? Yes
  • [X ] Have you written or updated unit tests to verify your changes? yes
  • [ X] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0? No

…llPointerException when indexMap ThreadLocal is uninitialized in partitioned region queries
@leonfin
Copy link
Contributor Author

leonfin commented Nov 23, 2025

@JinwooHwang could you please review?

@JinwooHwang JinwooHwang self-requested a review November 23, 2025 11:12
@JinwooHwang
Copy link
Contributor

@leonfin . Excellent fix for a critical edge case!

This properly addresses the NullPointerException in partitioned region queries where afterIndexLookup() can be called without a corresponding beforeIndexLookup() due to distributed execution across threads/nodes.

The fix follows the right principle: diagnostic/observability code (index tracking) should never cause functional failures. By gracefully handling the uninitialized ThreadLocal, queries now succeed even when index tracking data is unavailable.

The added test cases clearly demonstrate both the failure scenario and the fix. Well-documented inline comments explain exactly when this can occur in production.

I appreciate your great work ensuring query reliability!

@JinwooHwang
Copy link
Contributor

All verification checks have been initiated.

Copy link
Contributor

@JinwooHwang JinwooHwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All checks have passed. Merge approved.

@JinwooHwang JinwooHwang merged commit b36cad4 into apache:develop Nov 23, 2025
15 checks passed
@leonfin
Copy link
Contributor Author

leonfin commented Nov 24, 2025

@JinwooHwang thank you!

@JinwooHwang
Copy link
Contributor

Thank you for promptly identifying the defect and delivering an effective remediation, @leonfin. Your timely contribution is greatly appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants