Skip to content

Conversation

@martinfrancois
Copy link

I fixed a concurrency bug in NamespacedHierarchicalStore.computeIfAbsent where the defaultCreator function was executed while holding the store's internal map lock. Under parallel execution, this could cause threads using the store to block each other and temporarily see a missing or incorrectly initialized state for values created via computeIfAbsent.

Concretely, the changes are:

  • I reworked NamespacedHierarchicalStore.computeIfAbsent so that:

    • It wraps defaultCreator in a MemoizingSupplier and installs that supplier via the map operation.
    • The supplier is only evaluated after the ConcurrentHashMap update has completed, so user code is no longer run while holding the internal map lock.
    • It uses the previously observed StoredValue (storedValue) and the current mapping in the store (oldStoredValue) to decide when it is safe to install a new value:
      • If there is no mapping in the current store (oldStoredValue == null), it installs a new value.
      • If the current mapping is precisely the one previously observed (oldStoredValue == storedValue, by reference equality), it overwrites that mapping (for example, when it is evaluated to null).
      • Otherwise, it keeps the existing mapping because another thread has already installed a different value.
    • If evaluating the new value fails, it removes the newly installed StoredValue from the map so that subsequent calls can retry.
    • This aligns computeIfAbsent with the behavior of the deprecated getOrComputeIfAbsent, preserving the intended "one initialization per key" semantics while avoiding execution of user code inside the map's critical section.
  • I introduced a CollidingKey helper type in the tests to deterministically force different keys into the same ConcurrentHashMap bucket.

  • I added a regression test computeIfAbsentDoesNotDeadlockWithCollidingKeys in NamespacedHierarchicalStoreTests:

    • It starts two threads that call computeIfAbsent for two different CollidingKey instances in the same namespace.
    • It uses latches so that the first thread's defaultCreator waits for the second, while the second only proceeds after the first has started.
    • It asserts that the first thread does not time out, ensuring that computeIfAbsent does not block subsequent computations on colliding keys.
  • I added an equivalent test getOrComputeIfAbsentDoesNotDeadlockWithCollidingKeys for the deprecated getOrComputeIfAbsent:

    • It mirrors the same scenario with CollidingKey and latches.
    • It demonstrates that getOrComputeIfAbsent does not exhibit this deadlock pattern, because it already evaluates defaultCreator outside the critical section.
    • It documents the behavioral difference that previously existed and why aligning computeIfAbsent with getOrComputeIfAbsent is the correct fix.
  • I added computeIfAbsentOverridesParentNullValue to verify parent/child store semantics purely in terms of computeIfAbsent, put, and get:

    • It stores a null value in the parent via parent.put(namespace, key, null).
    • It verifies that the child initially sees null via child.get(namespace, key).
    • It then calls child.computeIfAbsent(namespace, key, __ -> "value") and asserts that the call returns "value" and that subsequent child.get(namespace, key) also returns "value".
    • This ensures that a null value in the parent is treated as "logically absent" for computeIfAbsent and does not prevent the child from installing its own non-null value.
  • I verified the behavior of the new tests against both the old and new implementations:

    • With the original computeIfAbsent implementation, computeIfAbsentDoesNotDeadlockWithCollidingKeys fails, while getOrComputeIfAbsentDoesNotDeadlockWithCollidingKeys and the existing tests (including simulateRaceConditionInComputeIfAbsent) pass.
    • With the updated computeIfAbsent implementation, all tests pass.
    • This shows that the new tests specifically capture the concurrency issue in computeIfAbsent and that the fix restores the expected behavior without changing getOrComputeIfAbsent.
    • simulateRaceConditionInComputeIfAbsent did not catch this issue because it only exercises contention on a single key and relies on ConcurrentHashMap's per-key atomicity; it does not force different keys into the same bucket or run user code in a way that re-enters the store while the map lock is held, so the problematic interaction never occurs in that test
  • This change should also fix the flakiness observed in AssertJ's SoftAssertionsExtension_PER_CLASS_Concurrency_Test (SoftAssertionsExtension_PER_CLASS_Concurrency_Test flaky test assertj/assertj#1996). In that scenario, two tests run in parallel, and SoftAssertionsExtension uses the JUnit ExtensionContext store (backed by NamespacedHierarchicalStore) to obtain per-test AssertionErrorCollector instances via computeIfAbsent. With the old implementation, concurrent initialization under the map lock could block or race so that collectors were sometimes never registered, and the engine-level statistics occasionally reported started(0) / failed(0) instead of started(2) / failed(2). With the updated computeIfAbsent, the store computes each collector outside the lock and only once per key, so the collectors are always stored and the test statistics are stable even when the tests execute in parallel.

Fixes #5171


I hereby agree to the terms of the JUnit Contributor License Agreement.


Definition of Done

Copy link
Contributor

@Pankraz76 Pankraz76 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great fix. Precisely executed and documented without any flaw, thanks a lot.

+1

@testlens-app
Copy link

testlens-app bot commented Dec 8, 2025

✅ All tests passed ✅

🏷️ Commit: e7357f7
▶️ Tests: 62304 executed
⚪️ Checks: 15/15 completed

@martinfrancois
Copy link
Author

martinfrancois commented Dec 8, 2025

You're welcome @Pankraz76, thanks as well for the praise and the review, I really appreciate it! :)

Copy link
Contributor

@Pankraz76 Pankraz76 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 feat. complete.

Well done, thanks again for dedication leading to incrementation.

Now its just about polish, giving optional potential dedication - striving for excellence.

But also this is danger land, might better to extract this into clean PR afterwards. Scout principle is nice, still tend ppl. to tilt on this, likely to be overwhelmed.

@SuppressWarnings("ReferenceEquality")
@API(status = MAINTAINED, since = "6.0")
public <K, V> Object computeIfAbsent(N namespace, K key, Function<? super K, ? extends V> defaultCreator) {
Preconditions.notNull(defaultCreator, "defaultCreator must not be null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Preconditions.notNull(defaultCreator, "defaultCreator must not be null");
notNull(defaultCreator, MANDATORY_DEFAULT_CREATOR);

imho, useless coupling, context and noisy burden.

Copy link
Author

@martinfrancois martinfrancois Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to do this (Preconditions.notNull => notNull) too, but since it was like this in the code already I wasn't sure if it was a convention here, as I imagine it could potentially be confused with requireNonNull or it could be intentional to make it clear it's a precondition. Could a maintainer please give a second opinion here? I'd be glad to change it.

rejectIfClosed();
return computedValue;
});
return Preconditions.notNull(defaultCreator.apply(key), "defaultCreator must not return null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return Preconditions.notNull(defaultCreator.apply(key), "defaultCreator must not return null");
return notNull(defaultCreator.apply(key), MANDATORY_DEFAULT_CREATOR_VALUE);

@API(status = MAINTAINED, since = "1.13.3")
public final class NamespacedHierarchicalStore<N> implements AutoCloseable {

private final AtomicInteger insertOrderSequence = new AtomicInteger();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static final String MANDATORY_DEFAULT_CREATOR_VALUE = "defaultCreator must not return `null`";
private static final String MANDATORY_DEFAULT_CREATOR = "defaultCreator must not be `null`";
private final AtomicInteger insertOrderSequence = new AtomicInteger();

Copy link
Author

@martinfrancois martinfrancois Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there aren't that many constants in the code for repeating strings like this I'm not sure this is on purpose - I'd agree with that change, can a maintainer please give a second opinion here?

Copy link
Contributor

@Pankraz76 Pankraz76 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2cts.

going fully functional separating the concerns (SoC/SRP).

`computeIfAbsent` previously invoked `defaultCreator` while holding the
store's internal map lock. Under parallel execution this could cause
threads using the store to block each other and temporarily see missing
or incorrectly initialized state for values created via
`computeIfAbsent`.

The implementation now wraps `defaultCreator` in a `MemoizingSupplier`
and installs that supplier via the map operation, evaluating it only
after the update has completed. This avoids running user code while
holding the lock and aligns `computeIfAbsent`'s behavior with the
deprecated `getOrComputeIfAbsent`, preserving the intended "one
initialization per key" semantics.

Issue: junit-team#5171
Signed-off-by: martinfrancois <f.martin@fastmail.com>
@martinfrancois
Copy link
Author

You’re welcome, and thanks again, @Pankraz76, for taking another careful look.
Your suggestions are great, but as you pointed out, they mostly target existing code that could be cleaned up independently. I’d like to keep this PR focused on the bugfix so we do not drag out the review with additional refactoring.
Once this is merged, I am happy to follow up with a separate cleanup PR. I do not want to delay AssertJ being able to update to JUnit 6 again any longer than necessary 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Concurrency problem in NamespacedHierarchicalStore#computeIfAbsent

2 participants