Document AI contribution rules and modernise hashing + test suite in Chronicle-Algorithms #32
This PR adds contributor guidance (for humans and AI tools), replaces the Markdown README with a richer AsciiDoc version, bumps the shared BOM, and refactors core hashing and test code to be more robust and maintainable while preserving behaviour.
Functional changes
Hashing implementations
CityHash 1.1 (`CityHash_1_1`)
Replaces the static `NATIVE_CITY` singleton with a `nativeCity()` helper that selects the appropriate implementation based on endianness, improving clarity while keeping behaviour the same.
Refactors the `cityHash64` long-path logic to make `vFirst`/`vSecond`, `wFirst`/`wSecond`, `x`, `y`, and `z` clearer and more obviously aligned with the reference algorithm.
`AsLongHashFunctionSeeded.voidHash` is now a non-transient field, so the precomputed hash for empty input is preserved across serialisation. Previously, it was recomputed or defaulted after deserialisation; now serialised instances retain their original `voidHash` value.
MurmurHash3 (`MurmurHash_3`)
Replaces the static `NATIVE_MURMUR` with a `nativeMurmur()` helper, mirroring the CityHash change and centralising native-endianness selection.
Simplifies the main `hash` implementation:
- Uses a `blockOffset` variable inside the 16-byte loop for clearer offset management.
- Handles the tail in `tailK1(...)` and `tailK2(...)`, each with a compact `switch` that is explicitly marked with `@SuppressWarnings("fallthrough")`.
- This simplifies the `switch` and reduces duplication while keeping the hash output stable.
As with CityHash, `AsLongHashFunctionSeeded.voidHash` is no longer transient, so empty-input hash precomputation survives serialisation.
xxHash (`XxHash_r39`)
Replaces static `NATIVE_XX` with a `nativeXx()` helper to choose the correct endianness implementation.
Refactors `hashLong`, `hashInt`, and `hashShort` to:
- Call `nativeXx().toLittleEndian(...)` explicitly.
- Make the steps in `hashLong` clearer (compute mixed value, then initialise `hash`, then fold).
These changes are intended to maintain bit-for-bit compatibility while improving readability and making the endianness boundary more explicit.
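The native-endianness helpers described above can be pictured with a minimal sketch. It is not the code from this PR: `EndianAdapter`, `LITTLE`, `BIG` and `nativeAdapter()` are hypothetical names standing in for the per-endianness hash implementations and for helpers such as `nativeXx()`. The sketch only assumes that a natively read value is already little-endian on little-endian hardware and needs a byte swap otherwise.

```java
import java.nio.ByteOrder;

class NativeEndianSketch {
    /** Hypothetical stand-in for a per-endianness hash implementation. */
    interface EndianAdapter {
        long toLittleEndian(long value);
        int toLittleEndian(int value);
    }

    /** On little-endian hardware, natively read values are already in the right order. */
    static final EndianAdapter LITTLE = new EndianAdapter() {
        @Override public long toLittleEndian(long value) { return value; }
        @Override public int toLittleEndian(int value) { return value; }
    };

    /** On big-endian hardware, natively read values must be byte-swapped. */
    static final EndianAdapter BIG = new EndianAdapter() {
        @Override public long toLittleEndian(long value) { return Long.reverseBytes(value); }
        @Override public int toLittleEndian(int value) { return Integer.reverseBytes(value); }
    };

    /** Helper in the spirit of nativeXx(): selects the adapter for the running platform. */
    static EndianAdapter nativeAdapter() {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? LITTLE : BIG;
    }

    public static void main(String[] args) {
        long raw = 0x0102030405060708L;
        // The printed value depends on the platform's byte order.
        System.out.println(Long.toHexString(nativeAdapter().toLittleEndian(raw)));
    }
}
```

Routing every conversion through one helper keeps the endianness decision in a single place, which is the readability benefit this section claims for the refactor.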
Locking behaviour test
`LockingStrategyTest`
In `testUpdateLockUpgradeToWriteLock`, the write-lock downgrade on the first executor is now awaited via `.get()`. This ensures the downgrade completes before subsequent assertions, reducing the chance of flaky behaviour due to out-of-order task completion in multi-threaded tests.
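As an illustration of why awaiting the task matters, here is a minimal sketch using only `java.util.concurrent`. It does not reproduce the test itself: the lock is modelled by a hypothetical `AtomicInteger` state, and the class and constant names are assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class AwaitDowngradeSketch {
    // Hypothetical lock states standing in for the real locking strategy.
    static final int WRITE_LOCK = 2;
    static final int UPDATE_LOCK = 1;

    /** Downgrades on one executor, then observes the state from another. */
    static int downgradeThenObserve() {
        AtomicInteger lockState = new AtomicInteger(WRITE_LOCK);
        ExecutorService first = Executors.newSingleThreadExecutor();
        ExecutorService second = Executors.newSingleThreadExecutor();
        try {
            Future<?> downgrade = first.submit(() -> lockState.set(UPDATE_LOCK));
            // Without this get(), the task on the second executor could run
            // before the downgrade and still observe WRITE_LOCK.
            downgrade.get();
            Callable<Integer> observe = lockState::get;
            return second.submit(observe).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            first.shutdown();
            second.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(downgradeThenObserve()); // prints 1
    }
}
```

`Future.get()` blocks until the submitted task has finished, so anything that runs afterwards is ordered after the downgrade.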
Non-functional changes
New contributor guidance
`AGENTS.md`
New, company-wide guidance for AI agents, bots and humans contributing to Chronicle / OpenHFT projects:
Language and character-set rules:
Javadoc guidelines emphasising:
Build and test requirements:
- `mvn -q verify` must pass before raising a PR.
Commit/PR etiquette and what to ask reviewers (traceability to decisions, requirements, docs).
Real-time documentation practices (keeping AsciiDoc, tests and code in sync).
AI-agent-specific rules (avoid redundancy, always review AI output, keep to ISO-8859-1, etc.).
Company-wide tag taxonomy (FN, NF-P, NF-S, NF-O, TEST, DOC, OPS, UX, RISK, ALL-*) and a standard decision-record template.
`CLAUDE.md`
Repository-local guide for Claude Code:
Overview of Chronicle-Algorithms (hashing, bit sets, bytes access, off-heap locks).
Build, test and coverage targets (JaCoCo line/branch coverage).
Architecture summary:
- `LongHashFunction`.
- `Access`/`Accessor` abstraction for raw bytes.
- `BitSet` / frames / reusable wrappers.
Dependency overview and test stack (JUnit 4/5, Mockito).
These documents are documentation only and do not affect runtime behaviour, but they should materially improve the quality and consistency of future changes.
Documentation overhaul
`README.md` → `README.adoc`
Replace the short Markdown README with a comprehensive AsciiDoc README:
Adds project description, features, and usage examples for:
- Hashing (`LongHashFunction` usage on Strings, primitives, arrays, and off-heap memory).
- Bit sets (`BitSetFrame`, `ReusableBitSet`, `SingleThreadedFlatBitSetFrame`, `ConcurrentFlatBitSetFrame`).
- Bytes access (`Access`/`Accessor` pattern and supported sources).
Documents Maven coordinates and points to Maven Central for current version.
Adds standard badges for Maven Central, Javadoc and Apache 2.0 licence.
Includes build instructions and links to related Chronicle projects.
This aligns the project’s entry-point documentation with other Chronicle modules and better reflects current capabilities.
Dependency alignment
`pom.xml`
Bumps `net.openhft:third-party-bom` from `3.27ea5` to `3.27ea7`.
Code quality and style
General clean-ups
`MemoryUnit`, `SingleThreadedFlatBitSetFrame`: comment alignment and section markers for readability only.
`HotSpotStringAccessor.handle`: removes an unnecessary `(T)` cast. Behaviour is unchanged; the method still returns the underlying backing array for the String on supported HotSpot versions.
`LongHashFunction`: adjust annotation placement on array parameters (e.g. `boolean @NotNull [] input`) to match modern style and static-analysis expectations.
`TryAcquireOperations`: replace anonymous inner classes with concise method references (e.g. `LockingStrategy::tryLock`), without changing semantics.
Test and tooling improvements
Hashing reference data
Add reference Java files under `src/test/resources/net/openhft/chronicle/algo/hashing/reference`:
- `City64_1_1_Test.java`
- `XxHashTest.java`
These contain known-good output arrays.
They serve as reference fixtures for validating our implementations against upstream C reference code (and help justify the refactors to tail and endianness handling).
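The fixture idea can be sketched on a toy scale. The example below is an assumption-laden illustration, not the project's test code: a loop-based `tailReference` plays the role of the known-good outputs, and `tailSwitch` plays the role of a refactored tail handler that uses an intentional fall-through `switch`. Both pack up to seven trailing bytes into a `long` in little-endian order, a simplification of how MurmurHash3-style tails are assembled.

```java
class TailFixtureSketch {
    /** Reference form: straightforward loop, treated here as the known-good output. */
    static long tailReference(byte[] data, int offset, int length) {
        long k = 0;
        for (int i = length - 1; i >= 0; i--) {
            k = (k << 8) | (data[offset + i] & 0xFFL);
        }
        return k;
    }

    /** Refactored form: compact switch with intentional fall-through (length 0..7). */
    @SuppressWarnings("fallthrough")
    static long tailSwitch(byte[] data, int offset, int length) {
        long k = 0;
        switch (length) {
            case 7: k ^= (data[offset + 6] & 0xFFL) << 48; // fall through
            case 6: k ^= (data[offset + 5] & 0xFFL) << 40; // fall through
            case 5: k ^= (data[offset + 4] & 0xFFL) << 32; // fall through
            case 4: k ^= (data[offset + 3] & 0xFFL) << 24; // fall through
            case 3: k ^= (data[offset + 2] & 0xFFL) << 16; // fall through
            case 2: k ^= (data[offset + 1] & 0xFFL) << 8;  // fall through
            case 1: k ^= (data[offset] & 0xFFL);
                break;
            default:
                break;
        }
        return k;
    }

    /** Compares both forms for every tail length 0..7 at several offsets. */
    static int countMismatches() {
        byte[] data = new byte[16];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 37 + 11);
        }
        int mismatches = 0;
        for (int offset = 0; offset <= 8; offset++) {
            for (int length = 0; length <= 7; length++) {
                if (tailReference(data, offset, length) != tailSwitch(data, offset, length)) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println(countMismatches()); // prints 0
    }
}
```

Covering every tail length is the important part: a missing or misordered `case` only shows up for inputs whose length hits that branch.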
JUnit modernisation
Migrate older JUnit 3 style tests to JUnit 5 Jupiter:
- Remove `extends TestCase`.
- Use `@BeforeEach`/`@Test` from Jupiter.
- Replace `assertEquals("msg", expected, actual)` and `instanceof` checks with:
  - `assertTrue`, `assertFalse`, `assertEquals` from `org.junit.jupiter.api.Assertions`.
  - `assertInstanceOf` where appropriate.
Updated classes include (non-exhaustive list):
- `BitSetTest`
- `ConcurrentFlatBitSetFrameTest`
- `DirectBitSetTest`
- `FlatBitSetAlgorithmTest`
- `ReusableBitSetTest`
- `AccessTest`
- `AccessUtilitiesTest`
- `ByteBufferAccessTest`
- `CharSequenceAccessTest`
- `RandomDataInputAccessTest`
- `RandomDataOutputAccessTest`
- `WriteAccessTest` (assumptions now use `org.junit.jupiter.api.Assumptions.assumeFalse`)
Stronger test assertions
`ReusableBitSetTest`:
- When iterating `BitSet.Bits`, now asserts that the number of visited bits equals `bs.cardinality()`, not just that the iterator runs. This gives stronger coverage of the iterator implementation.
`AccessUtilitiesTest`:
- `BytesStore` in `Full` access tests.
- Uses `assertInstanceOf` instead of `instanceof` checks for `HotSpotStringAccessor` handles.
`LockingStrategyTest`: the write-lock downgrade is now awaited, as described under Locking behaviour test.
Randomness and performance in tests
`City64MoreTest`, `MurmurHash3MoreTest`, `MurmurHash3Test`:
- Switch from `SecureRandom`/`Random` to `ThreadLocalRandom` for generating test data, reducing overhead in long-running randomness tests while maintaining sufficient randomness for statistical checks.
- Use floating-point division (e.g. `scoreSum / 500.0`, `(time / (double) timeCount) / 1e3`) for improved accuracy.
Misc test code hygiene
- Make `RandomDataOutputAccessTest.RandomDataOutputImpl` `static` to avoid accidental implicit references.
- A clean-up involving `offset` in `BitSetFixture`.
- Other small fixes (use `List<Object[]>` instead of raw `ArrayList` in parameterised tests, escape `#include <stdlib.h>` as `&lt;...&gt;` in Javadoc).
Risk & compatibility
Core hashing algorithms (CityHash, MurmurHash3, xxHash) have been refactored but are expected to remain bit-for-bit compatible. The new reference data and existing tests should catch deviations.
Making `voidHash` non-transient in seeded hash implementations is a behaviourally observable change for serialised instances:
- Instances now retain `voidHash` across serialise/deserialise cycles.
The BOM bump may change transitive dependency versions, but there are no intentional API-level changes in this module that depend on new behaviour.
All other changes are documentation, test, or style-only and should not affect production runtime behaviour.
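The `voidHash` serialisation point above can be illustrated with plain Java serialisation. This is a sketch under stated assumptions, not the project's classes: `TransientHolder` and `PersistentHolder` are hypothetical holders that differ only in whether the field is `transient`, and the transient case shows the "defaulted" outcome (a class could instead recompute the value in custom deserialisation logic).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class VoidHashSerialisationSketch {
    /** Hypothetical holder with a transient field: the value is not written out. */
    static class TransientHolder implements Serializable {
        private static final long serialVersionUID = 1L;
        transient long voidHash;
        TransientHolder(long voidHash) { this.voidHash = voidHash; }
    }

    /** Hypothetical holder with a regular field: the value is part of the stream. */
    static class PersistentHolder implements Serializable {
        private static final long serialVersionUID = 1L;
        long voidHash;
        PersistentHolder(long voidHash) { this.voidHash = voidHash; }
    }

    /** Serialises the value to a byte array and reads it back. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new TransientHolder(42L)).voidHash);  // prints 0
        System.out.println(roundTrip(new PersistentHolder(42L)).voidHash); // prints 42
    }
}
```

A transient `long` comes back as `0` after deserialisation because the constructor is not re-run, whereas the non-transient field carries its original value through the stream.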