Append-only notes for agents working in microplex core.
- Generalization push moved benchmark logic deeper into core.
- Added and adopted shared target-layer modules:
  - `src/microplex/targets/bundles.py`
  - `src/microplex/targets/reweighting.py`
  - `src/microplex/targets/benchmarking.py`
  - `src/microplex/targets/provider.py`
  - `src/microplex/targets/spec.py`
- Core now owns:
  - shared benchmark metric math
  - comparison math on common target intersections
  - benchmark result/comparison/suite serialization
  - payload-based slice evaluation helpers
  - result-oriented slice evaluation and suite builders
- Fixed target-record aggregation so duplicate target names across slices are no longer silently resolved by last-writer-wins.
- Unified zero-target relative-error handling by routing reweighting diagnostics through shared `relative_error_ratio(...)`.
- Removed core-owned grouped-summary metadata defaults. Country packs must now pass `group_fields` explicitly.
- Public assigner partition APIs were added in `src/microplex/geography.py` so UK no longer reaches into private assigner internals.
- Legacy US-specific targets DB implementation was moved out of core into `microplex-us`; core `targets/database.py` is now compatibility territory and should not expand.
- Do not accept changes that reintroduce country-specific assumptions into core.
- Benchmark math changes require explicit test coverage.
- Prefer adding focused regressions over broad integration-only confidence.
- UK and US still diverge at the PolicyEngine execution layer.
- Suite-level metrics and composite-loss semantics are not identical across countries; do not present them as interchangeable without context.
- Any future change to benchmark/result normalization should be checked in both country packs immediately.
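
Two of the benchmark-math points above (comparison on common target intersections, and one shared code path for zero-target relative error) can be illustrated with a minimal sketch. Every name here is hypothetical: `sketch_relative_error` is not the real `relative_error_ratio(...)`, and the zero-target convention shown (exact match scores 0, anything else scores infinity) is an assumption for illustration, not a statement of what core does.

```python
from __future__ import annotations

import math


def sketch_relative_error(estimate: float, target: float) -> float:
    """Hypothetical stand-in; NOT the real relative_error_ratio(...)."""
    if target == 0.0:
        # Assumed convention: hitting a zero target exactly is a perfect score,
        # and any other estimate is reported as an infinite ratio.
        return 0.0 if estimate == 0.0 else math.inf
    return abs(estimate - target) / abs(target)


def sketch_compare_on_common_targets(
    targets: dict[str, float],
    candidate: dict[str, float],
    baseline: dict[str, float],
) -> dict[str, tuple[float, float]]:
    """Score two estimate sets only on target names present in all three inputs."""
    common = sorted(targets.keys() & candidate.keys() & baseline.keys())
    return {
        name: (
            # Both sides go through the same helper, so zero targets are
            # handled identically for candidate and baseline.
            sketch_relative_error(candidate[name], targets[name]),
            sketch_relative_error(baseline[name], targets[name]),
        )
        for name in common
    }
```

A focused regression in this style would pin the zero-target case and the intersection behavior separately, so a change to either one fails a narrow test rather than only an integration run.
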
- Added a shared benchmark-artifact manifest contract in:
  - `src/microplex/targets/artifacts.py`
- Core now exposes:
  - `validate_benchmark_artifact_manifest(...)`
  - `assert_valid_benchmark_artifact_manifest(...)`
  - `BenchmarkArtifactValidationResult`
- The contract is intentionally small:
  - required top-level manifest structure
  - explicit summary section (`benchmark` or `policyengine_harness`)
  - required summary keys for benchmarked bundles
  - existence checks for referenced artifact files
- This is meant to centralize enforcement, not force US and UK into one identical payload schema.
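
To make the shape of such a small contract concrete, here is a minimal sketch of the four checks listed above. It is not the code in `src/microplex/targets/artifacts.py`: the function name, the result class, the `artifacts` key, and the `n_targets` summary key are all assumptions chosen for illustration, and the real signatures and required keys may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path

# Hypothetical constants; only the two summary section names come from the notes.
SUMMARY_SECTIONS = ("benchmark", "policyengine_harness")
REQUIRED_TOP_LEVEL = ("artifacts",)      # assumed key name
REQUIRED_SUMMARY_KEYS = ("n_targets",)   # assumed key name


@dataclass
class SketchValidationResult:
    """Hypothetical result type; not BenchmarkArtifactValidationResult."""

    errors: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def sketch_validate_manifest(manifest: dict, root: Path) -> SketchValidationResult:
    result = SketchValidationResult()

    # 1. Required top-level structure.
    for key in REQUIRED_TOP_LEVEL:
        if key not in manifest:
            result.errors.append(f"missing top-level key: {key}")

    # 2. An explicit summary section must be present (assumed: at least one).
    present = [name for name in SUMMARY_SECTIONS if name in manifest]
    if not present:
        result.errors.append("missing summary section")

    # 3. Required keys inside each summary section that is present.
    for name in present:
        for key in REQUIRED_SUMMARY_KEYS:
            if key not in manifest[name]:
                result.errors.append(f"{name}: missing summary key: {key}")

    # 4. Every referenced artifact file must exist under the root directory.
    for label, rel_path in manifest.get("artifacts", {}).items():
        if not (root / rel_path).is_file():
            result.errors.append(f"artifact not found: {label} -> {rel_path}")

    return result
```

Collecting errors into a result object, rather than raising on the first failure, leaves room for a thin asserting wrapper on top and lets each country pack keep its own payload fields outside the checked keys.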