Add source-weight diagnostics and PUF clone-dominance gate

## Context

PolicyEngine/policyengine-us-data#1151 exposed a class of failure where support/clone rows can dominate final Enhanced CPS weights even when headline household totals look plausible. Microplex uses PUF and other donor sources differently, but the same production risk exists: a candidate can pass target-fit checks while final calibrated mass is concentrated in a support source or clone-like donor subset.

Microplex should keep source/provenance diagnostics out of the exported PolicyEngine H5, but it still needs a separate diagnostics artifact and a release gate that catch this failure mode.

## Proposed Microplex behavior

- During canonical mp build/scoring, write a sidecar source-weight diagnostics artifact outside the H5.
- Summarize final calibrated mass by construction source or source class, at least:
  - household count and weight share
  - person count and weight share
  - tax-unit count and weight share where available
  - positive-weight row counts
  - top source/source-class concentration
- Include PUF-specific diagnostics where supported by construction metadata:
  - original CPS/ASEC-like rows vs PUF donor/support rows
  - any explicit support-clone or donor-replay subset
  - top-tail/fixed-spine rows as a distinct fixed source, not mixed with ordinary PUF support
- Add an mp artifact gate that fails when source/support rows dominate final weight beyond configured thresholds, unless the build is explicitly marked as experimental.
- Keep all source/provenance fields out of the exported PolicyEngine model variables; diagnostics live in sidecars and manifests only.
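
As a rough illustration of the summary described above, here is a minimal sketch for one entity level (for example households). The function name, the output keys, and the source-class labels (`cps_asec`, `puf_support`, `top_tail_spine`) are hypothetical and only meant to show the shape of the calculation, not the actual Microplex schema.

```python
from collections import defaultdict


def summarize_source_weights(rows):
    """Summarize calibrated weight by source class.

    `rows` is an iterable of (source_class, weight) pairs for one entity
    level. All key names in the returned dict are illustrative assumptions.
    """
    row_count = defaultdict(int)
    positive_count = defaultdict(int)
    weight_sum = defaultdict(float)

    for source, weight in rows:
        row_count[source] += 1
        weight_sum[source] += weight
        if weight > 0:
            positive_count[source] += 1

    total = sum(weight_sum.values())
    by_source = {
        source: {
            "row_count": row_count[source],
            "positive_weight_row_count": positive_count[source],
            "weight": weight_sum[source],
            # Guard against an all-zero weight vector.
            "weight_share": weight_sum[source] / total if total > 0 else 0.0,
        }
        for source in sorted(row_count)
    }

    # Top source-class concentration: the class holding the largest share.
    top = max(by_source, key=lambda s: by_source[s]["weight_share"], default=None)
    return {
        "by_source": by_source,
        "top_source": top,
        "top_source_weight_share": by_source[top]["weight_share"] if top else 0.0,
    }


# Hypothetical household rows: (source class, final calibrated weight).
example = summarize_source_weights(
    [
        ("cps_asec", 100.0),
        ("cps_asec", 50.0),
        ("puf_support", 600.0),
        ("puf_support", 0.0),
        ("top_tail_spine", 250.0),
    ]
)
```

In this made-up example the `puf_support` class holds most of the weight even though only one of its two rows has positive weight, which is the kind of concentration the gate is meant to surface. The same function could be run separately for persons and tax units.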

## Acceptance criteria

- A canonical build writes a stable JSON sidecar, e.g. `source_weight_diagnostics.json`, referenced from `manifest.json`.
- `mp300k_artifact_gates` consumes the sidecar and reports a `source_weight_diagnostics` gate.
- The gate is required for replacement candidates and is reported as unmeasured/failing when the sidecar is missing.
- Tests cover a passing balanced-source payload and a failing clone/support-dominance payload.
- Export guards continue to reject source diagnostic variables in the H5.
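
One way the gate logic and the two test payloads could look is sketched below. Everything here is an assumption for illustration: the function names, the `by_source`/`weight_share` payload layout, the `puf_support` label, the 0.5 threshold, and the choice to report an experimental build as `pass` with an override flag. The real `mp300k_artifact_gates` interface may differ.

```python
import json
from pathlib import Path


def load_sidecar(path):
    """Return the parsed sidecar payload, or None if the file is missing."""
    sidecar = Path(path)
    if not sidecar.is_file():
        return None
    return json.loads(sidecar.read_text())


def evaluate_source_weight_gate(
    payload,
    support_sources=("puf_support",),  # hypothetical source-class labels
    max_support_share=0.5,             # hypothetical threshold
    experimental=False,
):
    """Evaluate a hypothetical `source_weight_diagnostics` gate."""
    result = {"gate": "source_weight_diagnostics"}
    if payload is None:
        # A missing sidecar must never count as a pass.
        result.update(status="unmeasured", reason="sidecar missing")
        return result

    by_source = payload.get("by_source", {})
    support_share = sum(
        by_source.get(source, {}).get("weight_share", 0.0)
        for source in support_sources
    )
    dominated = support_share > max_support_share
    result.update(
        support_weight_share=support_share,
        experimental_override=dominated and experimental,
        status="fail" if dominated and not experimental else "pass",
    )
    return result


# Hypothetical payloads mirroring the two test cases in the criteria.
balanced = {
    "by_source": {
        "cps_asec": {"weight_share": 0.7},
        "puf_support": {"weight_share": 0.3},
    }
}
dominated = {
    "by_source": {
        "cps_asec": {"weight_share": 0.2},
        "puf_support": {"weight_share": 0.8},
    }
}
```

Separating `load_sidecar` from the evaluation keeps the tests free of file I/O: the passing and failing payloads can be plain dicts, and the missing-sidecar case is just `None`.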

Related us-data precedent: https://github.com/PolicyEngine/policyengine-us-data/pull/1151

