Add snapshot golden tests#1446
Conversation
db6f7fb to
b5a0b52
Compare
e8bcea9 to
0115ba2
Compare
22a0cff to
0658445
Compare
There was a problem hiding this comment.
Pull request overview
Adds snapshot “golden” testing and ABI tripwires to detect breaking changes in the on-disk guest↔host snapshot format, with CI automation to pull/regenerate published baselines from GHCR and validate cross-CPU portability.
Changes:
- Introduces a custom-harness
snapshot_goldenstest binary (libtest-mimic) plus fixtures/checks for snapshot load/round-trip validation against staged OCI image layouts. - Adds compile-time snapshot ABI tripwires pinning key format-defining constants and struct/layout offsets.
- Extends CI/Justfile tooling to pull/publish/regenerate goldens via GHCR (oras), including a
regen-goldenslabel-controlled path and cross-CPU verification.
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/tests/rust_guests/simpleguest/src/main.rs | Adds guest/host echo and round-trip functions plus heap-pattern helpers used by golden fixtures/checks. |
| src/hyperlight_host/tests/snapshot_goldens/platform.rs | Detects platform (hypervisor/cpu/profile) and builds golden tag names for staging/verification. |
| src/hyperlight_host/tests/snapshot_goldens/oci.rs | Locates the staged OCI image-layout directories under target/snapshot-goldens/. |
| src/hyperlight_host/tests/snapshot_goldens/main.rs | Implements the custom harness entrypoint for verify and generate subcommands. |
| src/hyperlight_host/tests/snapshot_goldens/goldens_version.rs | Defines GOLDENS_VERSION and compat version set for multi-major verification. |
| src/hyperlight_host/tests/snapshot_goldens/fixtures.rs | Defines canonical sandbox config + deterministic mutation sequence for golden generation. |
| src/hyperlight_host/tests/snapshot_goldens/checks.rs | Adds independent checks that load goldens and validate captured state + ABI wire formats. |
| src/hyperlight_host/tests/integration_test.rs | Adjusts heap sizing in a couple tests to account for added guest-function registration overhead. |
| src/hyperlight_host/src/sandbox/snapshot/tripwires.rs | Adds const-eval assertions pinning ABI-critical constants, offsets, and discriminants. |
| src/hyperlight_host/src/sandbox/snapshot/mod.rs | Wires the new tripwires module into the snapshot module. |
| src/hyperlight_host/src/sandbox/snapshot/file/mod.rs | Exposes snapshot format constants to support tripwires. |
| src/hyperlight_host/src/sandbox/snapshot/file/media_types.rs | Tightens docs/visibility for snapshot media-type and ABI version constants. |
| src/hyperlight_host/src/sandbox/snapshot/file/config.rs | Adds targeted load-validation tests and JSON schema “pinned” round-trip tests. |
| src/hyperlight_host/src/sandbox/snapshot/file_tests.rs | Adds tests for from_snapshot config behavior and init-data permission round-tripping. |
| src/hyperlight_host/Cargo.toml | Adds libtest-mimic and registers the snapshot_goldens custom harness test target. |
| Justfile | Adds snapshot-goldens pull/generate/verify recipes and ensures clippy covers the custom harness target. |
| docs/snapshot-versioning.md | Documents snapshot format versioning, enforcement mechanisms, and regen/publish workflows. |
| docs/github-labels.md | Documents the regen-goldens workflow-affecting label. |
| Cargo.lock | Locks new dependencies introduced for the golden test harness. |
| .github/workflows/ValidatePullRequest.yml | Adds label-driven mode selection and introduces cross-CPU verification for regen path; disables fail-fast. |
| .github/workflows/RegenSnapshotGoldens.yml | Adds a workflow to generate/publish versioned golden OCI layouts (with completion marker) to GHCR. |
| .github/workflows/dep_snapshot_cross_verify.yml | Adds reusable workflow to verify regenerated goldens across CPU vendors. |
| .github/workflows/dep_build_test.yml | Adds pull-vs-regen golden verification steps (oras + just recipes) and uploads regen artifacts. |
danbugs
left a comment
There was a problem hiding this comment.
Mostly LGTM, just 2 minor comments--great work!! Let's just get 1 more person to look at this as it is a pretty big PR.
syntactically
left a comment
There was a problem hiding this comment.
I had time only for a pretty quick skim over this.
add compile-time tripwires that pin every value defining the format (SNAPSHOT_ABI_VERSION, the media-type strings, the OCI layout version, HyperlightPEB field offsets and size, and the OutBAction ports), helping develpers catch
Why are the PEB fields important? I think the are used only during creation of the original snapshot from a binary? If they are not important I would like to be careful not to constrain on them for no reason.
make sure snapshots taken on AMD are usable on Intel, and vice versa
Are we certain already that this is a goal? I thought we purposefully punted on this and compatibility across hypervisors until we had a better sense of whether it would be onerous.
2219634 to
9ed1e1b
Compare
You're right, only the size of the PEB struct matters, the fields themselves are never read back by the host.
Addressed in #1587 |
c9b2a9d to
5e0d25e
Compare
simongdavies
left a comment
There was a problem hiding this comment.
I've not fully reviewed this but I have a couple of suggestions:
Can we move all the login that is embedded as script in the PR workflows into Justfile recipes and/or scripts so that they can easily beused/testd locally?
Do we want to run this on every PR? Can we run nightly and/or on release and then open a release blocker issue if it fails? We might also want running this as a gate to creating a new release?
5455de1 to
1a30967
Compare
It should be very lightweight on each PR, it's just a couple of small tests. I think it's worth it to catch early rather than to discover after the fact |
1a30967 to
bd27c5a
Compare
|
Pushed some updates. I'd like feedback on the following
|
bd27c5a to
571a802
Compare
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
571a802 to
bfdc6f1
Compare
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
OK maybe we should add cache entries int the job for the golden images? or perhaps if add the oras cli install to the image then we should pull all the OCIs onto the CI image that way we dont waste time each run waiting for the pulls? Also means that if they change the changes will still get pulled and everything will work. |
Maybe but perhaps we should add the tools as pre-reqs in the docs and then list the pull command/script needed in the docs also? |
Urghhh , maybe we can make this debug only and no-op the golden tests for release profile?? |
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
I am not worried about this, the images are really very small and should pull super quick. If we really want to, we could do this in a follow-up. Installing oras into the image is a good idea though.
Added to docs in snapshot-versioning.md
Moved to |
This PR adds tests to help notice breaking changes in the guest<->host ABI. This matters because any change there breaks every snapshot already persisted to disk (bad!).
When a change to the format is intentional, the author bumps the version and applies the
regen-goldenslabel. A workflow then regenerates the goldens against the branch and pushes the new set to GHCR.See docs/snapshot-versioning.md for details.