Fix SPDX 3 roundtrip: preserve elements, type prefixes, property names #172

Merged
vpetersson merged 3 commits into master from spdx3-fix on Feb 24, 2026
@vpetersson commented Feb 24, 2026

Summary

Fixes three critical data loss issues in the SPDX 3 JSON-LD parse→modify→write roundtrip:

  • Elements dropped: Only 8 element types were parsed; all others (security, build, licensing, lifecycle — 57K of 62K elements in a Yocto SBOM) were silently discarded. Now preserved as passthrough elements.
  • Type prefixes stripped: software_Package became Package in output, causing the sbomify API to reject the SBOM with "No packages found". Now restored on write.
  • Property names wrong: Parser only read unprefixed names (packageVersion) but real SBOMs use the spec-correct software_-prefixed form (software_packageVersion). All 325 packages in a Yocto SBOM had zero fields populated (version, download location, homepage, purpose all lost). Now accepts both forms on read and emits prefixed on write.
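
The "accept both forms on read, emit prefixed on write" behavior can be sketched as below. This is a minimal illustration working on plain dicts, not the code from sbomify_action/spdx3.py; the helper names `get_software_prop` and `set_software_prop` are hypothetical.

```python
from typing import Any, Optional

# Prefix used by software-profile properties, as described in this PR.
SOFTWARE_PREFIX = "software_"


def get_software_prop(element: dict[str, Any], name: str) -> Optional[Any]:
    """Read a property, preferring the prefixed form and falling back to the bare one."""
    prefixed = SOFTWARE_PREFIX + name
    if prefixed in element:
        return element[prefixed]
    return element.get(name)


def set_software_prop(element: dict[str, Any], name: str, value: Any) -> None:
    """Write the property under its prefixed name and drop any unprefixed copy."""
    element.pop(name, None)
    element[SOFTWARE_PREFIX + name] = value


# Usage: both input spellings resolve to the same value.
yocto_pkg = {"type": "software_Package", "software_packageVersion": "1.2.3"}
legacy_pkg = {"type": "Package", "packageVersion": "1.2.3"}
print(get_software_prop(yocto_pkg, "packageVersion"))
print(get_software_prop(legacy_pkg, "packageVersion"))
```

Preferring the prefixed key on read means a document that happens to carry both spellings resolves to the spec-correct one.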

Also fixes: @type→type / @id→spdxId key normalization, standardName property name, SoftwareAgent type handling.

Closes #170

Test plan

  • 45 SPDX3 unit tests pass (13 new tests added)
  • Full suite: 1722 passed, 0 failures
  • Lint + format clean
  • Verified with real 62K-element Yocto SBOM: all element types preserved, all package fields populated, correct property names in output

🤖 Generated with Claude Code

…roperty names

The parse→modify→write roundtrip for SPDX 3 JSON-LD had three critical issues
causing data loss and API rejection:

1. Elements dropped: Only 8 element types were parsed; all others (security,
   build, licensing, lifecycle) were silently discarded. Now preserved as
   passthrough elements that survive the roundtrip verbatim.

2. Type prefixes stripped: spdx_tools converter outputs "Package" but the
   SPDX 3.0.1 context requires "software_Package". The sbomify API rejected
   these with "No packages found". Now restored on write.

3. Property names wrong: Parser only read unprefixed names (packageVersion)
   but real SBOMs (Yocto/OpenEmbedded) use the spec-correct software_-prefixed
   form (software_packageVersion). All 325 packages in a Yocto SBOM had zero
   fields populated. Now accepts both forms on read and emits prefixed on write.
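
As a rough illustration of the passthrough idea in item 1, the sketch below splits a JSON-LD graph into elements a parser models and elements it carries through untouched. The `KNOWN_TYPES` set and both function names are assumptions made for this example; the actual list of parsed types lives in sbomify_action/spdx3.py and is not reproduced here.

```python
from typing import Any

# Hypothetical set of types the parser models explicitly (illustrative only).
KNOWN_TYPES = {"software_Package", "Package", "Relationship", "SpdxDocument"}

Element = dict[str, Any]


def split_graph(graph: list[Element]) -> tuple[list[Element], list[Element]]:
    """Partition elements into (modeled, passthrough) without altering them."""
    modeled: list[Element] = []
    passthrough: list[Element] = []
    for element in graph:
        # Accept either key spelling for the element type.
        element_type = element.get("type", element.get("@type"))
        if element_type in KNOWN_TYPES:
            modeled.append(element)
        else:
            passthrough.append(element)
    return modeled, passthrough


def merge_graph(modeled: list[Element], passthrough: list[Element]) -> list[Element]:
    """Rebuild the output graph: modeled elements first, then passthrough ones."""
    return [*modeled, *passthrough]
```

Keeping unrecognized elements in a separate list, instead of skipping them, is what lets them survive a parse→modify→write cycle.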

Also fixes: @type→type and @id→spdxId key normalization, standardName property
name, SoftwareAgent type handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 24, 2026 09:21

Copilot AI left a comment

Pull request overview

This PR fixes critical data loss issues in the SPDX 3 JSON-LD parse→modify→write roundtrip, enabling proper handling of large-scale SBOMs (e.g., 62K-element Yocto SBOMs). The changes implement passthrough element preservation, type prefix restoration, and correct property naming for software-profile properties.

Changes:

  • Implemented passthrough element storage for unrecognized SPDX 3 element types (security, build, licensing, lifecycle, etc.) to prevent data loss during roundtrip
  • Added type prefix restoration logic to convert model class names back to JSON-LD type names (e.g., Package → software_Package)
  • Enhanced parser to accept both unprefixed and software_-prefixed property names, and writer to output spec-correct prefixed forms
  • Added 13 new comprehensive tests covering passthrough preservation, type prefixes, property naming, and roundtrip integrity
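
The type prefix restoration described above might look roughly like this. The mapping is illustrative and limited to a few software-profile types; the real mapping in sbomify_action/spdx3.py may cover more, and the name `restore_type_prefix` is hypothetical.

```python
from typing import Any

# Illustrative mapping from bare class names to prefixed JSON-LD type names.
TYPE_PREFIXES = {
    "Package": "software_Package",
    "File": "software_File",
    "Snippet": "software_Snippet",
    "Sbom": "software_Sbom",
}


def restore_type_prefix(element: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the element with its type restored to the prefixed form."""
    restored = dict(element)
    bare_type = restored.get("type")
    if bare_type in TYPE_PREFIXES:
        restored["type"] = TYPE_PREFIXES[bare_type]
    return restored
```

Types that are absent from the mapping, including ones that already carry a prefix, pass through unchanged, so applying the function twice is harmless.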

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| sbomify_action/spdx3.py | Core implementation: added SoftwareAgent import, type/property mappings, passthrough element handling in parser, normalization logic in writer |
| tests/test_spdx3.py | Added two test classes (TestPassthroughAndTypePrefixes, TestSoftwarePrefixedProperties) with 13 new tests verifying roundtrip preservation and property handling |
| tests/test-data/spdx3_multi_type.json | New test fixture with diverse SPDX 3 element types (security, build, licensing, etc.) to validate passthrough and property naming |

- Add _normalize_passthrough_element() to normalize @type→type on
  passthrough elements while preserving blank-node @id values
- Add standard→standardName rename in _normalize_serialized_element()
- Add explicit test assertions for blank-node @id preservation
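
A minimal sketch of the two normalization steps named in this commit, assuming elements are plain dicts. The functions below are simplified stand-ins for `_normalize_serialized_element()` and `_normalize_passthrough_element()`, not their actual bodies. In particular, leaving every `@id` untouched on passthrough elements is an assumption that trivially satisfies the blank-node requirement.

```python
from typing import Any

# Key renames applied to elements produced by the serializer (per the PR text).
SERIALIZED_RENAMES = {"@type": "type", "@id": "spdxId", "standard": "standardName"}


def normalize_serialized(element: dict[str, Any]) -> dict[str, Any]:
    """Rename @type, @id, and standard; all other keys are kept as they are."""
    return {SERIALIZED_RENAMES.get(key, key): value for key, value in element.items()}


def normalize_passthrough(element: dict[str, Any]) -> dict[str, Any]:
    """Rename only @type, so blank-node @id values such as '_:b0' survive."""
    return {("type" if key == "@type" else key): value for key, value in element.items()}
```

Both functions build a new dict in the original key order, so the caller's input is never mutated.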

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

- Replace dynamic _passthrough_elements attribute with explicit
  Spdx3Payload subclass that defines passthrough_elements field
- Update _parse_agent() return type to include SoftwareAgent
- Update parse_spdx3_file/parse_spdx3_data return types
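
The move from a dynamically attached attribute to an explicit field can be sketched as follows. The `Payload` base class here is a stand-in defined only so the example runs on its own; which class the real `Spdx3Payload` extends is not stated in this PR text.

```python
from typing import Any


class Payload:
    """Stand-in base class (assumption); not the real library payload class."""

    def __init__(self) -> None:
        self.elements: dict[str, Any] = {}


class Spdx3Payload(Payload):
    """Payload that declares passthrough storage as a typed, always-present field."""

    def __init__(self) -> None:
        super().__init__()
        # Raw JSON-LD dicts for element types the parser does not model.
        self.passthrough_elements: list[dict[str, Any]] = []


payload = Spdx3Payload()
payload.passthrough_elements.append({"type": "security_Vulnerability"})
print(len(payload.passthrough_elements))
```

Declaring the field in `__init__` gives every instance its own list and lets type checkers see the attribute, which a `setattr`-style dynamic attribute does not.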

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

@vpetersson merged commit 4900a74 into master on Feb 24, 2026
15 checks passed

Successfully merging this pull request may close these issues.

SPDX 3 augmentation strips type prefixes and drops elements