Fix SPDX 3 roundtrip: preserve elements, type prefixes, property names #172
vpetersson merged 3 commits into master
…roperty names

The parse→modify→write roundtrip for SPDX 3 JSON-LD had three critical issues causing data loss and API rejection:

1. Elements dropped: Only 8 element types were parsed; all others (security, build, licensing, lifecycle) were silently discarded. Now preserved as passthrough elements that survive the roundtrip verbatim.
2. Type prefixes stripped: spdx_tools converter outputs "Package" but the SPDX 3.0.1 context requires "software_Package". The sbomify API rejected these with "No packages found". Now restored on write.
3. Property names wrong: Parser only read unprefixed names (packageVersion) but real SBOMs (Yocto/OpenEmbedded) use the spec-correct software_-prefixed form (software_packageVersion). All 325 packages in a Yocto SBOM had zero fields populated. Now accepts both forms on read and emits prefixed on write.

Also fixes: @type→type and @id→spdxId key normalization, standardName property name, SoftwareAgent type handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
This PR fixes critical data loss issues in the SPDX 3 JSON-LD parse→modify→write roundtrip, enabling proper handling of large-scale SBOMs (e.g., 62K-element Yocto SBOMs). The changes implement passthrough element preservation, type prefix restoration, and correct property naming for software-profile properties.
Changes:
- Implemented passthrough element storage for unrecognized SPDX 3 element types (security, build, licensing, lifecycle, etc.) to prevent data loss during roundtrip
- Added type prefix restoration logic to convert model class names back to JSON-LD type names (e.g., Package→software_Package)
- Enhanced parser to accept both unprefixed and software_-prefixed property names, and writer to output spec-correct prefixed forms
- Added 13 new comprehensive tests covering passthrough preservation, type prefixes, property naming, and roundtrip integrity
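The prefix restoration and dual-form property reads listed above can be sketched roughly as follows. This is an illustration only, not the code in sbomify_action/spdx3.py: the function names `restore_type_prefix` and `read_software_property` are hypothetical, and the set of software-profile types is an assumption (only Package is named in this PR).

```python
# Assumed set of model class names that belong to the software profile.
# Only "Package" is confirmed by the PR text; the others are illustrative.
SOFTWARE_PROFILE_TYPES = {"Package", "File", "Snippet", "Sbom"}


def restore_type_prefix(type_name):
    """Map a model class name back to its JSON-LD type name."""
    if type_name in SOFTWARE_PROFILE_TYPES:
        return "software_" + type_name
    # Core-profile types (e.g. Person) are written without a prefix.
    return type_name


def read_software_property(element, name, default=None):
    """Read a software-profile property, accepting both spellings.

    The spec-correct prefixed form wins when both are present.
    """
    prefixed = "software_" + name
    if prefixed in element:
        return element[prefixed]
    return element.get(name, default)
```

Reading the prefixed key first means a Yocto-style element and an older unprefixed element both yield the same parsed value, while the writer can always emit the prefixed form.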
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sbomify_action/spdx3.py | Core implementation: added SoftwareAgent import, type/property mappings, passthrough element handling in parser, normalization logic in writer |
| tests/test_spdx3.py | Added two test classes (TestPassthroughAndTypePrefixes, TestSoftwarePrefixedProperties) with 13 new tests verifying roundtrip preservation and property handling |
| tests/test-data/spdx3_multi_type.json | New test fixture with diverse SPDX 3 element types (security, build, licensing, etc.) to validate passthrough and property naming |
- Add _normalize_passthrough_element() to normalize @type→type on passthrough elements while preserving blank-node @id values
- Add standard→standardName rename in _normalize_serialized_element()
- Add explicit test assertions for blank-node @id preservation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
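A minimal sketch of the passthrough normalization described in this commit, under stated assumptions: the helper name below is hypothetical (the real one is the private `_normalize_passthrough_element()`), blank nodes are detected by the standard JSON-LD `_:` prefix, and renaming a non-blank `@id` to `spdxId` is assumed from the PR's general key normalization rather than confirmed for passthrough elements.

```python
def normalize_passthrough_element(element):
    """Return a copy of a raw JSON-LD element with normalized keys.

    - "@type" is renamed to "type".
    - A blank-node "@id" (value starting with "_:") is kept as-is.
    - Any other "@id" is renamed to "spdxId" (assumption, see lead-in).
    All remaining keys are copied through untouched.
    """
    normalized = {}
    for key, value in element.items():
        if key == "@type":
            normalized["type"] = value
        elif key == "@id" and not str(value).startswith("_:"):
            normalized["spdxId"] = value
        else:
            normalized[key] = value
    return normalized
```

Returning a new dict rather than mutating the input keeps the parsed source data intact, which makes roundtrip assertions in tests easier to write.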
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
- Replace dynamic _passthrough_elements attribute with explicit Spdx3Payload subclass that defines passthrough_elements field
- Update _parse_agent() return type to include SoftwareAgent
- Update parse_spdx3_file/parse_spdx3_data return types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
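The move from a dynamically attached attribute to an explicit subclass field might look something like the sketch below. The `Payload` class here is a stand-in defined only for illustration (the real base class comes from spdx_tools and its constructor may differ), and the field type is an assumption.

```python
from typing import Any, Dict, List


class Payload:
    """Stand-in for the spdx_tools payload class (illustrative only)."""

    def __init__(self):
        self.elements = {}


class Spdx3Payload(Payload):
    """Payload that also carries elements the parser does not model."""

    def __init__(self):
        super().__init__()
        # Declared up front so type checkers and readers can see the field,
        # instead of setting payload._passthrough_elements after creation.
        self.passthrough_elements: List[Dict[str, Any]] = []
```

Declaring the field in `__init__` means every payload instance has it, so the writer does not need a `getattr(..., default)` guard when serializing.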
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
Summary
Fixes three critical data loss issues in the SPDX 3 JSON-LD parse→modify→write roundtrip:
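1. Elements dropped: Only 8 element types were parsed; all others (security, build, licensing, lifecycle) were silently discarded. Now preserved as passthrough elements that survive the roundtrip verbatim.
2. Type prefixes stripped: software_Package became Package in output, causing the sbomify API to reject it with "No packages found". Now restored on write.
3. Property names wrong: Parser only read unprefixed names (packageVersion) but real SBOMs use the spec-correct software_-prefixed form (software_packageVersion). All 325 packages in a Yocto SBOM had zero fields populated (version, download location, homepage, purpose all lost). Now accepts both forms on read and emits prefixed on write.

Also fixes: @type→type / @id→spdxId key normalization, standardName property name, SoftwareAgent type handling.

Closes #170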
Test plan
🤖 Generated with Claude Code