@Jayant-kernel

Problem

When scanning Nixpkgs packages, ScanCode was reporting deprecated SPDX
identifiers such as GPL-2.0 instead of the current GPL-2.0-only. This
caused mismatches between declared and detected licenses.

What I Did

Added a simple mapping function to normalize deprecated SPDX identifiers
to their current versions. The normalization happens automatically during
license detection data cleanup.

Changes

  • Added normalize_spdx_identifier() in scanpipe/pipes/scancode.py
  • Integrated it into the license data cleaning flow
  • Added tests to verify the mappings work correctly
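
For readers without the diff at hand, the normalization described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `DEPRECATED_SPDX_IDS` name, the stripping of whitespace, and the pass-through behavior for unknown identifiers are assumptions, and the table holds only the three pairs listed under Testing.

```python
# Hypothetical lookup table; only these three pairs are named in the PR text.
DEPRECATED_SPDX_IDS = {
    "GPL-2.0": "GPL-2.0-only",
    "GPL-3.0": "GPL-3.0-only",
    "LGPL-2.1": "LGPL-2.1-only",
}


def normalize_spdx_identifier(identifier):
    """Return the current SPDX identifier for a deprecated one.

    Identifiers that are not in the table are returned unchanged
    (assumed behavior), so already-current values pass through untouched.
    """
    if not identifier:
        # Leave None and empty strings as they are.
        return identifier
    key = identifier.strip()
    return DEPRECATED_SPDX_IDS.get(key, key)
```

An exact dictionary lookup on the whole identifier, rather than a substring replacement, avoids rewriting `GPL-2.0-only` a second time.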

Testing

Tested normalization for:

  • GPL-2.0 → GPL-2.0-only
  • GPL-3.0 → GPL-3.0-only
  • LGPL-2.1 → LGPL-2.1-only
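
The checks listed above could be expressed along these lines. This is a sketch only: the stand-in `normalize_spdx_identifier` below is an assumed implementation so the example runs on its own, and the test class and `run_sketch_tests` helper are hypothetical names, not the tests added in this PR.

```python
import unittest

# Stand-in for the PR's function (assumed behavior: exact-match lookup,
# unknown identifiers returned unchanged).
_CURRENT_IDS = {
    "GPL-2.0": "GPL-2.0-only",
    "GPL-3.0": "GPL-3.0-only",
    "LGPL-2.1": "LGPL-2.1-only",
}


def normalize_spdx_identifier(identifier):
    return _CURRENT_IDS.get(identifier, identifier)


class NormalizeSpdxIdentifierTests(unittest.TestCase):
    def test_deprecated_identifiers_are_mapped(self):
        # Each deprecated identifier should map to its current form.
        for old, new in _CURRENT_IDS.items():
            with self.subTest(identifier=old):
                self.assertEqual(normalize_spdx_identifier(old), new)

    def test_current_identifiers_are_unchanged(self):
        # Already-current identifiers should not be rewritten.
        for current in ("GPL-2.0-only", "MIT"):
            with self.subTest(identifier=current):
                self.assertEqual(normalize_spdx_identifier(current), current)


def run_sketch_tests():
    """Run the two test methods above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(NormalizeSpdxIdentifierTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The second test guards against the regression where a current identifier such as `GPL-2.0-only` is normalized again.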

Fixes #1941

@Jayant-kernel Jayant-kernel force-pushed the fix-1941-improve-license-detection branch 3 times, most recently from 48366be to 66bc57a Compare January 29, 2026 13:29
The license detection was returning deprecated SPDX identifiers like
GPL-2.0 instead of the current GPL-2.0-only format. This caused
issues when scanning Nixpkgs packages.

Added a normalization function that maps old identifiers to current
ones. Integrated it into the license data cleaning flow so it works
automatically.

Fixes aboutcode-org#1941

Signed-off-by: Jayant Saxena <jayantmcom@gmail.com>
@Jayant-kernel Jayant-kernel force-pushed the fix-1941-improve-license-detection branch from 66bc57a to 5301c0f Compare January 29, 2026 13:32
@Jayant-kernel
Author

@DennisClark @AyanSinhaMahapatra @JonoYang

Please review the PR.

Development

Successfully merging this pull request may close these issues.

nixpkgs-clarity: Improve and debug ScanCode license detection based on bugs and inaccuracies found in Nixpkgs
