Commit d214084

Merge branch 'main' into dependabot/pip/urllib3-2.5.0
2 parents 4d50ad6 + b3928f1 commit d214084

197 files changed

Lines changed: 23703 additions & 24883 deletions

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+name: Build aboutcode.federated Python distributions and publish on PyPI
+
+on:
+  workflow_dispatch:
+  push:
+    tags:
+      - "aboutcode.federated/*"
+
+jobs:
+  build-and-publish:
+    name: Build and publish library to PyPI
+    runs-on: ubuntu-22.04
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: 3.11
+
+      - name: Install flot
+        run: python -m pip install flot --user
+
+      - name: Build binary wheel and source tarball
+        run: python -m flot --pyproject pyproject-aboutcode.federated.toml --sdist --wheel --output-dir dist/
+
+      - name: Publish to PyPI
+        if: startsWith(github.ref, 'refs/tags')
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          password: ${{ secrets.PYPI_API_TOKEN_ABOUTCODE_FEDERATED }}
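
The workflow has two gates: the push trigger only fires for tags matching ``aboutcode.federated/*``, and the publish step only runs when the ref is a tag, so a manual ``workflow_dispatch`` run from a branch builds the distributions without publishing them. The Python sketch below mimics that logic for illustration only. The function names are hypothetical, and the pattern translation is a simplification that handles only a single ``*`` wildcard (which, in GitHub Actions filter patterns, does not match ``/``).

```python
import re

# Tag filter taken from the workflow's "on.push.tags" entry.
TAG_FILTER = "aboutcode.federated/*"


def filter_to_regex(pattern):
    """Translate a filter pattern to a regex (simplified: only '*' is handled).

    In GitHub Actions filter patterns, '*' matches zero or more characters
    but not '/'. Other wildcards such as '**' are NOT covered by this sketch.
    """
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "[^/]*".join(parts) + "$")


def push_trigger_matches(ref):
    """Return True if a pushed ref would match the workflow's tag filter."""
    prefix = "refs/tags/"
    if not ref.startswith(prefix):
        return False
    tag_name = ref[len(prefix):]
    return bool(filter_to_regex(TAG_FILTER).match(tag_name))


def publish_step_runs(ref):
    """Mirror the step condition: startsWith(github.ref, 'refs/tags')."""
    return ref.startswith("refs/tags")


if __name__ == "__main__":
    for ref in ("refs/tags/aboutcode.federated/v0.1.0", "refs/heads/main"):
        print(ref, push_trigger_matches(ref), publish_step_runs(ref))
```

Keeping the ``if:`` guard on the publish step means a manually dispatched run on a branch can still be used to check that the build succeeds.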

README.rst

Lines changed: 11 additions & 14 deletions
@@ -19,10 +19,10 @@ VulnerableCode
 
 
 VulnerableCode is a free and open database of open source software package
-vulnerabilities **because open source software vulnerabilities data and tools
+vulnerabilities **because open source software vulnerability data and tools
 should be free and open source themselves**:
 
-we are trying to change this and evolve the status quo in a few other areas!
+We are trying to change this and evolve the status quo in a few other areas!
 
 - Vulnerability databases have been **traditionally proprietary** even though they
   are mostly about free and open source software.
@@ -31,13 +31,13 @@ we are trying to change this and evolve the status quo in a few other areas!
   means a lot of false positive signals that require extensive expert reviews.
 
 - Vulnerability databases are also mostly about vulnerabilities first and software
-  package second, making it difficult to find if and when a vulnerability applies
-  to a piece of code. VulnerableCode focus is on software package first where
-  a Package URL is a key and natural identifier for packages; this is making it
+  packages second, making it difficult to find if and when a vulnerability applies
+  to a piece of code. VulnerableCode's focus is on software packages first where
+  a Package URL (PURL) is a key and natural identifier for packages; this makes it
   easier to find a package and whether it is vulnerable.
 
-Package URL themselves were designed first in ScanCode and VulnerableCode
-and are now a de-facto standard for vulnerability management and package references.
+PURLs were designed initially for ScanCode and VulnerableCode. PURL is
+now a de-facto standard for vulnerability management and package references.
 See https://github.com/package-url/purl-spec
 
 The VulnerableCode project is a FOSS community resource to help improve the
@@ -49,17 +49,14 @@ the database current.
 
 .. pull-quote::
     **Warning**
+    VulnerableCode is under active development and may not be ready for production
+    use depending on your use cases.
 
-    VulnerableCode is under active development and is not yet fully
-    usable.
+Read more about VulnerableCode at https://vulnerablecode.readthedocs.org/
 
-
-Read more about VulnerableCode https://vulnerablecode.readthedocs.org/
-
-VulnerableCode tech stack is Python, Django, PostgreSQL, nginx and Docker and
+The VulnerableCode tech stack is Python, Django, PostgreSQL, nginx and Docker and
 several libraries.
 
-
 Getting started
 ===============
 

aboutcode/federated/CHANGELOG.rst

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+Changelog
+=============
+
+
+v0.1.0 (October 20, 2025)
+---------------------------
+
+- Initial release of the ``aboutcode.federated`` library based on
+  original work in the ``aboutcode.hashid`` library.

aboutcode/federated/README.rst

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+aboutcode.federated
+===================
+
+This is a library of utilities to compute ids and file paths for AboutCode
+federated data based on Package URL (PURL).
+
+
+The goal of these federated data utilities is to handle content-defined and
+hash-addressable package data, keyed by PURL and stored in many Git repositories.
+This approach to federating decentralized data is called FederatedCode.
+
+
+Overview
+========
+
+The main design elements for these utilities are:
+
+1. **Data Federation**: A Data Federation is a database, representing a consistent,
+   non-overlapping set of data kind clusters (like scans, vulnerabilities or SBOMs)
+   across many package ecosystems, also known as PURL types.
+   A Federation is similar to a traditional database.
+
+2. **Data Cluster**: A Data Federation contains Data Clusters, where the purpose of
+   a Data Cluster is to store the data of a single kind (like scans) across multiple
+   PURL types. The cluster name is the data kind name and is used as the prefix for
+   repository names. A Data Cluster is akin to a table in a traditional database.
+
+3. **Data Repository**: A Data Cluster consists of one or more Git Data Repositories,
+   each storing datafiles of the cluster data kind and of one PURL type, spreading
+   the datafiles across multiple Data Directories. The repository name is
+   data-kind+PURL-type+hashid. A Repository is similar to a shard or tablespace in a
+   traditional database.
+
+4. **Data Directory**: In a Repository, a Data Directory contains the datafiles for
+   PURLs. The directory name is PURL-type+hashid.
+
+5. **Data File**: This is a Data File of the Data Cluster's Data Kind that is
+   stored in subdirectories structured after the PURL components::
+
+       namespace/name/version/qualifiers/subpath:
+
+   - Either at the level of a PURL name: namespace/name,
+   - Or at the PURL version level: namespace/name/version,
+   - Or at the PURL qualifiers+PURL subpath level.
+
+A Data File can be, for instance, a JSON scan results file or a list of PURLs in
+YAML.
+
+For example, a list of PURLs as a Data Kind would be stored at the name
+subdirectory level::
+
+    gem-0107/gem/random_password_generator/purls.yml
+
+Or a ScanCode scan as a Data Kind at the version subdirectory level::
+
+    gem-0107/npm/file/3.24.3/scancode.yml
+
+
+
+License
+-------
+
+Copyright (c) AboutCode and others. All rights reserved.
+
+SPDX-License-Identifier: Apache-2.0
+
+See https://github.com/aboutcode-org/vulnerablecode for support or download.
+
+See https://aboutcode.org for more information about AboutCode OSS projects.
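
The directory and file layout described in the new README can be sketched in Python as follows. This is an illustration under stated assumptions, not the ``aboutcode.federated`` API: the function name and its parameters are hypothetical, the hashid is taken as an input because its computation is not described here, and the ``-`` separator between PURL type and hashid is inferred from the ``gem-0107`` example.

```python
from pathlib import PurePosixPath


def sketch_datafile_path(purl_type, name, filename, hashid, namespace=None, version=None):
    """Build a Data File path from PURL components (hypothetical helper).

    Layout assumed from the README examples:
        <type>-<hashid>/<type>/[<namespace>/]<name>/[<version>/]<filename>
    The hashid is supplied by the caller; how it is derived is out of scope.
    """
    # Data Directory name: PURL type + hashid, joined with "-" (assumption).
    segments = [f"{purl_type}-{hashid}", purl_type]
    if namespace:
        segments.append(namespace)
    segments.append(name)
    if version:
        # Version-level Data Files (such as scans) get one more subdirectory.
        segments.append(version)
    segments.append(filename)
    return str(PurePosixPath(*segments))


if __name__ == "__main__":
    # Name-level Data File, matching the README's first example.
    print(sketch_datafile_path("gem", "random_password_generator", "purls.yml", hashid="0107"))
```

Passing ``version`` places the file one level deeper, as in the scan example; the hashid to use for any given PURL would come from the library itself.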
