Repository for managing RDF/SKOS vocabularies used in the NKR/CCMM ecosystem.
This repository provides:
- RDF/SKOS vocabulary storage
- Multi-format serialization support
- Validation workflows
- SKOS normalization/fixing workflows
- GitHub Actions automation
Supported formats:
- Turtle (`.ttl`)
- RDF/XML (`.rdf`)
- JSON-LD (`.jsonld`)
- N-Triples (`.nt`)
- Source XML (`.xml`) for conversion sources only
.
├── .github/
│ └── workflows/
│ ├── validate-skos.yml
│ ├── fix-skos.yml
│ └── convert-vocabularies.yml
│
├── scripts/
│ ├── validate_skos.sh
│ ├── fix_skos.sh
│ └── convert_vocabularies.py
│
├── vocabularies/
│ ├── access_rights/
│ ├── resource_types/
│ ├── languages-skos/
│ ├── filetypes-skos/
│ └── ...
│
└── README.md
Each vocabulary folder should contain equivalent serializations of the same vocabulary.
Example:
resource_types/
├── resource_types.rdf
├── resource_types.ttl
├── resource_types.jsonld
The conversion script uses priority-based source selection.
Priority order:
`.rdf` → `.ttl` → `.jsonld` → `.nt` → `.xml`
Meaning:
- If `.rdf` exists, it becomes the canonical source
- Otherwise `.ttl`
- Otherwise `.jsonld`
- Otherwise `.nt`
- Otherwise `.xml`
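The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `scripts/convert_vocabularies.py`; the names `SOURCE_PRIORITY` and `pick_canonical_source` are hypothetical, and the sketch assumes that any file in the folder with a matching extension is a candidate.

```python
from pathlib import Path
from typing import Optional

# Priority order as listed above; the highest-priority extension comes first.
SOURCE_PRIORITY = [".rdf", ".ttl", ".jsonld", ".nt", ".xml"]


def pick_canonical_source(vocab_dir: Path) -> Optional[Path]:
    """Return the first file matching the highest-priority extension, or None."""
    for ext in SOURCE_PRIORITY:
        # Assumption: any file with this extension counts as a candidate.
        matches = sorted(
            p for p in vocab_dir.iterdir() if p.is_file() and p.suffix == ext
        )
        if matches:
            return matches[0]
    return None
```

For a folder containing both `resource_types.ttl` and `resource_types.nt`, this sketch returns the `.ttl` file, because `.ttl` ranks above `.nt`.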
Generated formats:
| Source | Generated |
|---|---|
| `.rdf` | `.ttl`, `.jsonld` |
| `.ttl` | `.rdf`, `.jsonld` |
| `.jsonld` | `.rdf`, `.ttl` |
| `.nt` | `.rdf`, `.ttl`, `.jsonld` |
| `.xml` | `.rdf`, `.ttl`, `.jsonld` |
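The table follows a simple pattern: the generated set is always `.rdf`, `.ttl`, and `.jsonld`, minus the source format itself. A short Python sketch of that rule is shown here; `GENERATED_FORMATS` and `generated_for` are hypothetical names, not taken from the conversion script.

```python
from typing import List

# Formats that appear in the "Generated" column of the table above.
# .nt and .xml only ever appear as sources there.
GENERATED_FORMATS = (".rdf", ".ttl", ".jsonld")


def generated_for(source_ext: str) -> List[str]:
    """List the extensions to generate for a given canonical source extension."""
    return [ext for ext in GENERATED_FORMATS if ext != source_ext]
```

Under this rule a `.ttl` source yields `.rdf` and `.jsonld`, while a `.nt` or `.xml` source yields all three.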
Script:
scripts/validate_skos.sh
Purpose:
- RDF syntax validation using Apache Jena `riot`
- SKOS validation using `skosify`
Supported validation formats:
`.ttl`, `.rdf`, `.jsonld`, `.json-ld`, `.nt`
Regular XML source files are NOT validated.
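The extension filter described above can be illustrated in Python. The actual validation is done by the shell script `scripts/validate_skos.sh`; this sketch only shows the file-selection rule, and the names `VALIDATED_EXTENSIONS` and `files_to_validate` are hypothetical.

```python
from pathlib import Path
from typing import List

# Extensions listed above as supported by validation.
# Plain .xml is deliberately absent, so source XML files are skipped.
VALIDATED_EXTENSIONS = {".ttl", ".rdf", ".jsonld", ".json-ld", ".nt"}


def files_to_validate(root: Path) -> List[Path]:
    """Recursively collect files whose extension is in the validated set."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in VALIDATED_EXTENSIONS
    )
```

Calling `files_to_validate(Path("vocabularies"))` would return every RDF serialization under that directory and leave files such as `iana-media-types.xml` out.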
Validate all vocabularies:
./scripts/validate_skos.sh vocabularies
Validate one vocabulary:
./scripts/validate_skos.sh vocabularies/resource_types
Script:
scripts/fix_skos.sh
Purpose:
- Normalize SKOS vocabularies
- Remove redundant hierarchy relations
- Clean labels
- Apply skosify cleanup rules
Fix all vocabularies:
./scripts/fix_skos.sh vocabularies
After fixing, validate again:
./scripts/validate_skos.sh vocabularies
Script:
scripts/convert_vocabularies.py
Purpose:
- Convert vocabularies between RDF serializations
- Auto-select canonical source format by priority
- Generate missing formats
Convert all vocabularies:
python scripts/convert_vocabularies.py vocabularies --overwrite
Convert one vocabulary:
python scripts/convert_vocabularies.py vocabularies/resource_types --overwrite
Workflow:
.github/workflows/validate-skos.yml
Run:
GitHub → Actions → Validate SKOS → Run workflow
Workflow:
.github/workflows/fix-skos.yml
Run:
GitHub → Actions → Fix SKOS → Run workflow
Workflow:
.github/workflows/convert-vocabularies.yml
Run:
GitHub → Actions → Convert Vocabularies → Run workflow
There are two XML cases.
RDF/XML, a supported RDF serialization:
file.rdf
Handled by:
- validation
- fixing
- conversion
Regular XML source files, for example:
iana-media-types.xml
These are NOT RDF/XML.
They are source data and require dedicated XML → SKOS conversion logic.
They are intentionally excluded from validation/fixing workflows.
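The repository separates the two cases by file extension (`.rdf` versus `.xml`). If a content-based check is ever useful, one possible heuristic is to look at the root element: RDF/XML documents normally start with `rdf:RDF`. The sketch below is an assumption for illustration, not part of the existing scripts, and `is_rdf_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of the RDF syntax vocabulary, used by the rdf:RDF root element.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def is_rdf_xml(path: str) -> bool:
    """Heuristic: return True if the document's root element is rdf:RDF."""
    try:
        with open(path, "rb") as handle:
            # The first "start" event is the root element, so we can stop there.
            for _event, element in ET.iterparse(handle, events=("start",)):
                return element.tag == f"{{{RDF_NS}}}RDF"
    except ET.ParseError:
        # Not well-formed XML at all.
        return False
    return False
```

A file that fails this check but still parses as XML falls into the second case and would need dedicated XML → SKOS conversion logic.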