From a204fcc827219e8e74e8d51b26ef8e2fccf8db55 Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Mon, 8 Dec 2025 13:45:56 +0100
Subject: [PATCH 1/6] feat: :sparkles: post on publishing `check-datapackage`

---
 posts/published-check-datapackage/index.qmd | 140 ++++++++++++++++++++
 1 file changed, 140 insertions(+)
 create mode 100644 posts/published-check-datapackage/index.qmd

diff --git a/posts/published-check-datapackage/index.qmd b/posts/published-check-datapackage/index.qmd
new file mode 100644
index 0000000..bf6dd9c
--- /dev/null
+++ b/posts/published-check-datapackage/index.qmd
@@ -0,0 +1,140 @@
+---
+title: "First published release of `check-datapackage`!"
+description: "We've published our second Python package. :tada: :grin: This package checks that a Data Package is compliant with its specification."
+author:
+- Luke W. Johnston
+date: "2025-12-08"
+categories:
+  - packaging
+  - publishing
+  - programming
+---
+
+On November 27th, 2025, we published our second Python package to
+[PyPI](https://pypi.org/project/check-datapackage). This package forms
+the basis for ensuring that any metadata we create or edit for a [Data
+Package](https://decisions.seedcase-project.org/why-frictionless-data/)
+is correct and compliant with the [Data Package
+standard](https://datapackage.org). And since we are and will be working
+with and managing many Data Packages over the coming years, this is an
+important tool for us to have!
+
+## What's `check-datapackage`?
+
+As with all our packages and software tools, we have a dedicated website
+for
+[`check-datapackage`](https://check-datapackage.seedcase-project.org).
+So, rather than repeat what is already in that website, this post gives
+a very quick overview of what it is and why you might want to use it. It
+can be summarised by its tagline:
+
+> Ensure the compliance of your Data Package metadata
+
+The "only" thing it does is checks the content of a `datapackage.json`
+file against the standard. Nothing fancy.
But we designed it to be
+configurable, so that if you have specific needs for your Data Package,
+you can adjust the checks accordingly. For example, if you want to
+ensure that certain fields are always present in the metadata, you can
+set up the checks to enforce that.
+
+For now, `check-datapackage` is only a few Python functions and classes
+that you can use within your own Python scripts. But in the future, we
+plan to develop a command-line interface (CLI) so that you can use it
+directly from your terminal without needing to write any code. Along
+with including a config file, we hope to incorporate `check-datapackage`
+into typical build tools or automated check workflows.
+
+## Why use it?
+
+We wanted this package to be incredibly simple and focused in its scope.
+If you install or use it, you know exactly what it does. It also doesn't
+include extra dependencies or features that you might not need. We
+wanted it lightweight and easy to use.
+
+While there are a few tools that provide some type of checks of Data
+Packages, such as the
+[frictionless-py](https://pypi.org/project/frictionless/) package, we
+didn't want all the extras that came with these packages. Nor are these
+tools easy to configure for our needs. In this regard, there were no
+tools available that fit ours needs. So we built our own package that
+does exactly what we need. And hopefully it might be useful for you too!
+
+Eventually, when we develop `check-datapackage` as a CLI, you could
+include it as a [pre-commit hook](https://pre-commit.com/) or part of
+your [continuous
+integration](https://docs.github.com/en/actions/automating-builds-and-tests/about-continuous-integration)
+workflow so that every time you make changes to your Data Package
+metadata, it is automatically checked for compliance. That way, you will
+always know that everything is good with your Data Package metadata. At
+least, good according to the standard and your specific needs!
+
+### Example use
+
+We have a detailed
+[guide](https://check-datapackage.seedcase-project.org/docs/guide/) on
+how to use `check-datapackage`. But I'll briefly show how you might use
+`check-datapackage`. The main function you would use is `check()`, which
+takes as input the properties of a Data Package (i.e., the contents of
+the `datapackage.json` file) as a Python dictionary.
+
+``` python
+import check_datapackage as cdp
+
+# Normally you'd read in the `datapackage.json` file, but we'll
+# show the actual contents here as a Python dict.
+properties = {
+    "name": "woolly-dormice",
+    "id": "123-abc-123",
+    "resources": [{
+        "name": "woolly-dormice-2015",
+        "path": "data.csv",
+        "schema": {"fields": [{
+            "name": "eye-colour",
+            "type": "string",
+        }]},
+    }],
+}
+
+cdp.check(properties)
+```
+
+At a minimum, a Data Package needs to have a `resources` property. So in
+this case, there are no issues with the Data Package. But if you were to
+remove the `resources` property, which is required, and run the check
+again, there would be an issue:
+
+``` python
+del properties["resources"]
+cdp.check(properties)
+```
+
+If you want these checks to be treated as an error, you set the
+parameter `error` to `True`:
+
+``` python
+cdp.check(properties, error=True)
+```
+
+If you wanted to exclude certain checks, you can do that by using the
+`Config` and `Exclusion` classes.
For example, if you wanted to ignore
+all required checks, you could do:
+
+``` python
+exclusion_required = cdp.Exclusion(type="required")
+config = cdp.Config(exclusions=[exclusion_required])
+cdp.check(properties=package_properties, config=config)
+```
+
+If you wanted the issues listed in a more human-friendly way, we have
+the `explain()` function that takes the list of issues returned by
+`check()` and formats them nicely:
+
+``` python
+issues = cdp.check(properties)
+cdp.explain(issues)
+```
+
+There's many other things you can configure in `check-datapackage`, so
+be sure to check out the
+[website](https://check-datapackage.seedcase-project.org) for more
+information!

From 9dcffef84bcce2db5d54dd56ab728d79e556f458 Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Mon, 8 Dec 2025 16:41:44 +0100
Subject: [PATCH 2/6] fix: :pencil2: edits from review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Signe Kirk Brødbæk
---
 posts/published-check-datapackage/index.qmd | 43 +++++++++++----------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/posts/published-check-datapackage/index.qmd b/posts/published-check-datapackage/index.qmd
index bf6dd9c..b6756da 100644
--- a/posts/published-check-datapackage/index.qmd
+++ b/posts/published-check-datapackage/index.qmd
@@ -12,12 +12,12 @@ categories:

 On November 27th, 2025, we published our second Python package to
 [PyPI](https://pypi.org/project/check-datapackage). This package forms
-the basis for ensuring that any metadata we create or edit for a [Data
+the basis for ensuring that any metadata created or edited for a [Data
 Package](https://decisions.seedcase-project.org/why-frictionless-data/)
 is correct and compliant with the [Data Package
-standard](https://datapackage.org). And since we are and will be working
+standard](https://datapackage.org).
Since we are and will be working
 with and managing many Data Packages over the coming years, this is an
-important tool for us to have!
+important tool for us to have! Generally, this will be a helpful tool for anyone working with and managing Data Packages.

 ## What's `check-datapackage`?

@@ -25,16 +25,16 @@ As with all our packages and software tools, we have a dedicated website
 for
 [`check-datapackage`](https://check-datapackage.seedcase-project.org).
 So, rather than repeat what is already in that website, this post gives
-a very quick overview of what it is and why you might want to use it. It
+a very quick overview of what this package does and why you might want to use it. It
 can be summarised by its tagline:

 > Ensure the compliance of your Data Package metadata

-The "only" thing it does is checks the content of a `datapackage.json`
+The "only" thing `check-datapackage` does is to check the content of a `datapackage.json`
 file against the standard. Nothing fancy. But we designed it to be
 configurable, so that if you have specific needs for your Data Package,
-you can adjust the checks accordingly. For example, if you want to
-ensure that certain fields are always present in the metadata, you can
+you can adjust the checks accordingly. It's possible to both add checks on top of the standard or ignore certain checks from the standard. For example, if you want to
+ensure that certain fields that aren't required by the standard are always present in the metadata, you can
 set up the checks to enforce that.

 For now, `check-datapackage` is only a few Python functions and classes
@@ -42,11 +42,11 @@ that you can use within your own Python scripts. But in the future, we
 plan to develop a command-line interface (CLI) so that you can use it
 directly from your terminal without needing to write any code. Along
 with including a config file, we hope to incorporate `check-datapackage`
-into typical build tools or automated check workflows.
+into typical build tools and automated check workflows.

 ## Why use it?

-We wanted this package to be incredibly simple and focused in its scope.
+We wanted this package to be incredibly simple and focused.
 If you install or use it, you know exactly what it does. It also doesn't
 include extra dependencies or features that you might not need. We
 wanted it lightweight and easy to use.
@@ -56,8 +56,8 @@ Packages, such as the
 [frictionless-py](https://pypi.org/project/frictionless/) package, we
 didn't want all the extras that came with these packages. Nor are these
 tools easy to configure for our needs. In this regard, there were no
-tools available that fit ours needs. So we built our own package that
-does exactly what we need. And hopefully it might be useful for you too!
+tools available that fit ours needs. So, we built our own package that
+does exactly what we need. Hopefully, it will be useful for other people too!

 Eventually, when we develop `check-datapackage` as a CLI, you could
 include it as a [pre-commit hook](https://pre-commit.com/) or part of
@@ -65,23 +65,24 @@ your [continuous
 integration](https://docs.github.com/en/actions/automating-builds-and-tests/about-continuous-integration)
 workflow so that every time you make changes to your Data Package
 metadata, it is automatically checked for compliance. That way, you will
-always know that everything is good with your Data Package metadata. At
-least, good according to the standard and your specific needs!
+always know that your Data Package metadata lives up to the standard and
+your configuration.

 ### Example use

 We have a detailed
 [guide](https://check-datapackage.seedcase-project.org/docs/guide/) on
 how to use `check-datapackage`. But I'll briefly show how you might use
-`check-datapackage`. The main function you would use is `check()`, which
+`check-datapackage`.
The main function of the package is `check()`, which
 takes as input the properties of a Data Package (i.e., the contents of
-the `datapackage.json` file) as a Python dictionary.
+the `datapackage.json` file) as a Python dictionary and checks it against the standard.

 ``` python
 import check_datapackage as cdp

 # Normally you'd read in the `datapackage.json` file, but we'll
-# show the actual contents here as a Python dict.
+# show the actual contents here as a Python dict. Can use
+# the `read_json()` helper function to read in `datapackage.json`
 properties = {
     "name": "woolly-dormice",
     "id": "123-abc-123",
@@ -115,9 +116,9 @@ parameter `error` to `True`:
 cdp.check(properties, error=True)
 ```

-If you wanted to exclude certain checks, you can do that by using the
-`Config` and `Exclusion` classes. For example, if you wanted to ignore
-all required checks, you could do:
+If you want to exclude certain checks, you can do that by using the
+`Config` and `Exclusion` classes. For example, if you want to exclude
+all required checks, you can define the exclusion, add it to the configuration, and pass it to the check function like so:

 ``` python
 exclusion_required = cdp.Exclusion(type="required")
@@ -125,7 +126,7 @@ config = cdp.Config(exclusions=[exclusion_required])
 cdp.check(properties=package_properties, config=config)
 ```

-If you wanted the issues listed in a more human-friendly way, we have
+If you want the issues listed in a more human-friendly way, you can use
 the `explain()` function that takes the list of issues returned by
 `check()` and formats them nicely:

@@ -134,7 +135,7 @@ issues = cdp.check(properties)
 cdp.explain(issues)
 ```

-There's many other things you can configure in `check-datapackage`, so
+There's many other checks you can configure with `check-datapackage`, so
 be sure to check out the
 [website](https://check-datapackage.seedcase-project.org) for more
 information!

From caaaf39506bd8b1de10039dc44c51160539e8ebb Mon Sep 17 00:00:00 2001
From: "Luke W.
Johnston"
Date: Mon, 8 Dec 2025 16:42:45 +0100
Subject: [PATCH 3/6] style: :art: reformat Markdown

---
 posts/published-check-datapackage/index.qmd | 38 ++++++++++++---------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/posts/published-check-datapackage/index.qmd b/posts/published-check-datapackage/index.qmd
index b6756da..ca479b2 100644
--- a/posts/published-check-datapackage/index.qmd
+++ b/posts/published-check-datapackage/index.qmd
@@ -17,7 +17,8 @@ Package](https://decisions.seedcase-project.org/why-frictionless-data/)
 is correct and compliant with the [Data Package
 standard](https://datapackage.org). Since we are and will be working
 with and managing many Data Packages over the coming years, this is an
-important tool for us to have! Generally, this will be a helpful tool for anyone working with and managing Data Packages.
+important tool for us to have! Generally, this will be a helpful tool
+for anyone working with and managing Data Packages.

 ## What's `check-datapackage`?

@@ -25,17 +26,19 @@ As with all our packages and software tools, we have a dedicated website
 for
 [`check-datapackage`](https://check-datapackage.seedcase-project.org).
 So, rather than repeat what is already in that website, this post gives
-a very quick overview of what this package does and why you might want to use it. It
-can be summarised by its tagline:
+a very quick overview of what this package does and why you might want
+to use it. It can be summarised by its tagline:

 > Ensure the compliance of your Data Package metadata

-The "only" thing `check-datapackage` does is to check the content of a `datapackage.json`
-file against the standard. Nothing fancy. But we designed it to be
-configurable, so that if you have specific needs for your Data Package,
-you can adjust the checks accordingly. It's possible to both add checks on top of the standard or ignore certain checks from the standard.
For example, if you want to
-ensure that certain fields that aren't required by the standard are always present in the metadata, you can
-set up the checks to enforce that.
+The "only" thing `check-datapackage` does is to check the content of a
+`datapackage.json` file against the standard. Nothing fancy. But we
+designed it to be configurable, so that if you have specific needs for
+your Data Package, you can adjust the checks accordingly. It's possible
+to both add checks on top of the standard or ignore certain checks from
+the standard. For example, if you want to ensure that certain fields
+that aren't required by the standard are always present in the metadata,
+you can set up the checks to enforce that.

 For now, `check-datapackage` is only a few Python functions and classes
 that you can use within your own Python scripts. But in the future, we
@@ -46,8 +49,8 @@ into typical build tools and automated check workflows.

 ## Why use it?

-We wanted this package to be incredibly simple and focused.
-If you install or use it, you know exactly what it does. It also doesn't
+We wanted this package to be incredibly simple and focused. If you
+install or use it, you know exactly what it does. It also doesn't
 include extra dependencies or features that you might not need. We
 wanted it lightweight and easy to use.

@@ -57,7 +60,8 @@ Packages, such as the
 didn't want all the extras that came with these packages. Nor are these
 tools easy to configure for our needs. In this regard, there were no
 tools available that fit ours needs. So, we built our own package that
-does exactly what we need. Hopefully, it will be useful for other people too!
+does exactly what we need. Hopefully, it will be useful for other people
+too!

 Eventually, when we develop `check-datapackage` as a CLI, you could
 include it as a [pre-commit hook](https://pre-commit.com/) or part of
@@ -73,9 +77,10 @@ your configuration.
 We have a detailed
 [guide](https://check-datapackage.seedcase-project.org/docs/guide/) on
 how to use `check-datapackage`. But I'll briefly show how you might use
-`check-datapackage`. The main function of the package is `check()`, which
-takes as input the properties of a Data Package (i.e., the contents of
-the `datapackage.json` file) as a Python dictionary and checks it against the standard.
+`check-datapackage`. The main function of the package is `check()`,
+which takes as input the properties of a Data Package (i.e., the
+contents of the `datapackage.json` file) as a Python dictionary and
+checks it against the standard.

 ``` python
 import check_datapackage as cdp
@@ -118,7 +123,8 @@ cdp.check(properties, error=True)

 If you want to exclude certain checks, you can do that by using the
 `Config` and `Exclusion` classes. For example, if you want to exclude
-all required checks, you can define the exclusion, add it to the configuration, and pass it to the check function like so:
+all required checks, you can define the exclusion, add it to the
+configuration, and pass it to the check function like so:

 ``` python
 exclusion_required = cdp.Exclusion(type="required")

From 29a4a792c06016db3d84828f36ca52717004e1b6 Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Tue, 9 Dec 2025 11:20:56 +0100
Subject: [PATCH 4/6] fix: :pencil2: edits from review

Co-authored-by: Joel Ostblom
---
 posts/published-check-datapackage/index.qmd | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/posts/published-check-datapackage/index.qmd b/posts/published-check-datapackage/index.qmd
index ca479b2..9b8f836 100644
--- a/posts/published-check-datapackage/index.qmd
+++ b/posts/published-check-datapackage/index.qmd
@@ -32,7 +32,7 @@ to use it. It can be summarised by its tagline:
 > Ensure the compliance of your Data Package metadata

 The "only" thing `check-datapackage` does is to check the content of a
-`datapackage.json` file against the standard. Nothing fancy.
But we
+`datapackage.json` file against the Data Package standard. Nothing fancy. But we
 designed it to be configurable, so that if you have specific needs for
 your Data Package, you can adjust the checks accordingly. It's possible
 to both add checks on top of the standard or ignore certain checks from
@@ -49,8 +49,7 @@ into typical build tools and automated check workflows.

 ## Why use it?

-We wanted this package to be incredibly simple and focused. If you
-install or use it, you know exactly what it does. It also doesn't
+We wanted this package to be incredibly simple and focused. It also doesn't
 include extra dependencies or features that you might not need. We
 wanted it lightweight and easy to use.

@@ -76,7 +75,7 @@ your configuration.

 We have a detailed
 [guide](https://check-datapackage.seedcase-project.org/docs/guide/) on
-how to use `check-datapackage`. But I'll briefly show how you might use
+how to use `check-datapackage`. But we'll briefly show how you might use
 `check-datapackage`. The main function of the package is `check()`,
 which takes as input the properties of a Data Package (i.e., the
 contents of the `datapackage.json` file) as a Python dictionary and

From 6c9510a2a3a9f89c4d5842f1a69ef7c0263f0edb Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Fri, 12 Dec 2025 13:34:17 +0100
Subject: [PATCH 5/6] fix: :pencil2: edits from review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Signe Kirk Brødbæk
---
 posts/published-check-datapackage/index.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/posts/published-check-datapackage/index.qmd b/posts/published-check-datapackage/index.qmd
index 9b8f836..ff0eaf8 100644
--- a/posts/published-check-datapackage/index.qmd
+++ b/posts/published-check-datapackage/index.qmd
@@ -85,7 +85,7 @@ checks it against the standard.
 import check_datapackage as cdp

 # Normally you'd read in the `datapackage.json` file, but we'll
-# show the actual contents here as a Python dict. Can use
+# show the actual contents here as a Python dict. You can use
 # the `read_json()` helper function to read in `datapackage.json`
 properties = {
     "name": "woolly-dormice",

From 8f32848d5ecd4851e910d1f3cde33753debcdf60 Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Fri, 12 Dec 2025 13:36:18 +0100
Subject: [PATCH 6/6] style: :art: reformat Markdown

---
 posts/published-check-datapackage/index.qmd | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/posts/published-check-datapackage/index.qmd b/posts/published-check-datapackage/index.qmd
index ff0eaf8..0ce3cdc 100644
--- a/posts/published-check-datapackage/index.qmd
+++ b/posts/published-check-datapackage/index.qmd
@@ -32,13 +32,14 @@ to use it. It can be summarised by its tagline:
 > Ensure the compliance of your Data Package metadata

 The "only" thing `check-datapackage` does is to check the content of a
-`datapackage.json` file against the Data Package standard. Nothing fancy. But we
-designed it to be configurable, so that if you have specific needs for
-your Data Package, you can adjust the checks accordingly. It's possible
-to both add checks on top of the standard or ignore certain checks from
-the standard. For example, if you want to ensure that certain fields
-that aren't required by the standard are always present in the metadata,
-you can set up the checks to enforce that.
+`datapackage.json` file against the Data Package standard. Nothing
+fancy. But we designed it to be configurable, so that if you have
+specific needs for your Data Package, you can adjust the checks
+accordingly. It's possible to both add checks on top of the standard or
+ignore certain checks from the standard.
For example, if you want to
+ensure that certain fields that aren't required by the standard are
+always present in the metadata, you can set up the checks to enforce
+that.

 For now, `check-datapackage` is only a few Python functions and classes
 that you can use within your own Python scripts. But in the future, we
@@ -49,8 +50,9 @@ into typical build tools and automated check workflows.

 ## Why use it?

-We wanted this package to be incredibly simple and focused. It also doesn't
-include extra dependencies or features that you might not need. We
-wanted it lightweight and easy to use.
+We wanted this package to be incredibly simple and focused. It also
+doesn't include extra dependencies or features that you might not need.
+We wanted it lightweight and easy to use.

 While there are a few tools that provide some type of checks of Data
 Packages, such as the