OCDS Notebooks

A collection of Jupyter notebooks for working with data from:

Kingfisher Process
Data Registry
Field lists, like from a field-level mapping

Notebooks

To use a notebook:

Click the Open In Colab button
Click the File > Save a copy in Drive menu item
Make your changes (e.g. collection_ids, schema_name, etc.)
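
As a rough illustration, the settings changed in the last step are ordinary Python assignments in a notebook cell. The values below are hypothetical placeholders, the `describe_settings` helper is not part of the notebooks, and the exact types a given notebook expects may differ:

```python
# Hypothetical placeholder values: replace them with your own.
collection_ids = [1234, 1235]  # IDs of the Kingfisher Process collections to analyze
schema_name = "my_schema"  # name of the database schema to work with


def describe_settings(collection_ids, schema_name):
    """Return a one-line summary, to double-check the settings before running the notebook."""
    if not collection_ids:
        raise ValueError("Set at least one collection ID")
    ids = ", ".join(str(collection_id) for collection_id in collection_ids)
    return f"collections {ids} in schema {schema_name}"


print(describe_settings(collection_ids, schema_name))
```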

If you encounter unfamiliar errors, try the Runtime > Disconnect and delete runtime menu item. If the error still occurs, please open an issue.

If you make any improvements or fixes, please follow the Contributing guide below to merge your changes back into this repository.

You can also use a notebook without creating a copy. However, if you re-open the notebook, any changes and outputs will be lost.

Kingfisher Process

Notebook	Open in Colab	Description
Publisher analysis template		Analyze data from a specific publisher.
Meta analysis template		Analyze data from multiple publishers, or perform other types of analysis on the Kingfisher Process database.
Basic criteria feedback template		Provide feedback on the OCDS basic criteria.
Structure and format feedback template		Provide feedback on structure and format errors reported by lib-cove-ocds.
Data quality feedback template		Provide detailed feedback on structure, format, conformance and quality issues.
Usability checks template		Provide feedback on data usability for OCDS datasets.
Red flags checks template		Provide feedback on red flags for OCDS datasets.

Other data sources

Notebook	Open in Colab	Description
Usability checks using a field list		Provide feedback on data usability for prospective OCDS publishers, using a field list, like from a field-level mapping.
Usability checks using the Data Registry		Provide feedback on data usability using data from the Data Registry.
Relevant checks using a field list		Provide feedback on data relevance for prospective publishers, using a field list, like from a field-level mapping.
Relevant checks using the Data Registry		Provide feedback on data relevance using data from the Data Registry.
Relevant checks for all the Data Registry publications		Provide feedback on data relevance by downloading all the publications from the Data Registry.
Red flags checks using the Data Registry		Provide feedback on coverage for red flags using data from the Data Registry.
Red flags checks using a field list		Provide feedback on red flags for prospective OCDS publishers, using a field list, like from a field-level mapping.
Field list for all the Data Registry publications		Extract the fields published by all the publications from the Data Registry.

Contributing

Components

To ease maintenance, the notebooks are made up of reusable components with clear scopes:

Environment: Set up Google Colaboratory in general
- environment: Install requirements, import packages, load extensions and configure the notebook.
Setup: Set up Google Colaboratory for a data source
- setup_charts: Install charts requirements, import charts packages and define plot functions.
- setup_kingfisher: Connect to the Kingfisher Process database. Choose the collection(s) and schema to work with.
- setup_fieldlist: Load the field list.
- setup_metadata_from_registry: Define the functions to list publications and their metadata, including coverage, from the Registry.
- setup_usability: Define the usability functions.
- setup_red_flags: Define the red flags functions.
Errors: Review any issues in loading the data
- errors_kingfisher: Check for data collection (Kingfisher Collect) and processing (Kingfisher Process) errors.
Scope: Understand the scope of the data
- scope_kingfisher: Check how many releases and records your data contains. Check the date range and stages of the contracting process covered by your data.
- scope_usability: Calculate general statistics.
Check: Perform a category of checks
- check_structure: Check for structure and format errors reported by lib-cove-ocds.
- check_conformance: Check against the OCDS conformance criteria.
- check_quality: Check for conformance and quality issues that require manual review.
- check_usability_kingfisher: Usability checks using Kingfisher with coverage.
- check_usability_external: Usability checks using a field list without coverage.
- check_relevant: Given a field list, check if the list passes the "relevant" criteria.
- check_relevant_all_registry: Perform the "relevant" checks against the active publications from the Registry.
- check_red_flags_external: Red flags checks using a field list without coverage.
Other
- select_data_from_registry: Define the form to select a publication from the Registry.
- get_field_list_all_registry: Get the fields used by all OCDS publications in the Registry.
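
The naming convention in the list above (the first part of a component's name gives its scope) can be sketched as a small helper. This is only an illustration of the convention. The function names are hypothetical and are not taken from manage.py:

```python
# Scope prefixes taken from the list above; anything else falls under "other".
SCOPES = ("environment", "setup", "errors", "scope", "check")


def component_scope(name):
    """Return the scope of a component, based on the first part of its name."""
    prefix = name.split("_", 1)[0]
    return prefix if prefix in SCOPES else "other"


def group_by_scope(names):
    """Group component names by scope, preserving their order."""
    groups = {}
    for name in names:
        groups.setdefault(component_scope(name), []).append(name)
    return groups
```

For example, `component_scope("setup_kingfisher")` gives `"setup"`, and `component_scope("select_data_from_registry")` gives `"other"`.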

Quick reference

Follow the style guide for SQL statements.

To see which components are used in each notebook, refer to the NOTEBOOKS variable in manage.py.
To add new components to a notebook, add to the entry for the notebook in the NOTEBOOKS variable in manage.py.
To add a new notebook:
- Add an entry for the notebook and its components to the NOTEBOOKS variable in manage.py.
- Update the Notebooks section of the README.md.
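
To make these steps concrete, here is a minimal sketch that assumes NOTEBOOKS is a dictionary mapping a notebook's name to an ordered list of component names. That shape, the component list shown, and the two helper functions are assumptions for illustration only. Check manage.py for the real structure before editing it:

```python
# Assumed shape of NOTEBOOKS; the entry below is illustrative, not copied from manage.py.
NOTEBOOKS = {
    "template_usability_checks_fieldlist": [
        "environment",
        "setup_fieldlist",
        "setup_usability",
        "check_usability_external",
    ],
}


def add_component(notebooks, notebook, component, after=None):
    """Add a component to a notebook's entry, optionally right after another component."""
    components = notebooks.setdefault(notebook, [])
    if component in components:
        return components
    position = components.index(after) + 1 if after else len(components)
    components.insert(position, component)
    return components


def notebooks_using(notebooks, component):
    """List the notebooks whose entry includes the component."""
    return sorted(name for name, components in notebooks.items() if component in components)
```

Under this assumed shape, adding a new notebook amounts to adding a new key, and the order of each list determines the order of the components in the built notebook.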

Add a component using Colab

Create a branch.
Create a new notebook.
Set a title using H2 formatting, and add your cells.
Commit your changes:
- Click Edit -> Clear all outputs.
- Click File -> Save.
- Select the 'notebooks-ocds' repository.
- Select your branch and enter a commit message.
- Uncheck 'Include a link to Colab', then click OK.
Request a review:
- Create a pull request.
- Request a review from a data support manager.
- If the reviewer requests changes, make the changes then repeat this step.
Once approved, you can merge your own changes.

Add or edit a component using an editor

Reminder: If you change headings or add sections, check whether any related Document template in this process note needs an update.

Jupytext is used to encode notebooks as Markdown files (if code cells are mostly SQL) or Python files.

Python files use the light format. For example:

# ## A heading
#
# A paragraph

python_code = "cell"
second_line = "code"

another_code_cell = True

To merge code cells, use start-of-cell and end-of-cell delimiters:

# +
code = "cell"

same = "cell"
# -

To hide code:

# +
# @title My title { display-mode: "form" }

python = "code"
# -

The end-of-cell delimiter is optional if the next cell is also hidden, or at the end of the file.
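
To build intuition for how the text above is divided into cells, here is a deliberately simplified splitter. It is not Jupytext's parser and ignores cell metadata and many edge cases. It assumes only the rules described in this section: blank lines separate cells, comment-only paragraphs become Markdown cells, and `# +` and `# -` delimit a single code cell. The function names are hypothetical:

```python
def _make_cell(lines, force_code):
    """Build a (cell_type, source) pair from a block of lines, or None if it is empty."""
    source = "\n".join(lines).strip("\n")
    if not source:
        return None
    block = source.split("\n")
    if not force_code and all(line.startswith("#") for line in block):
        # A comment-only paragraph is treated as a Markdown cell.
        text = "\n".join(line[2:] if line.startswith("# ") else line[1:] for line in block)
        return ("markdown", text)
    return ("code", source)


def split_cells(text):
    """Split light-format text into (cell_type, source) pairs using simplified rules."""
    cells, buffer, explicit = [], [], False
    for line in text.splitlines():
        if line.startswith("# +"):
            # Start of an explicit cell; this also closes an unterminated one.
            cells.append(_make_cell(buffer, explicit))
            buffer, explicit = [], True
        elif explicit and line.strip() == "# -":
            cells.append(_make_cell(buffer, True))
            buffer, explicit = [], False
        elif not explicit and not line.strip():
            # Outside explicit cells, a blank line ends the current cell.
            cells.append(_make_cell(buffer, False))
            buffer = []
        else:
            buffer.append(line)
    cells.append(_make_cell(buffer, explicit))
    return [cell for cell in cells if cell is not None]


SAMPLE = '''# ## A heading
#
# A paragraph

python_code = "cell"
second_line = "code"

another_code_cell = True

# +
code = "cell"

same = "cell"
# -
'''

for cell_type, source in split_cells(SAMPLE):
    print(cell_type, repr(source))
```

With this sample, the splitter yields one Markdown cell followed by three code cells, the last of which keeps its internal blank line because it is wrapped in delimiters.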

To add SQL:

# + language="sql"
# SELECT 1
# -

Or:

# + magic_args="my_variable <<" language="sql"
# SELECT 1
# -
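
As a rough guide to what these cells become in the notebook: Jupytext's handling of cell magics, as we understand it, stores the magic's name in `language` and the rest of the magic line in `magic_args`, and comments out the cell body. The sketch below rebuilds the notebook cell source from those pieces. The helper names are hypothetical, and the reading of `my_variable <<` as "store the query result in `my_variable`" assumes an ipython-sql style extension:

```python
def _uncomment(line):
    """Remove the leading comment marker that hides a magic cell's body from Python."""
    if line.startswith("# "):
        return line[2:]
    if line.startswith("#"):
        return line[1:]
    return line


def magic_cell_source(language, commented_lines, magic_args=""):
    """Rebuild the source of a cell-magic cell from its light-format pieces."""
    header = f"%%{language} {magic_args}".rstrip()
    return "\n".join([header, *(_uncomment(line) for line in commented_lines)])


print(magic_cell_source("sql", ["# SELECT 1"]))
print(magic_cell_source("sql", ["# SELECT 1"], magic_args="my_variable <<"))
```

Under these assumptions, the first example above corresponds to a cell starting with `%%sql`, and the second to a cell starting with `%%sql my_variable <<`.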
