open-contracting/notebooks-ocds

OCDS Notebooks

A collection of Jupyter notebooks for working with data from:

Notebooks

To use a notebook:

  • Click the Open In Colab button
  • Click the File > Save a copy in Drive menu item
  • Make your changes (e.g. collection_ids, schema_name, etc.)

If you encounter unfamiliar errors, try the Runtime > Disconnect and delete runtime menu item. If the error still occurs, please open an issue.

If you make any improvements or fixes, please follow the Contributing guide below to merge your changes back into this repository.

You can also use a notebook without creating a copy. However, if you re-open the notebook, any changes and outputs will be lost.

Kingfisher Process

| Notebook | Open in Colab | Description |
| --- | --- | --- |
| Publisher analysis template | Open in Colab | Analyze data from a specific publisher. |
| Meta analysis template | Open in Colab | Analyze data from multiple publishers, or perform other types of analysis on the Kingfisher Process database. |
| Basic criteria feedback template | Open in Colab | Provide feedback on the OCDS basic criteria. |
| Structure and format feedback template | Open in Colab | Provide feedback on structure and format errors reported by lib-cove-ocds. |
| Data quality feedback template | Open in Colab | Provide detailed feedback on structure, format, conformance and quality issues. |
| Usability checks template | Open in Colab | Provide feedback on data usability for OCDS datasets. |
| Red flags checks template | Open in Colab | Provide feedback on red flags for OCDS datasets. |

Other data sources

| Notebook | Open in Colab | Description |
| --- | --- | --- |
| Usability checks using a field list | Open in Colab | Provide feedback on data usability for prospective OCDS publishers, using a field list, like from a field-level mapping. |
| Usability checks using the Data Registry | Open in Colab | Provide feedback on data usability using data from the Data Registry. |
| Relevant checks using a field list | Open in Colab | Provide feedback on data relevance for prospective publishers, using a field list, like from a field-level mapping. |
| Relevant checks using the Data Registry | Open in Colab | Provide feedback on data relevance using data from the Data Registry. |
| Relevant checks for all the Data Registry publications | Open in Colab | Provide feedback on data relevance downloading all the publications from the Data Registry. |
| Red flags checks using the Data Registry | Open in Colab | Provide feedback on coverage for red flags using data from the Data Registry. |
| Red flags checks using a field list | Open in Colab | Provide feedback on red flags for prospective OCDS publishers, using a field list, like from a field-level mapping. |
| Field list for all the Data Registry publications | Open in Colab | Extract the fields published by all the publications from the Data Registry. |

Contributing

Components

To ease maintenance, the notebooks are made up of reusable components with clear scopes:

  • Environment: Set up Google Colaboratory in general
    • environment: Install requirements, import packages, load extensions and configure the notebook.
  • Setup: Set up Google Colaboratory for a data source
    • setup_charts: Install charts requirements, import charts packages and define plot functions.
    • setup_kingfisher: Connect to the Kingfisher Process database. Choose the collection(s) and schema to work with.
    • setup_fieldlist: Load the field list.
    • setup_metadata_from_registry: Define the functions to list publications and their metadata, including coverage, from the Registry.
    • setup_usability: Define the usability functions.
    • setup_red_flags: Define the red flags functions.
  • Errors: Review any issues in loading the data
    • errors_kingfisher: Check for data collection (Kingfisher Collect) and processing (Kingfisher Process) errors.
  • Scope: Understand the scope of the data
    • scope_kingfisher: Check how many releases and records your data contains. Check the date range and stages of the contracting process covered by your data.
    • scope_usability: Calculate general statistics.
  • Check: Perform a category of checks
    • check_structure: Check for structure and format errors reported by lib-cove-ocds.
    • check_conformance: Check against the OCDS conformance criteria.
    • check_quality: Check for conformance and quality issues that require manual review.
    • check_usability_kingfisher: Usability checks using Kingfisher with coverage.
    • check_usability_external: Usability checks using a field list without coverage.
    • check_relevant: Given a field list, check whether the list passes the "relevant" criteria.
    • check_relevant_all_registry: Perform the "relevant" checks against the active publications from the Registry.
    • check_red_flags_external: Red flags checks using a field list without coverage.
  • Other
    • select_data_from_registry: Define the form to select a publication from the Registry.
    • get_field_list_all_registry: Get the fields used by all OCDS publications in the Registry.

Quick reference

Follow the style guide for SQL statements.

  • To see which components are used in each notebook, refer to the NOTEBOOKS variable in manage.py.
  • To add new components to a notebook, add to the entry for the notebook in the NOTEBOOKS variable in manage.py.
  • To add a new notebook:
    • Add an entry for the notebook and its components to the NOTEBOOKS variable in manage.py.
    • Update the Notebooks section of the README.md.
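
As a rough illustration of the steps above, the sketch below assumes that NOTEBOOKS maps each notebook name to an ordered list of component names. That shape, the notebook name `my_new_notebook` and both helper functions are hypothetical; check manage.py for the actual structure before editing it.

```python
# Hypothetical shape of the NOTEBOOKS variable. The real manage.py may differ.
NOTEBOOKS = {
    # "my_new_notebook" is an invented name used only for illustration.
    "my_new_notebook": [
        "environment",
        "setup_kingfisher",
        "errors_kingfisher",
        "scope_kingfisher",
        "check_structure",
    ],
}


def components_for(notebook):
    """Return the components of a notebook, in the order they are listed."""
    return NOTEBOOKS[notebook]


def notebooks_using(component):
    """Return the sorted names of the notebooks that include a component."""
    return sorted(
        name for name, components in NOTEBOOKS.items() if component in components
    )
```

Under this assumption, adding a component to a notebook means adding its name to that notebook's list, and adding a notebook means adding a new key.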

Add a component using Colab

  1. Create a branch.
  2. Create a new notebook.
  3. Set a title using H2 formatting, and add your cells.
  4. Commit your changes:
    • Click Edit -> Clear all outputs.
    • Click File -> Save.
    • Select the 'notebooks-ocds' repository.
    • Uncheck 'Include a link to Colab'.
    • Select your branch, enter a commit message and click OK.
  5. Request a review:
    • Create a pull request.
    • Request a review from a data support manager.
    • If the reviewer requests changes, make the changes then repeat this step.
  6. Once approved, you can merge your own changes.

Add or edit a component using an editor

Reminder: If you change headings or add sections, check whether any related Document template in this process note needs an update.

Jupytext is used to encode notebooks as Markdown files (if code cells are mostly SQL) or Python files.

Python files use the light format. For example:

# ## A heading
#
# A paragraph

python_code = "cell"
second_line = "code"

another_code_cell = True

To merge code cells, use start-of-cell and end-of-cell delimiters:

# +
code = "cell"

same = "cell"
# -

To hide code:

# +
# @title My title { display-mode: "form" }

python = "code"
# -

The end-of-cell delimiter is optional if the next cell is also hidden, or at the end of the file.
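
For example, a minimal sketch of two consecutive hidden cells, in which neither cell has an end-of-cell delimiter: the first is followed by another hidden cell, and the second is assumed to be at the end of the file. The titles and variable names are placeholders.

```python
# +
# @title First hidden cell { display-mode: "form" }

first = "code"

# +
# @title Second hidden cell { display-mode: "form" }

second = "code"
```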

To add SQL:

# + language="sql"
# SELECT 1
# -

Or:

# + magic_args="my_variable <<" language="sql"
# SELECT 1
# -
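
When writing longer queries in an editor, commenting out every line by hand is error-prone. The helper below is a hypothetical convenience, not part of Jupytext or of this repository: it wraps a SQL string in the two cell forms shown above. Rendering blank lines as a bare `#` is an assumption, so compare the result with Jupytext's own output.

```python
def sql_cell(sql, variable=None):
    """Render a SQL statement as a commented-out SQL cell in the light format.

    If `variable` is given, use the `magic_args="<variable> <<"` form.
    """
    if variable:
        header = f'# + magic_args="{variable} <<" language="sql"'
    else:
        header = '# + language="sql"'
    # Comment out each line; blank lines become a bare "#" (an assumption).
    body = [f"# {line}".rstrip() for line in sql.strip().splitlines()]
    return "\n".join([header, *body, "# -"])


print(sql_cell("SELECT 1"))
print(sql_cell("SELECT 1", variable="my_variable"))
```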
