UCREL/USAS-Evaluation-Framework

USAS-Evaluation-Framework

Evaluation metrics and datasets for USAS Semantic Tagging

Setup

You can either use the dev container with your favourite editor, e.g. VSCode, or create your setup locally. Both approaches are described below.

Both approaches use the same tools:

  • uv for Python packaging and development
  • make (OPTIONAL) for automation of tasks, not strictly required but makes life easier.

Dev Container

A dev container uses a Docker container to create the required development environment. The Dockerfile we use for this dev container can be found at ./.devcontainer/Dockerfile. Running it locally requires Docker to be installed. You can also run it in a cloud-based code editor; for a list of supported editors/cloud editors see the following webpage.

To run for the first time on a local VSCode editor (a slightly more detailed guide is available on the VSCode website):

  1. Ensure docker is running.
  2. Ensure the VSCode Dev Containers extension is installed in your VSCode editor.
  3. Open the command palette with CMD + SHIFT + P (CTRL + SHIFT + P on Windows and Linux) and then select Dev Containers: Rebuild and Reopen in Container

You should now have everything you need to develop: uv, make, and, for VSCode, various extensions such as Pylance.

If you have any trouble, see the VSCode website.

Local

To run locally, first ensure you have the following tools installed locally:

  • uv for Python packaging and development. (version 0.9.6)
  • make (OPTIONAL) for automation of tasks, not strictly required but makes life easier.
    • Ubuntu: apt-get install make
    • Mac: the Xcode command line tools include make; otherwise you can use brew.
    • Windows: Various solutions are proposed in this blog post on how to install on Windows, including Cygwin and Windows Subsystem for Linux.
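Before going further it can help to confirm that the tools listed above are actually on your PATH. The following is a minimal sketch, not part of the repository: the function names find_tools and missing_tools are hypothetical, and it only checks that the executables can be found, not which version is installed.

```python
import shutil

# Tools named in this README; "make" is optional.
REQUIRED_TOOLS = ("uv",)
OPTIONAL_TOOLS = ("make",)


def find_tools(names):
    """Map each tool name to its location on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    for name in missing_tools(REQUIRED_TOOLS):
        print(f"missing required tool: {name}")
    for name in missing_tools(OPTIONAL_TOOLS):
        print(f"missing optional tool: {name}")
```

If the script prints nothing, both tools were found.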

When developing on the project you will want to install the Python package locally in editable format with all the extra requirements. This can be done like so:

uv sync

Linting

Linting and formatting are done with ruff, a replacement for tools like Flake8, isort, and Black, and we use ty for type checking.

To run the linting:

make lint
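If make is not installed, the same checks can probably be run by calling the tools directly through uv. The sketch below is an assumption about what the lint target does: the real commands live in the repository's Makefile, which is not reproduced in this README, so check it before relying on this. The names LINT_COMMANDS and run_all are hypothetical.

```python
import subprocess

# Assumed equivalents of `make lint`; the Makefile is the source of truth.
LINT_COMMANDS = [
    ["uv", "run", "ruff", "check", "."],
    ["uv", "run", "ruff", "format", "--check", "."],
    ["uv", "run", "ty", "check"],
]


def run_all(commands, runner=subprocess.run):
    """Run every command in order and return the ones that exited non-zero."""
    failed = []
    for command in commands:
        result = runner(command)
        if result.returncode != 0:
            failed.append(command)
    return failed
```

Calling run_all(LINT_COMMANDS) from the repository root runs all three checks even if an earlier one fails, so a single pass reports every problem. The runner parameter exists only so the function can be exercised without invoking the real tools.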

Tests

To run the tests (uses pytest and coverage) and generate a coverage report:

make test

To fully test the parsing of the Irish ICC dataset, i.e. the usas_evaluation_framework.parsers.icc_irish.ICCIrishParser.parse method, you need to download the Irish ICC human-annotated dataset files to tests/data/parsers/icc_irish, e.g. tests/data/parsers/icc_irish/ICC-GA-WPH-001-the_wire.tsv.
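A quick way to see whether the Irish ICC files are in place is to list the directory named above. This is a minimal sketch under two assumptions: it is run from the repository root, and the dataset files all use the .tsv extension, as the example file name suggests. The helper icc_irish_files is hypothetical and not part of the package.

```python
from pathlib import Path

# Directory named in this README, relative to the repository root.
ICC_IRISH_DIR = Path("tests/data/parsers/icc_irish")


def icc_irish_files(data_dir=ICC_IRISH_DIR):
    """Return the sorted .tsv files in data_dir, or an empty list if it is absent."""
    if not data_dir.is_dir():
        return []
    return sorted(data_dir.glob("*.tsv"))


if __name__ == "__main__":
    files = icc_irish_files()
    if files:
        print(f"found {len(files)} Irish ICC file(s)")
    else:
        print(f"no Irish ICC files found in {ICC_IRISH_DIR}")
```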

License

The code is licensed under Apache License Version 2.0.

The following data files, which we use for testing, are licensed under Creative Commons Attribution Non Commercial Share Alike 4.0:
