moritz-steiner/mt-exercise-5

MT Exercise 5: Byte Pair Encoding, Beam Search

This repository is a starting point for the 5th and final exercise. As before, fork this repo to your own account and then clone it into your preferred directory.

Requirements

  • This only works on a Unix-like system, with bash available.

  • Python 3 must be installed on your system, i.e. the command python3 must be available.

  • Make sure virtualenv is installed on your system. To install it, run e.g.:

    pip install virtualenv
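The requirements above can be checked up front with a short script. This is only a sketch: `check_prerequisites` is a hypothetical helper that is not part of this repository, and it only tests whether each command is on your PATH, not which version is installed.

```shell
#!/usr/bin/env bash
# Sketch: verify that the tools named in the requirements are available.
# check_prerequisites is a hypothetical helper, not a script from this repo.

check_prerequisites() {
    local missing=0
    for cmd in "$@"; do
        # command -v exits with 0 if the shell can resolve the command name
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    # Non-zero exit status if at least one command was not found
    return "$missing"
}

check_prerequisites bash python3 virtualenv \
    || echo "Install the missing tools before continuing."
```

If the script reports virtualenv as missing, the pip command above should fix that.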

Steps

Clone your fork of this repository in the desired place:

git clone https://github.com/[your-name]/mt-exercise-5

Create a new virtualenv that uses Python 3.10. Please make sure to run this command outside of any virtual Python environment:

./scripts/make_virtualenv.sh

Important: afterwards, activate the environment by running the source command printed by the shell script above.

Download and install required software as described in the exercise pdf.

Download data:

./download_iwslt_2017_data.sh

Before executing any further steps, you need to make the modifications described in the exercise pdf.

Train a model:

./scripts/train.sh

The training process can be interrupted at any time, and the best checkpoint will always be saved.

Evaluate a trained model:

./scripts/evaluate.sh
