
ML_pytorch

Repository with basic machine learning algorithms implemented in PyTorch.

The coffea files used as inputs are produced by PocketCoffea. In particular, the framework was developed around the output of the AnalysisConfigs repository, a collection of analysis configurations for the PocketCoffea framework.

Installation

To create the micromamba environment, use the following commands:

# request an interactive GPU node
salloc --account gpu_gres --job-name "InteractiveJob" --cpus-per-task 4 --mem-per-cpu 3000 --time 01:00:00 -p gpu --gres=gpu:1
# create and activate the environment
micromamba env create -f ML_pytorch_env.yml
micromamba activate ML_pytorch
# install the Python dependencies
pip install -r requirements.txt
# install the package in editable mode
pip install -e .
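
To verify the installation, a quick sanity check can be run from the activated environment (a minimal sketch; it assumes the environment provides PyTorch, which the repository's dependencies suggest):

# confirm PyTorch is importable and whether a GPU is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"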

Connect to a node with a GPU

To connect to a node with a GPU, use the following commands:

# connect to a node with a GPU
salloc --account gpu_gres --job-name "InteractiveJob" --cpus-per-task 4 --mem-per-cpu 3000 --time 01:00:00 -p gpu --gres=gpu:1
# activate the environment
micromamba activate ML_pytorch
# check which GPU is available
echo $CUDA_VISIBLE_DEVICES # or echo $SLURM_JOB_GPUS
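
If the node image ships the NVIDIA tools (an assumption about the cluster setup, not something this repository controls), the allocated GPU can also be inspected directly:

# show the allocated GPU, its driver version, and current utilization
nvidia-smi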

Examples

To run an example training, evaluate the model on the test set, plot the training history, and plot the signal/background histograms, use the following command:

ml_train -c configs/example_DNN_config_ggF_VBF.yml
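
The same entry point can be scripted over several configurations. A sketch, assuming every YAML file under configs/ is a valid training config (the glob is illustrative, not a guarantee about the directory's contents):

# run ml_train once per config file
for cfg in configs/*.yml; do
    ml_train -c "$cfg"
done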

Training on a cluster with Slurm

Two scripts can be submitted with Slurm: one runs the 20x training used for background reweighting, the other trains a sig_bkg_classifier model:

# Outside of any node, activate your environment (e.g. `micromamba activate ML_pytorch`)
cd jobs/
# If the output folder is not provided, it defaults to the name of the config file without the extension
# For the 20x training for bkg reweighting:
sbatch run_20_trainings_in_4_parallel.sh <config_file> <output_folder>
# When this has finished, you can merge the results with:
cd <output_folder>
ml_onnx -i best_models -o best_models -ar -v bkg_morphing_dnn_DeltaProb_input_variables

# For the sig/bkg classifier:
sbatch run_sig_bkg_classifier.sh <config_file> <output_folder>
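
After the merge step above, the exported model can be sanity-checked before handing it to PocketCoffea. A sketch, assuming the onnx package is installed and the exported files live under best_models/ (the *.onnx glob is an assumption about the output layout):

# validate every exported ONNX graph under best_models/
python -c "import glob, onnx; [onnx.checker.check_model(onnx.load(f)) for f in glob.glob('best_models/*.onnx')]"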

To execute 5 runs on a node without interactive access to the GPU node (the config and folder names below are just examples):

# Outside of any node, activate your environment (e.g. `micromamba activate ML_pytorch`)

# Then run this command:
sbatch --account gpu_gres --job-name "InteractiveJob" --cpus-per-task 4 --mem-per-cpu 5000 --time 12:00:00  -p gpu --gres=gpu:1 --wrap=". ./run_batch_of_5.sh /work/tharte/datasets/ML_pytorch/configs/bkg_reweighting/DNN_AN_1e-3_e20drop75_minDelta1em5_SPANet_postEE.yml out/bkg_reweighting/SPANET_ptFlat_20_runs_postEE 0"
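
Once submitted, the jobs can be followed with standard Slurm tools (slurm-<jobid>.out is Slurm's default output file name; substitute your job ID):

# list your pending and running jobs
squeue -u $USER
# follow the output of a running job
tail -f slurm-<jobid>.out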

Additional scripts

The training will produce the ONNX model to be used in PocketCoffea for background morphing, as well as plots of the training history, the ROC curve, and an overtraining check.

These plots can be produced using the following commands:

# Plot the history of a training 
ml_history -i <training_log_file>

# Plot the ROC curve and overtraining check
ml_sb -i <training_directory>
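
Since both tools take a single -i argument, they can be applied to several trainings in one go. A sketch, assuming each subdirectory of out/ is a training directory (the out/ layout is an assumption, not fixed by the repository):

# produce ROC and overtraining plots for every training directory under out/
for d in out/*/; do
    ml_sb -i "$d"
done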

COMET integration

Additionally, there are options to send the training metrics to COMET (academic accounts are available for free). To set it up alongside the scripts mentioned above, create a token file:

# Open the file with the editor of your choice
vim jobs/comet_token.key
# in the first line write your username, and in the second line write your token (to be retrieved from the website):
# <uname>
# <token>

The scripts will read this file if it exists and automatically pass the credentials to ml_pytorch.
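
The same file can also be written non-interactively, following the two-line format described above:

# write the COMET credentials file (username on the first line, token on the second)
printf '%s\n%s\n' "<uname>" "<token>" > jobs/comet_token.key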
