Hi authors,
First of all, thank you very much for your outstanding work and for open-sourcing the TACMT code! This paper is very insightful.
I have recently been trying to reproduce the results reported in the paper, specifically for the SARVG dataset (e.g., Table 3 and Table 4).
Expected Results (from the paper)
According to Table 3 (Val Set), the performance metrics for TACMT (ours) are:
- Pr@0.5: 88.57
- mIoU: 81.59
According to Table 4 (Test Set), the performance metrics for TACMT (ours) are:
- Pr@0.5: 89.38
- mIoU: 82.81
Actual Results (My Reproduction)
In my experiments, the best result I obtained (evaluated on the val_split) is:
- Pr@0.5: 86.47
- mIoU: 78.47
The best result evaluated on the test_split is:
- Pr@0.5: 86.76
- mIoU: 79.39
As you can see, my results are roughly 2-3 percentage points lower than those reported in the paper.
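For reference, here is a minimal sketch of how I computed the gaps; the values are simply copied from the numbers above, and the script is only a convenience, not part of the TACMT codebase:

```python
# Gap between the paper's reported numbers and my reproduction (all in percent).
paper = {"val":  {"Pr@0.5": 88.57, "mIoU": 81.59},
         "test": {"Pr@0.5": 89.38, "mIoU": 82.81}}
mine  = {"val":  {"Pr@0.5": 86.47, "mIoU": 78.47},
         "test": {"Pr@0.5": 86.76, "mIoU": 79.39}}

for split in paper:
    for metric, reported in paper[split].items():
        gap = reported - mine[split][metric]
        print(f"{split:4s} {metric}: {gap:.2f} points below the paper")
```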
Reproduction Steps
I have strictly followed the steps in the paper and the README.md:
- Environment Installation:
  - torch==1.9.1+cu111
  - torchvision==0.10.1+cu111
  - pytorch-pretrained-bert==0.6.2
  - rasterio==1.3.11
- Configuration File:
  - I used `configs/SARVG_R50.py`.
  - I downloaded the `load_weights_path` specified in the config file.
  - I modified the `data_root` and `split_root` in the config file to my local paths (a hypothetical sketch of these edits is shown after this list).
- Training Command:
  - I followed the hyperparameters described in Section 4.2 of the paper (e.g., 90 epochs total, `lr_drop=60`, `freeze_epochs=5`, L1 loss coef = 5, GIoU loss coef = 2).
  - My training launch command is as follows:

  ```bash
  # Started with 2 GPUs
  python -m torch.distributed.launch --nproc_per_node=2 --use_env train.py \
      --config configs/SARVG_R50.py --world_size 2 --checkpoint_best \
      --enable_batch_accum --batch_size 10 --freeze_epochs 5
  ```
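For completeness, here is a hypothetical sketch of the path edits mentioned in the Configuration File step above. Only the key names (`data_root`, `split_root`, `load_weights_path`) come from the config; the actual structure of `configs/SARVG_R50.py` may differ, and the paths and file names below are placeholders for my local setup:

```python
# Hypothetical excerpt of configs/SARVG_R50.py after editing. Only the key names
# (data_root, split_root, load_weights_path) are taken from the real config;
# the paths below are placeholders for my local directory layout.
data_root = "/data/SARVG"                       # local copy of the SARVG images
split_root = "/data/SARVG/splits"               # local train/val/test split files
load_weights_path = "/data/weights/tacmt_pretrained.pth"  # downloaded pretrained weights (placeholder name)
```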
My Environment
- PyTorch Version: 1.9.1
- CUDA Version: 11.1
- GPU Model: RTX3090 * 2
- Operating System: CentOS Linux release 7.9.2009
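For reference, a minimal snippet (plain PyTorch calls, nothing repository-specific) that prints the versions listed above:

```python
# Print the library and hardware versions used for this reproduction run.
import torch
import torchvision

print("torch:", torch.__version__)               # expected 1.9.1+cu111
print("torchvision:", torchvision.__version__)   # expected 0.10.1+cu111
print("CUDA (build):", torch.version.cuda)       # expected 11.1
print("cuDNN:", torch.backends.cudnn.version())
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}:", torch.cuda.get_device_name(i))  # expected RTX 3090 x 2
```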
Attachments
I have attached my full training log (from epoch 0 to 90) to this issue so you can review the detailed loss and evaluation metrics.
Question
Could you please help me check whether I missed any critical settings? Or is a gap of this size within the expected run-to-run fluctuation, for example due to differences in random seeds?
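If it helps narrow down the seed question, this is the kind of seeding I could add on my side for a controlled rerun. It is a generic PyTorch sketch, not code from the TACMT repository, and the deterministic cuDNN settings may slow training:

```python
# Generic seeding sketch for a controlled rerun (not part of the TACMT code).
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Fix the Python, NumPy and PyTorch RNGs and make cuDNN deterministic."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels reduce run-to-run variance but can be slower.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_seed(42)
```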
Thank you very much for any help!