Hi authors,
First of all, thank you very much for your outstanding work and for open-sourcing the TACMT code! This paper is very insightful.
I have recently been trying to reproduce the results reported in the paper, specifically for the SARVG dataset (e.g., Table 3 and Table 4).
Expected Results (from the paper)
According to Table 3 (Val Set), the performance metrics for TACMT (ours) are:
- Pr@0.5: 88.57
- mIoU: 81.59
According to Table 4 (Test Set), the performance metrics for TACMT (ours) are:
- Pr@0.5: 89.38
- mIoU: 82.81
Actual Results (My Reproduction)
In my experiments, the best result I obtained (evaluated on the val_split) is:
- Pr@0.5: 86.47
- mIoU: 78.47
The best result evaluated on the test_split is:
- Pr@0.5: 86.76
- mIoU: 79.39
As you can see, my results are roughly 2-3 percentage points lower than those reported in the paper.
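For reference, here is a minimal sketch of how I computed the gaps; the values are simply copied from the numbers above, and the script is only a convenience, not part of the TACMT codebase:

```python
# Gap between the paper's reported numbers and my reproduction (all in percent).
paper = {"val":  {"Pr@0.5": 88.57, "mIoU": 81.59},
         "test": {"Pr@0.5": 89.38, "mIoU": 82.81}}
mine  = {"val":  {"Pr@0.5": 86.47, "mIoU": 78.47},
         "test": {"Pr@0.5": 86.76, "mIoU": 79.39}}

for split in paper:
    for metric, reported in paper[split].items():
        gap = reported - mine[split][metric]
        print(f"{split:4s} {metric}: {gap:.2f} points below the paper")
```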
Reproduction Steps
I have strictly followed the steps in the paper and the README.md:
- Environment Installation:
  - torch==1.9.1+cu111
  - torchvision==0.10.1+cu111
  - pytorch-pretrained-bert==0.6.2
  - rasterio==1.3.11
- Configuration File:
  - I used `configs/SARVG_R50.py`.
  - I downloaded the `load_weights_path` specified in the config file.
  - I modified the `data_root` and `split_root` in the config file to my local paths (a hypothetical sketch of these edits is shown after this list).
- Training Command:
  - I followed the hyperparameters described in Section 4.2 of the paper (e.g., 90 epochs total, `lr_drop=60`, `freeze_epochs=5`, L1 loss coef = 5, GIoU loss coef = 2).
  - My training launch command is as follows:

  ```bash
  # Started with 2 GPUs
  python -m torch.distributed.launch --nproc_per_node=2 --use_env train.py \
      --config configs/SARVG_R50.py --world_size 2 --checkpoint_best \
      --enable_batch_accum --batch_size 10 --freeze_epochs 5
  ```
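For completeness, here is a hypothetical sketch of the path edits mentioned in the Configuration File step above. Only the key names (`data_root`, `split_root`, `load_weights_path`) come from the config; the actual structure of `configs/SARVG_R50.py` may differ, and the paths and file names below are placeholders for my local setup:

```python
# Hypothetical excerpt of configs/SARVG_R50.py after editing. Only the key names
# (data_root, split_root, load_weights_path) are taken from the real config;
# the paths below are placeholders for my local directory layout.
data_root = "/data/SARVG"                       # local copy of the SARVG images
split_root = "/data/SARVG/splits"               # local train/val/test split files
load_weights_path = "/data/weights/tacmt_pretrained.pth"  # downloaded pretrained weights (placeholder name)
```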
My Environment
- PyTorch Version: 1.9.1
- CUDA Version: 11.1
- GPU Model: RTX3090 * 2
- Operating System: CentOS Linux release 7.9.2009
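For reference, a minimal snippet (plain PyTorch calls, nothing repository-specific) that prints the versions listed above:

```python
# Print the library and hardware versions used for this reproduction run.
import torch
import torchvision

print("torch:", torch.__version__)               # expected 1.9.1+cu111
print("torchvision:", torchvision.__version__)   # expected 0.10.1+cu111
print("CUDA (build):", torch.version.cuda)       # expected 11.1
print("cuDNN:", torch.backends.cudnn.version())
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}:", torch.cuda.get_device_name(i))  # expected RTX 3090 x 2
```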
Attachments
I have attached my full training log (from epoch 0 to 90) to this issue so you can review the detailed loss and evaluation metrics.
Question
Could you please help me check whether I missed any critical settings? Or is a gap of this size within the expected run-to-run fluctuation, for example due to differences in random seeds?
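If it helps narrow down the seed question, this is the kind of seeding I could add on my side for a controlled rerun. It is a generic PyTorch sketch, not code from the TACMT repository, and the deterministic cuDNN settings may slow training:

```python
# Generic seeding sketch for a controlled rerun (not part of the TACMT code).
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Fix the Python, NumPy and PyTorch RNGs and make cuDNN deterministic."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels reduce run-to-run variance but can be slower.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_seed(42)
```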
Thank you very much for any help!