I am unable to reproduce the PSDS scores reported in `recipes/dcase2023_task4_baseline/README.md` when evaluating the pretrained model from Zenodo on the validation set. The scores I obtain are substantially lower than the reported values (by roughly 0.14-0.18 absolute).
Results
I tested the `BEATs + AudioSet` model.
According to the README, the pretrained model should achieve:
- PSDS1 (Scenario 1): 0.480
- PSDS2 (Scenario 2): 0.765
Obtained scores (teacher model):
- PSDS1 (Scenario 1): 0.345 (0.135 below the reported 0.480)
- PSDS2 (Scenario 2): 0.583 (0.182 below the reported 0.765)
Evaluation command:
```bash
uv run train_pretrained.py \
    --test_from_checkpoint ../../ckpt/pretrained_audioset_epoch=199-step=11800.ckpt \
    --conf_file confs/pretrained.yaml
```
I did not change any of the parameters in `pretrained.yaml`.
The validation dataset has some missing files due to YouTube video unavailability:
- Total files in metadata: 1168
- Successfully downloaded: 926 (79.3%)
However, I filtered the missing files out of the evaluation, so the PSDS scores are computed only on the 926 files that were successfully downloaded.
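For concreteness, the filtering I applied is equivalent to the following pandas sketch (the `filename` column name and the tiny inline metadata are illustrative assumptions, not the exact recipe code):

```python
import io
import pandas as pd

# Sketch of the filtering step described above: keep only ground-truth
# rows whose audio file is actually present on disk. Column name and
# inline data are illustrative, not the exact DESED metadata layout.
metadata = pd.read_csv(io.StringIO(
    "filename\tonset\toffset\tevent_label\n"
    "a.wav\t0.0\t1.0\tSpeech\n"
    "b.wav\t2.0\t3.0\tDog\n"
), sep="\t")

downloaded = {"a.wav"}  # files that survived the YouTube download
filtered = metadata[metadata["filename"].isin(downloaded)]
print(filtered["filename"].tolist())
```

In my run, the analogous filter reduced the ground truth from 1168 to 926 files before computing PSDS.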
What could cause this mismatch? Can the 242 missing validation files alone explain a drop of this size, or is something wrong with my setup?
Thank you in advance.