[MiniCPM-o 4.5] LayerDrop appears disabled during training because 'encoder_layerdrop' is set to '0.0' #1118

Hi, thank you for releasing MiniCPM-o-4_5 and making it available as open source.

I was looking into the MiniCPM-o-4_5 code and noticed that LayerDrop seems to be effectively disabled in the released checkpoint config.

In `modeling_minicpmo.py`, `MiniCPMWhisperEncoder` inherits from `transformers.models.whisper.modeling_whisper.WhisperEncoder`:

```python
class MiniCPMWhisperEncoder(WhisperEncoder):
    def __init__(self, config: WhisperConfig):
        super().__init__(config)
```

In Hugging Face Transformers, `WhisperEncoder.__init__` sets:

```python
self.layerdrop = config.encoder_layerdrop
```

Then during the forward pass, LayerDrop is only applied in training mode:

```python
to_drop = False
if self.training:
    dropout_probability = torch.rand([])
    if dropout_probability < self.layerdrop:  # skip this layer
        to_drop = True
```

However, in the released `config.json`, I see:

```json
"encoder_layerdrop": 0.0,
"decoder_layerdrop": 0.0
```

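So even when the model is in training mode, `self.layerdrop` is `0.0`. Since `torch.rand([])` samples from `[0, 1)`, the check `dropout_probability < self.layerdrop` can never be true, which means no encoder layers will ever be skipped.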
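To illustrate the point, here is a minimal sketch that mirrors the check quoted above. It is not the Transformers code: `count_dropped_layers` is a hypothetical helper, `random.random()` stands in for `torch.rand([])` (both draw from `[0, 1)`), and the layer count of 32 is an arbitrary value chosen for the example.

```python
import random


def count_dropped_layers(layerdrop: float, num_layers: int,
                         training: bool = True, seed: int = 0) -> int:
    """Count how many layers one simulated forward pass would skip."""
    rng = random.Random(seed)
    dropped = 0
    for _ in range(num_layers):
        to_drop = False
        if training:
            # Stand-in for torch.rand([]): a uniform sample in [0, 1).
            dropout_probability = rng.random()
            if dropout_probability < layerdrop:
                to_drop = True
        if to_drop:
            dropped += 1
    return dropped


# With layerdrop = 0.0 the comparison is never true, so nothing is skipped.
print(count_dropped_layers(0.0, num_layers=32))
# Outside training mode nothing is skipped either, whatever the value.
print(count_dropped_layers(0.5, num_layers=32, training=False))
```

Both calls report zero skipped layers, which matches my reading that the released config makes the training-mode branch a no-op.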

My questions are:

1. Is this intentional for MiniCPM-o-4_5?
2. If users fine-tune MiniCPM-o-4_5, should `encoder_layerdrop` remain `0.0`, or is there a recommended non-zero value?
3. Was LayerDrop used during the original training, or is this code path inherited from Whisper but not used for MiniCPM-o-4_5?
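For context on question 2, this is roughly how I would override the value for a fine-tuning experiment if a non-zero setting turns out to be appropriate. It is only a sketch: `set_layerdrop` is a hypothetical helper, it assumes the loaded model is a `torch.nn.Module` (so `model.modules()` is available), and it changes every submodule that exposes a `layerdrop` attribute, not just the audio encoder.

```python
def set_layerdrop(model, value: float) -> int:
    """Set `layerdrop` on every submodule that has it; return how many changed."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"layerdrop must be in [0, 1], got {value}")
    changed = 0
    # model.modules() yields the model itself and all nested submodules.
    for module in model.modules():
        if hasattr(module, "layerdrop"):
            module.layerdrop = value
            changed += 1
    return changed
```

Calling `set_layerdrop(model, p)` before `model.train()` would apply probability `p` at runtime without editing `config.json`. I am not suggesting any particular value of `p` here, which is part of what I am asking.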

I just wanted to confirm whether the current behavior is expected or whether the released config should use a different LayerDrop value.

Thanks!
