Hi, @kcz358
Thanks for releasing the checkpoints and the repository!
I am trying to train Qwen2.5-VL-3B with this repo, but GPU utilization is quite low and there are significant pipeline bubbles during training.
Could you provide some guidance on how to optimize the training efficiency or suggest any specific configurations?
Thanks!