Support Qwen2-VL from huggingface and LLaVA-1.5 on transformers v4.45. Add Colab notebook for experiments.#58

Open: original-doc wants to merge 8 commits into pkunlp-icler:main from original-doc:main

@original-doc

We reimplemented FastV for Qwen2-VL and LLaVA-1.5 on top of the newer transformers v4.45.0. The only requirement is installing the bundled local transformers module, which makes the setup more plug-and-play. We also added colab_fastv.ipynb to help set up the environment in the cloud and easily reproduce some of the experiments from the paper.

With these changes, environment setup is simpler (Python 3.10 is recommended):

pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu128
pip install pillow accelerate
pip install datasets tqdm matplotlib

cd FastV 
cd ./src/FastV/llava-hf/transformers
pip install -e .
cd ../../../..
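
After the editable install, it can be useful to confirm that Python resolves `transformers` to the local checkout and not to a copy installed from PyPI. The following is a minimal sketch, not part of the PR: the helper names `is_under` and `check_transformers` are hypothetical, and the script assumes it is run from the FastV repository root (where the commands above leave you).

```python
from importlib import util
from pathlib import Path


def is_under(module_file: str, checkout_root: str) -> bool:
    """Return True if module_file lives somewhere inside checkout_root."""
    module_path = Path(module_file).resolve()
    root = Path(checkout_root).resolve()
    return root in module_path.parents


def check_transformers(checkout_root: str) -> str:
    """Report where the `transformers` package would be imported from."""
    # find_spec locates the package without importing it.
    spec = util.find_spec("transformers")
    if spec is None or spec.origin is None:
        return "transformers is not installed"
    if is_under(spec.origin, checkout_root):
        return f"OK: using local checkout at {spec.origin}"
    return f"WARNING: transformers resolves to {spec.origin}"


if __name__ == "__main__":
    # Path taken from the install commands above, relative to the FastV root.
    print(check_transformers("src/FastV/llava-hf/transformers"))
```

If the script prints a warning, another `transformers` installation is probably shadowing the local one; uninstalling it and repeating `pip install -e .` is the usual remedy.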

Here are some test results from our implementation, measured on a Colab A100-SXM4-80GB GPU.
[Figure: bars_llava_aokvqa]
[Figure: bars_qwen2vl_aokvqa]

@original-doc (Author)

Our modifications are confined to the following six files:

  • src\FastV\llava-hf\transformers\src\transformers\models\llama\configuration_llama.py
  • src\FastV\llava-hf\transformers\src\transformers\models\llama\modeling_llama.py
  • src\FastV\llava-hf\transformers\src\transformers\models\llava\configuration_llava.py
  • src\FastV\llava-hf\transformers\src\transformers\models\llava\modeling_llava.py
  • src\FastV\llava-hf\transformers\src\transformers\models\qwen2_vl\configuration_qwen2_vl.py
  • src\FastV\llava-hf\transformers\src\transformers\models\qwen2_vl\modeling_qwen2_vl.py
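
For reviewers who want to diff just these files, a quick sanity check that all six are present in a checkout can help. This is a rough sketch under stated assumptions: the helper `missing_files` and the constant names are hypothetical, the paths are the ones listed above rewritten with forward slashes, and the script is assumed to run from the FastV repository root.

```python
from pathlib import Path

# Prefix shared by all six paths, relative to the FastV repository root.
PACKAGE_DIR = "src/FastV/llava-hf/transformers/src/transformers"

# The six modified files listed above, relative to PACKAGE_DIR.
MODIFIED_FILES = [
    "models/llama/configuration_llama.py",
    "models/llama/modeling_llama.py",
    "models/llava/configuration_llava.py",
    "models/llava/modeling_llava.py",
    "models/qwen2_vl/configuration_qwen2_vl.py",
    "models/qwen2_vl/modeling_qwen2_vl.py",
]


def missing_files(repo_root: str) -> list[str]:
    """Return the listed files that are absent under repo_root."""
    base = Path(repo_root) / PACKAGE_DIR
    return [rel for rel in MODIFIED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All six modified files are present.")
```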
