@Ringssss

Test results:

  • LLaVA(llava-1.5-7b-hf):
    • bs=64 baseline 378.02/2294.92 → compress 319.57/3508.29
    • bs=128 baseline and compress both hit OOM
  • MiniCPM-V(MiniCPM-V-2_6):
    • bs=1 baseline 427.43/87.53 → compress 364.16/88.48
    • bs=8 baseline 1118.72/651.48 → compress 1030.15/664.39
    • bs=16 baseline 1232.59/1233.87 → compress 1164.00/1254.53
    • bs=32 baseline 1339.24/2335.95 → compress 1270.91/2368.02
    • bs=64 baseline 1370.84/4207.61 → compress 1290.92/4223.16
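
To make the baseline vs. compress comparison easier to read, here is a minimal sketch that computes the relative change for each pair of numbers listed above. The comment does not say what the two values in each `a/b` pair measure, so the code only calls them the first and second metric; the names `RESULTS`, `pct_change`, and `summarize` are illustrative and not part of this PR.

```python
# Rows copied from the results above:
# (model, batch size, baseline pair, compress pair).
# The meaning of the two values in each pair is not stated in the PR,
# so they are treated as an unnamed first and second metric.
RESULTS = [
    ("LLaVA", 64, (378.02, 2294.92), (319.57, 3508.29)),
    ("MiniCPM-V", 1, (427.43, 87.53), (364.16, 88.48)),
    ("MiniCPM-V", 8, (1118.72, 651.48), (1030.15, 664.39)),
    ("MiniCPM-V", 16, (1232.59, 1233.87), (1164.00, 1254.53)),
    ("MiniCPM-V", 32, (1339.24, 2335.95), (1270.91, 2368.02)),
    ("MiniCPM-V", 64, (1370.84, 4207.61), (1290.92, 4223.16)),
]


def pct_change(baseline, compress):
    """Relative change of compress vs. baseline, in percent."""
    return (compress - baseline) / baseline * 100.0


def summarize(results):
    """Return (model, batch size, % change of metric 1, % change of metric 2)."""
    rows = []
    for model, bs, base, comp in results:
        rows.append(
            (model, bs, pct_change(base[0], comp[0]), pct_change(base[1], comp[1]))
        )
    return rows


if __name__ == "__main__":
    for model, bs, first, second in summarize(RESULTS):
        print(f"{model:10s} bs={bs:<3d} first: {first:+6.2f}%  second: {second:+6.2f}%")
```

The bs=128 LLaVA runs are left out because both hit OOM and have no numbers to compare.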

export PYTHONPATH=/home/zhujianian/131/InfiniLM2.0/InfiniCore/python:/home/zhujianian/131/InfiniLM2.0/InfiniLM/python
export LD_LIBRARY_PATH=/home/zhujianian/miniconda3/envs/sdmllm/lib:/home/zhujianian/131/InfiniLM2.0/InfiniCore/python/infinicore/lib:/home/zhujianian/.infini/lib

LLaVA

python examples/jiuge.py \
  --nvidia \
  --model_path /data/huggingface/llava-1.5-7b-hf \
  --image /home/zhujianian/cvpr/wuhang/bus.jpg \
  --prompt "Describe this image." \
  --max_new_tokens 32 \
  --batch-size 1 \
  --no-stop-on-eos \
  --kv-compress \
  --kv-compress-weight ./compress_ckpt/llava_mlp_local.bin \
  --kv-compress-factor 5 \
  --kv-compress-min-seq 2 \
  --kv-image-kv-len 0

MiniCPM-V

python examples/jiuge.py \
  --nvidia \
  --model_path /data/huggingface/MiniCPM-V-2_6 \
  --image /home/zhujianian/cvpr/wuhang/bus.jpg \
  --prompt "图片是什么?" \
  --max_new_tokens 32 \
  --batch-size 1 \
  --no-stop-on-eos \
  --kv-compress \
  --kv-compress-weight ./compress_ckpt/minicpmv_mlp_local.bin \
  --kv-compress-factor 5 \
  --kv-compress-min-seq 2 \
  --kv-image-kv-len 0
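
The results above cover several batch sizes, while the commands show only `--batch-size 1`. A minimal sketch of how the sweep could be scripted follows. It only builds and prints the command lines, using flags that appear in the commands above. Two things are assumptions rather than anything this PR states: that the baseline run is obtained by omitting `--kv-compress` and the other `--kv-*` flags, and that `--kv-image-kv-len` belongs to that group. The helper name `build_cmd` is illustrative.

```python
import shlex

# Batch sizes that appear in the reported results.
BATCH_SIZES = [1, 8, 16, 32, 64]

IMAGE = "/home/zhujianian/cvpr/wuhang/bus.jpg"


def build_cmd(model_path, weight, prompt, batch_size, compress=True):
    """Build the argv list for one run of examples/jiuge.py.

    Assumption: a baseline run is the same command without the --kv-* flags.
    """
    cmd = [
        "python", "examples/jiuge.py",
        "--nvidia",
        "--model_path", model_path,
        "--image", IMAGE,
        "--prompt", prompt,
        "--max_new_tokens", "32",
        "--batch-size", str(batch_size),
        "--no-stop-on-eos",
    ]
    if compress:
        cmd += [
            "--kv-compress",
            "--kv-compress-weight", weight,
            "--kv-compress-factor", "5",
            "--kv-compress-min-seq", "2",
            "--kv-image-kv-len", "0",
        ]
    return cmd


if __name__ == "__main__":
    for bs in BATCH_SIZES:
        for compress in (False, True):
            cmd = build_cmd(
                "/data/huggingface/MiniCPM-V-2_6",
                "./compress_ckpt/minicpmv_mlp_local.bin",
                "图片是什么?",
                bs,
                compress=compress,
            )
            # Print a copy-pasteable shell line instead of executing it.
            print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; passing each list to `subprocess.run` would launch the actual runs.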

@Ringssss Ringssss requested a review from a team January 23, 2026 05:48