Frequently Asked Questions (FAQ)

Installation

Q: After `pip install -e .`, running an open-source model throws `ModuleNotFoundError: No module named 'diffusers'`

Cause: Open-source model dependencies (torch, diffusers, transformers, etc.) are not included in VBVR-EvalKit's core dependencies and must be installed separately.

Solutions:

Option 1: Install model dependencies in the main venv (recommended for quick testing):

source venv/bin/activate
pip install diffusers transformers accelerate torch torchvision

Option 2: Use the model installation script (creates an isolated venv):

bash setup/install_model.sh --model svd

Note: The installation script creates an isolated venv under envs/{model-name}/, but the current inference script imports model dependencies directly in the main process, so these packages also need to be available in the main venv.

Q: Installation script errors with `No matching distribution found for torchvision==0.20.1`

Cause: Older pinned versions of torch/torchvision are unavailable for Python 3.13. Pinned version numbers may not work with newer Python versions.

Solution: Edit setup/models/{model}/setup.sh to remove version pins:

# Before (may be incompatible)
pip install -q torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1

# After (installs compatible versions automatically)
pip install -q torch torchvision torchaudio

Q: `nvidia-smi` reports `Driver/library version mismatch` or CUDA is unavailable

Cause: NVIDIA driver version doesn't match the CUDA library version.

Solutions:

Models like SVD support CPU fallback and will automatically use float32 on CPU (slower but functional)

For GPU acceleration, update the NVIDIA driver or install a matching CUDA version:

# Check current driver version
cat /proc/driver/nvidia/version
# Install matching CUDA toolkit

Inference

Q: Tasks show `Skipped (existing)` when running `generate_videos.py`

Cause: A {task_id}.mp4 already exists in the output directory; completed tasks are skipped by default.

Solutions:

# Option 1: Use --override to clear output directory and re-run
python examples/generate_videos.py --questions-dir ./questions --model svd --override

# Option 2: Manually delete the corresponding mp4 file
rm outputs/svd/test_task/test_0000.mp4

Q: `--list-models` errors with `the following arguments are required: --model`

Cause: --model is a required argument, even when using --list-models.

Solution: List models directly via Python:

python -c "
from vbvrevalkit.runner.MODEL_CATALOG import AVAILABLE_MODELS, MODEL_FAMILIES
for f, ms in MODEL_FAMILIES.items():
    print(f'{f} ({len(ms)}):')
    for m in ms: print(f'  {m}')
print(f'\nTotal: {len(AVAILABLE_MODELS)} models')
"

Q: Inference is extremely slow in CPU mode

Cause: Open-source models use float32 arithmetic on CPU, which is computationally expensive.

Reference timing (SVD, CPU, single image): ~2 minutes per task

Recommendations:

Use a GPU environment (16GB+ VRAM recommended)

Or use commercial API models (no GPU required):

python examples/generate_videos.py --questions-dir ./questions --model luma-ray-2

Evaluation

Q: How do I run VBVR-Bench evaluation?

VBVR-Bench is the evaluation system for VBVR-EvalKit. It uses 100+ task-specific rule-based evaluators — no API calls needed:

# Basic evaluation (task_specific dimension only)
python examples/score_videos.py --inference-dir ./outputs

# Full 5-dimension weighted score
python examples/score_videos.py --inference-dir ./outputs --full-score

# With GT data
python examples/score_videos.py --inference-dir ./outputs --gt-base-path /path/to/gt --device cuda

Q: What do the scoring dimensions mean?

Dimension	Weight	Description
`first_frame_consistency`	15%	How well the first frame matches the input image
`final_frame_accuracy`	35%	Whether the final frame matches the expected result
`temporal_smoothness`	15%	Consistency between consecutive frames
`visual_quality`	10%	Sharpness, noise levels
`task_specific`	25%	Task-specific reasoning correctness

Default mode returns only task_specific. Use --full-score for the weighted combination.

Q: Can I resume an interrupted evaluation?

Yes. VBVR-Bench saves progress after each task. Simply re-run the same command — already-evaluated tasks are automatically skipped.

Environment Variables

Q: Which API keys are needed?

API keys are only needed for inference with commercial models, not for evaluation.

Model Family	Environment Variable	How to Obtain
Luma	`LUMA_API_KEY`	Luma AI website
Google Veo	`GEMINI_API_KEY`	Google AI Studio
Kling AI	`KLING_API_KEY`	Kling AI website
Runway	`RUNWAYML_API_SECRET`	Runway ML website
OpenAI Sora	`OPENAI_API_KEY`	OpenAI platform

cp env.template .env
# Edit .env and fill in the corresponding keys

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Frequently Asked Questions (FAQ)

Installation

Q: After `pip install -e .`, running an open-source model throws `ModuleNotFoundError: No module named 'diffusers'`

Q: Installation script errors with `No matching distribution found for torchvision==0.20.1`

Q: `nvidia-smi` reports `Driver/library version mismatch` or CUDA is unavailable

Inference

Q: Tasks show `Skipped (existing)` when running `generate_videos.py`

Q: `--list-models` errors with `the following arguments are required: --model`

Q: Inference is extremely slow in CPU mode

Evaluation

Q: How do I run VBVR-Bench evaluation?

Q: What do the scoring dimensions mean?

Q: Can I resume an interrupted evaluation?

Environment Variables

Q: Which API keys are needed?

FilesExpand file tree

FAQ.md

Latest commit

History

FAQ.md

File metadata and controls

Frequently Asked Questions (FAQ)

Installation

Q: After pip install -e ., running an open-source model throws ModuleNotFoundError: No module named 'diffusers'

Q: Installation script errors with No matching distribution found for torchvision==0.20.1

Q: nvidia-smi reports Driver/library version mismatch or CUDA is unavailable

Inference

Q: Tasks show Skipped (existing) when running generate_videos.py

Q: --list-models errors with the following arguments are required: --model

Q: Inference is extremely slow in CPU mode

Evaluation

Q: How do I run VBVR-Bench evaluation?

Q: What do the scoring dimensions mean?

Q: Can I resume an interrupted evaluation?

Environment Variables

Q: Which API keys are needed?

Q: After `pip install -e .`, running an open-source model throws `ModuleNotFoundError: No module named 'diffusers'`

Q: Installation script errors with `No matching distribution found for torchvision==0.20.1`

Q: `nvidia-smi` reports `Driver/library version mismatch` or CUDA is unavailable

Q: Tasks show `Skipped (existing)` when running `generate_videos.py`

Q: `--list-models` errors with `the following arguments are required: --model`