Multi-GPU context parallel issue? #10

@wzr0108

Launch command

source .venv/bin/activate
cd /mnt/workspace/wzr_workspace/SCAIL-2

export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
DIR="模特换视频/replace_cases/简单_无运镜_无背景_动作简单__case_01_level_0"

torchrun --standalone --nproc_per_node=8 generate.py \
    --model SCAIL-14B \
    --ckpt_dir /mnt/workspace/wzr_workspace/SCAIL-2/hub/ZhipuAI/SCAIL-2 \
    --scail_path /mnt/workspace/wzr_workspace/SCAIL-2/SCAIL-2.safetensors \
    --replace_flag \
    --target_w 704 --target_h 960 \
    --image "${DIR}/ref.jpg" \
    --mask_image "${DIR}/ref_mask.png" \
    --pose "${DIR}/rendered_v2.mp4" \
    --mask_video "${DIR}/replace_mask.mp4" \
    --prompt "视频中的人在做动作" \
    --save_file "${DIR}/output.mp4" \
    --ulysses_size 8 \
    --ring_size 1 \
    --offload_model False

I ran into this error:

[rank1]: Traceback (most recent call last):
[rank1]:   File "/mnt/workspace/wzr_workspace/SCAIL-2/generate.py", line 456, in <module>
[rank1]:     generate(args)
[rank1]:   File "/mnt/workspace/wzr_workspace/SCAIL-2/generate.py", line 450, in generate
[rank1]:     generate_video(scail_pipeline, prompt, image_path, image_mask_path, pose_path, driving_mask_path, args, device, rank, cfg, input_idx, args.replace_flag, additional_task_input)
[rank1]:   File "/mnt/workspace/wzr_workspace/SCAIL-2/generate.py", line 323, in generate_video
[rank1]:     video = pipeline.generate(
[rank1]:             ^^^^^^^^^^^^^^^^^^
[rank1]:   File "/mnt/workspace/wzr_workspace/SCAIL-2/wan/scail.py", line 511, in generate
[rank1]:     videos = sample_func(noise, arg_c, arg_null, history_latent)
[rank1]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/mnt/workspace/wzr_workspace/SCAIL-2/wan/scail.py", line 391, in sample_func
[rank1]:     noise_pred_cond = self.model(
[rank1]:                       ^^^^^^^^^^^
[rank1]:   File "/mnt/workspace/wzr_workspace/SCAIL-2/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
[rank1]:     return self._call_impl(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/mnt/workspace/wzr_workspace/SCAIL-2/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
[rank1]:     return forward_call(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: TypeError: usp_dit_forward() got an unexpected keyword argument 'ref_latents'
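
The last frames show that `self.model(...)` resolves to `usp_dit_forward`, and that this function does not accept the `ref_latents` keyword passed from `sample_func`. Below is a minimal sketch of how this kind of error can arise, assuming (not confirmed from the repository) that the multi-GPU path rebinds `forward` on the model instance to a function with a narrower signature. `TinyModel`, `patched_forward`, and `call_with_ref_latents` are hypothetical names used only for illustration.

```python
import types


class TinyModel:
    """Hypothetical stand-in for a model whose forward accepts ref_latents."""

    def forward(self, x, ref_latents=None):
        return x if ref_latents is None else x + ref_latents


def patched_forward(self, x):
    """Hypothetical replacement forward that was never updated for ref_latents."""
    return x


def call_with_ref_latents(model):
    """Return the TypeError message raised by the call, or None if it succeeds."""
    try:
        model.forward(1, ref_latents=2)
    except TypeError as exc:
        return str(exc)
    return None


model = TinyModel()
before = call_with_ref_latents(model)  # original forward accepts the keyword

# Assumption: a parallel code path rebinds forward on the instance like this.
model.forward = types.MethodType(patched_forward, model)
after = call_with_ref_latents(model)  # replacement forward rejects the keyword

print(before)
print(after)
```

If that assumption holds, the single-GPU path would work because the original `forward` is still in place, and the fix would be for the replacement function to accept and handle `ref_latents`.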
