This is truly a remarkable piece of work! Many thanks to the author for open-sourcing it.
I have a question about the critic loss computation.
As I understand it, the student predicts $x_0$ using the generator's scheduler (denoted `scheduler_stu`). `scheduler_stu` is then used to add noise, and the resulting noisy sample is passed to the fake/real teachers.
However, the teachers predict $x_0$ from `pred_flow` using a scheduler whose shift value differs from the student's. The code then converts $x_0$ back to `pred_flow` using the generator's scheduler.
I am wondering why the critic loss is not computed directly from the teacher's original output (`pred_flow`).
Instead, `pred_flow` is first transformed to $x_0$ and then converted back under a different shift setting. Is this intentional, or could it be a bug?
File path: model/dmd.py

    _, pred_fake_image = self.fake_score(
        noisy_image_or_video=noisy_generated_image,  # !!!!!!!!!!!!!!!!!
        conditional_dict=conditional_dict,
        timestep=critic_timestep
    )  # Using fake teacher's scheduler, default shift = 8
    flow_pred = WanDiffusionWrapper._convert_x0_to_flow_pred(
        scheduler=self.scheduler,
        x0_pred=pred_fake_image.flatten(0, 1),  # !!!!!!!!!!!!!!!!!
        xt=noisy_generated_image.flatten(0, 1),
        timestep=critic_timestep.flatten(0, 1)
    )  # Using generator's scheduler, shift was set to 5.
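
To make the concern concrete, here is a minimal sketch of the round trip. It assumes the usual rectified-flow parameterization, $x_t = (1 - \sigma) x_0 + \sigma \epsilon$ with flow $= \epsilon - x_0$, which I have not checked against the repository. The helper names (`flow_to_x0`, `x0_to_flow`, `round_trip`) and the sigma values are hypothetical and chosen only for illustration. Under that assumption, converting with $\sigma_a$ and converting back with $\sigma_b$ rescales the flow by $\sigma_a / \sigma_b$, so the round trip is lossless only when both schedulers resolve the timestep to the same sigma.

```python
import random

def flow_to_x0(flow, xt, sigma):
    # Assumed parameterization: x_t = (1 - sigma) * x0 + sigma * noise,
    # flow = noise - x0, hence x0 = x_t - sigma * flow.
    return [x - sigma * f for x, f in zip(xt, flow)]

def x0_to_flow(x0, xt, sigma):
    # Inverse of the conversion above: flow = (x_t - x0) / sigma.
    return [(x - a) / sigma for x, a in zip(xt, x0)]

def round_trip(flow, xt, sigma_teacher, sigma_generator):
    # flow -> x0 with one sigma, then x0 -> flow with another (hypothetical helper).
    x0 = flow_to_x0(flow, xt, sigma_teacher)
    return x0_to_flow(x0, xt, sigma_generator)

random.seed(0)
sigma = 0.6  # arbitrary illustrative noise level
x0 = [random.gauss(0, 1) for _ in range(4)]
noise = [random.gauss(0, 1) for _ in range(4)]
xt = [(1 - sigma) * a + sigma * n for a, n in zip(x0, noise)]
flow = [n - a for a, n in zip(x0, noise)]

# Same sigma in both directions: the original flow is recovered.
same = round_trip(flow, xt, sigma_teacher=0.6, sigma_generator=0.6)

# Different sigmas: the flow comes back scaled by 0.6 / 0.5.
mismatched = round_trip(flow, xt, sigma_teacher=0.6, sigma_generator=0.5)
```

So the question reduces to whether the two schedulers map `critic_timestep` to the same sigma despite their different shift values. If they do, the conversion is redundant but harmless; if they do not, the flow used in the critic loss would be rescaled relative to the teacher's raw output.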