Hi
I trained the model without finetuning the VAE. The results look good in some cases, but they do not reach the quality of the provided model checkpoint.
I suspect the gap is due to the finetuned VAE encoder.
Could you please clarify the procedure you follow to finetune the VAE? Are you finetuning it jointly with the UNet backbone, or do you first finetune the VAE and then finetune the UNet?
Thanks