Hi DiffSynth team, many thanks for open-sourcing this great repo. I have a question about the choice of lora_target_modules when finetuning the QWenImageEdit series. For example, in img_mlp, why is only img_mlp.net.2 finetuned with LoRA, even though there is also a linear layer in img_mlp.net.0.proj? And why isn't proj_out finetuned either? Is there a specific reason to keep those components frozen?
from typing import Optional

import torch
import torch.nn as nn
from diffusers.models.activations import ApproximateGELU


class QwenFeedForward(nn.Module):
    def __init__(
        self,
        dim: int,
        dim_out: Optional[int] = None,
        dropout: float = 0.0,
    ):
        super().__init__()
        dim_out = dim_out if dim_out is not None else dim
        inner_dim = int(dim * 4)
        self.net = nn.ModuleList([])
        # net.0: ApproximateGELU wraps a Linear (net.0.proj) followed by the activation
        self.net.append(ApproximateGELU(dim, inner_dim))
        self.net.append(nn.Dropout(dropout))
        # net.2: the second Linear, the only MLP layer covered by the default LoRA targets
        self.net.append(nn.Linear(inner_dim, dim_out))

    def forward(self, hidden_states: torch.Tensor, *args, **kwargs) -> torch.Tensor:
        for module in self.net:
            hidden_states = module(hidden_states)
        return hidden_states
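
To make the question concrete, here is a minimal sketch of what I had in mind, assuming a peft-style LoRA injection on the Qwen-Image-Edit transformer. The checkpoint id, module names, and hyperparameters below are my own assumptions based on the diffusers QwenImage transformer layout; the actual DiffSynth-Studio training flags may differ.

    # Hypothetical sketch (not DiffSynth-Studio's actual training code): attach LoRA to
    # img_mlp.net.0.proj and proj_out in addition to the default img_mlp.net.2,
    # using peft's suffix-matched target_modules.
    import torch
    from diffusers import QwenImageTransformer2DModel
    from peft import LoraConfig, inject_adapter_in_model

    # Assumed checkpoint layout; substitute whatever your training run loads.
    transformer = QwenImageTransformer2DModel.from_pretrained(
        "Qwen/Qwen-Image-Edit", subfolder="transformer", torch_dtype=torch.bfloat16
    )

    lora_config = LoraConfig(
        r=32,
        lora_alpha=32,
        target_modules=[
            "to_q", "to_k", "to_v", "to_out.0",          # attention projections (commonly targeted)
            "img_mlp.net.2", "txt_mlp.net.2",            # second Linear of each MLP (the default choice)
            "img_mlp.net.0.proj", "txt_mlp.net.0.proj",  # first Linear inside ApproximateGELU (currently frozen)
            "proj_out",                                  # final output projection (currently frozen)
        ],
    )
    transformer = inject_adapter_in_model(lora_config, transformer)

    # After injection only the LoRA parameters are marked trainable.
    trainable = sum(p.numel() for p in transformer.parameters() if p.requires_grad)
    print(f"trainable LoRA parameters: {trainable / 1e6:.1f}M")

My understanding is that targeting only net.2 halves the LoRA parameter count per MLP block compared with targeting both linear layers, so I would like to know whether the default choice is mainly about parameter budget or about training quality/stability.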