Thanks for your great work!
I have a question about the query-text contrastive loss:
The code here flattens the tensors, originally shaped (B, N, C), into (B, N*C). Do the flattened tensors correspond to $q^{txt}_i$ and $q^{obj}_i$ in Equation 1 of the paper? Thank you!
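
To make the question concrete, here is a minimal NumPy sketch of how I read the flattening. The variable names, the toy sizes, and the cosine-similarity step are my own assumptions for illustration, not the repository's code. If this reading is right, the index $i$ in Equation 1 would run over the B samples in the batch, with each sample represented by one N*C-dimensional vector.

```python
import numpy as np

B, N, C = 2, 3, 4  # toy sizes, chosen only for illustration
rng = np.random.default_rng(0)
q_obj = rng.normal(size=(B, N, C))  # hypothetical object-query features
q_txt = rng.normal(size=(B, N, C))  # hypothetical text-query features

# Flatten (B, N, C) -> (B, N*C): the N queries of a sample are
# concatenated into a single vector per sample.
q_obj_flat = q_obj.reshape(B, N * C)
q_txt_flat = q_txt.reshape(B, N * C)

# Assumed reading: row i of each flattened tensor plays the role of
# q^{obj}_i / q^{txt}_i, so pairwise similarities form a (B, B) matrix.
obj_unit = q_obj_flat / np.linalg.norm(q_obj_flat, axis=1, keepdims=True)
txt_unit = q_txt_flat / np.linalg.norm(q_txt_flat, axis=1, keepdims=True)
sim = obj_unit @ txt_unit.T

print(q_obj_flat.shape, sim.shape)
```

Under this reading the contrast is between whole samples, not between individual queries, which is the part I would like to confirm.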
