Question about cosine similarity #36

@yezhyzh

It seems that cosine similarity was not used during training. Below are some of the logits I obtained while reproducing the training run, and they are significantly greater than 1, whereas cosine similarity is bounded to [-1, 1].
logits tensor([[15.2290, 7.5998, 9.8649, ..., 7.7345, 9.5413, 16.3422],
[ 7.4700, 15.4301, 6.9831, ..., 11.5478, 6.7609, 16.8512],
[10.8048, 9.7472, 11.9755, ..., 7.9214, 9.1346, 14.9441],
...,
[ 9.1644, 8.0641, 8.9382, ..., 9.6266, 11.3413, 16.5944],
[ 7.5931, 10.8228, 6.1488, ..., 15.4235, 6.2452, 16.2435],
[11.1124, 6.7673, 9.4720, ..., 7.7954, 14.2624, 16.0656]],
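A minimal sketch of why values like these do not by themselves rule out cosine similarity: the cosine of two vectors always lies in [-1, 1], but if the result is multiplied by a scale factor (an inverse temperature) before the loss, the logits can land well above 1. Whether this repository actually applies such a scale is an assumption here, not something confirmed in this thread; the function names, the vectors, and the value `scale=16.0` are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b; always within [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def scaled_logit(a, b, scale):
    """Cosine similarity multiplied by a scale factor (hypothetical helper)."""
    return scale * cosine_similarity(a, b)


# Hypothetical feature vectors, not taken from the training run above.
a = [3.0, 4.0]
b = [4.0, 3.0]

cos = cosine_similarity(a, b)            # bounded by 1 in absolute value
scaled = scaled_logit(a, b, scale=16.0)  # assumed scale; can exceed 1

print(f"cosine similarity: {cos:.4f}")
print(f"scaled logit:      {scaled:.4f}")
```

One way to check the real code would be to print the norms of the features just before the logits are computed: if they are all 1 and the logits still exceed 1, a scale factor is the likely cause; if the norms are not 1, the logits are plain dot products.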
