Open
Labels
enhancement (New feature or request)
Description
🚀 Feature
Add a TokensLoaderWithMeta class that stores additional parallel metadata alongside the tokens. It could be used to store data with a bit more structure than a flat sequence, such as image tokens. Here's an example:
{
  "token": [1, 2, 3, 4, 5],
  "token_x": [0, 0, 0, 1, 1],
  "token_y": [0, 1, 2, 0, 1]
}
Notice they all have the same length.
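To make the equal-length requirement concrete, here is a rough sketch in plain Python of how parallel fields could be checked and cut into fixed-size blocks together. This is not litdata code: `validate_parallel`, `split_into_blocks`, and `block_size` are hypothetical names used only for illustration, and dropping the trailing partial block is an assumption, not a statement about how TokensLoader behaves.

```python
from typing import Dict, List

# One sample: every field is a list of ints, all of the same length.
Sample = Dict[str, List[int]]


def validate_parallel(sample: Sample) -> int:
    """Return the shared length of all fields, or raise if they disagree."""
    lengths = {key: len(values) for key, values in sample.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"fields must share one length, got {lengths}")
    return next(iter(lengths.values()))


def split_into_blocks(sample: Sample, block_size: int) -> List[Sample]:
    """Slice every field at the same offsets so tokens and metadata stay aligned.

    Assumption: a trailing block shorter than block_size is dropped.
    """
    n = validate_parallel(sample)
    return [
        {key: values[start:start + block_size] for key, values in sample.items()}
        for start in range(0, n - block_size + 1, block_size)
    ]


sample = {
    "token": [1, 2, 3, 4, 5],
    "token_x": [0, 0, 0, 1, 1],
    "token_y": [0, 1, 2, 0, 1],
}

blocks = split_into_blocks(sample, block_size=2)
for block in blocks:
    print(block)
```

Because every field is sliced at the same offsets, each block still carries the x/y position of each of its tokens, which is what a custom positional encoding would need.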
Motivation
I've been using TokensLoader to train models and find it really handy. Unfortunately, it's a bit difficult to use when I want to experiment with different positional encoding schemes, since the tokens are stored as a flat sequence with no per-token position data.
Alternatives
The alternative is to create a normal LitDataset with these fields. But that is less efficient to store and load, and it makes packing samples harder.