Seems to work for me

Hi Phil, I tested it in my private project two days ago, and it seems to speed up learning quite significantly. I'm not sure the final val/train losses are better; they look very similar to the original, but it got there much faster. Also, I did not compare across different tasks/architectures, but my project contains a few different nets: a tiny transformer, one using RNN cells, and one with simple shallow convolutions.

Seems to work for me #1
