Hello,
I would like to ask why 397b was dropped from the plan. I tried to follow the paper and train my own, but it only works on the eval data.
Thank you, dflash is great. I am currently using the 122b variant.