I followed the steps you provided exactly, but the loss is -nan from the very beginning. The training data was converted properly. What could be the reason?