Scaling Loss for Character-Level Seq2seq Model

A5U7 · June 28, 2020, 9:28pm

Is it recommended to scale up the loss of a character-level seq2seq model by a suitable factor (say 7, since an English word has about 7 characters on average) to help it learn better, especially when the sentences are quite long (max_token_size=256)?

And if that’s the case, is there a way to pass a scaling factor as an argument to train.py?
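
To make the question concrete, here is a minimal sketch of the scaling I have in mind, written in plain Python rather than taken from the actual training code. The function name `scaled_char_loss` and its `scale` parameter are hypothetical and are not existing options of train.py; the factor 7 is just the assumed average word length from above.

```python
import math

def scaled_char_loss(target_probs, scale=7.0):
    """Mean per-character negative log-likelihood, multiplied by `scale`.

    target_probs: probability the model assigned to each correct character.
    scale: hypothetical constant factor (7 ~ assumed average word length).
    """
    if not target_probs:
        raise ValueError("target_probs must not be empty")
    # Per-character cross-entropy against the reference characters.
    nll = [-math.log(p) for p in target_probs]
    # Average over characters, then apply the constant scaling factor.
    return scale * sum(nll) / len(nll)

# Toy probabilities for a 4-character target (illustrative values only).
probs = [0.5, 0.25, 0.5, 0.25]
unscaled = scaled_char_loss(probs, scale=1.0)
scaled = scaled_char_loss(probs, scale=7.0)
```

As far as I can tell, a constant factor like this multiplies every gradient by the same amount, so with plain SGD it would behave like a 7x larger learning rate. Part of what I am asking is whether it gives anything beyond that.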