I am trying to understand why training produces a model that just repeats the same word once that word shows up in the output, e.g. w1 w2 w3 w3 w3 w3 w3 w3 w3.
From what I can see, there is nothing unusual about the frequency of the repeated word, but maybe I am missing something.
Do you have any experience with this type of problem? Any advice on how to catch it early, to avoid wasting time training for all the iterations?
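
To make "catch it early" concrete, here is a rough sketch of the kind of check I have in mind: decode a few samples at each checkpoint and measure how much of each one is a single repeated-token run. The function names and the 0.5 threshold are my own placeholders, not from any library, and it assumes the samples are already split into tokens.

```python
def longest_run_fraction(tokens):
    """Length of the longest run of one repeated token, as a fraction of the sequence length."""
    if not tokens:
        return 0.0
    longest = current = 1
    for prev, tok in zip(tokens, tokens[1:]):
        # Extend the current run if the token repeats, otherwise start a new run.
        current = current + 1 if tok == prev else 1
        longest = max(longest, current)
    return longest / len(tokens)


def looks_degenerate(samples, threshold=0.5):
    """Flag a batch of decoded samples whose mean longest-run fraction exceeds `threshold`.

    `threshold` is an arbitrary placeholder value and would need tuning per task.
    """
    if not samples:
        return False
    mean = sum(longest_run_fraction(s) for s in samples) / len(samples)
    return mean > threshold


if __name__ == "__main__":
    sample = "w1 w2 w3 w3 w3 w3 w3 w3 w3".split()
    print(longest_run_fraction(sample))
    print(looks_degenerate([sample]))
```

Running something like this every few hundred iterations on a fixed set of inputs might show when the repetition starts, but I do not know whether it is a reliable signal.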