Empty lines returned by model

anderleich · September 7, 2020, 8:52am

Hi all,

I’ve been training a punctuation restoration model using transformers. The data has the following format:

src: this is a test
tgt: NONE NONE NONE PERIOD
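For readers who want to reproduce this format, here is a minimal sketch of how such src/tgt pairs could be built from punctuated text. It is not the script used in this thread. Only the NONE and PERIOD labels appear in the example above; the COMMA and QUESTION labels, the lowercasing, and the `make_pair` name are assumptions made for illustration.

```python
# Hypothetical label map: only NONE and PERIOD are shown in the post;
# COMMA and QUESTION are assumed here for illustration.
PUNCT_LABELS = {".": "PERIOD", ",": "COMMA", "?": "QUESTION"}


def make_pair(text):
    """Turn punctuated text into (src, tgt) strings with one label per word."""
    words, labels = [], []
    for token in text.lower().split():
        label = "NONE"
        # A trailing punctuation mark becomes the label of the word it follows.
        if token[-1] in PUNCT_LABELS:
            label = PUNCT_LABELS[token[-1]]
            token = token[:-1]
        if token:  # skip tokens that were punctuation only
            words.append(token)
            labels.append(label)
    return " ".join(words), " ".join(labels)


src, tgt = make_pair("This is a test.")
print("src:", src)
print("tgt:", tgt)
```

With this layout the target always has exactly one label per source word, so an empty output line can never be a correct prediction for a non-empty source line.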

While evaluating different checkpoints, I found that the BLEU score oscillates between 35 and 0. For example: checkpoint_10000 (35 BLEU), checkpoint_20000 (0.23 BLEU), checkpoint_30000 (33.4 BLEU), and so on. When looking at the output, I found that most of the lines are empty. This seems like strange behaviour. Any idea where it might come from?
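A quick way to quantify the problem is to measure the fraction of empty lines in each checkpoint's output. The sketch below is one possible check, not part of any particular toolkit; the `empty_line_ratio` name and the file name in the comment are hypothetical.

```python
def empty_line_ratio(lines):
    """Return the fraction of lines that are empty or whitespace-only."""
    lines = list(lines)
    if not lines:
        return 0.0
    empty = sum(1 for line in lines if not line.strip())
    return empty / len(lines)


# In-memory example; for a real prediction file (hypothetical name) use:
#   with open("pred_checkpoint_20000.txt", encoding="utf-8") as f:
#       print(empty_line_ratio(f))
sample_output = ["NONE NONE NONE PERIOD\n", "\n", "   \n", "NONE PERIOD\n"]
print(f"{empty_line_ratio(sample_output):.0%} of lines are empty")
```

Comparing this ratio across checkpoints shows whether the BLEU drops coincide with the empty outputs.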

guillaumekln · September 14, 2020, 8:41am

Hi,

What do the training and validation losses look like? It does not seem like the model is converging.

anderleich · September 14, 2020, 9:04am

I can’t say; I deleted all the training data. If the problem is that the model is not converging, what should I modify?

guillaumekln · September 14, 2020, 9:26am

Reducing the learning rate, for example.