Empty lines returned by model

Hi all,

I’ve been training a punctuation restoration model using Transformers. The data has the following format:

src: this is a test
tgt: NONE NONE NONE PERIOD
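For readers unfamiliar with this setup, raw punctuated text can be converted into such src/tgt pairs roughly as follows (a minimal sketch; the `make_pair` helper and the exact label set are assumptions based on the format shown above):

```python
# Hypothetical helper: split a punctuated sentence into a source line of
# lowercased words and a target line of one punctuation label per word.
def make_pair(text):
    labels = {".": "PERIOD", ",": "COMMA", "?": "QUESTION"}
    src, tgt = [], []
    for token in text.split():
        word = token.rstrip(".,?")      # strip trailing punctuation
        punct = token[len(word):]       # what was stripped, if anything
        src.append(word.lower())
        tgt.append(labels.get(punct, "NONE"))
    return " ".join(src), " ".join(tgt)

print(make_pair("This is a test."))
# → ('this is a test', 'NONE NONE NONE PERIOD')
```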

While evaluating different checkpoints, I found that the BLEU score oscillates between 0 and 35. For example: checkpoint_10000 scores 35 BLEU, checkpoint_20000 scores 0.23 BLEU, checkpoint_30000 scores 33.4 BLEU, and so on. Looking at the output, I found that most of the lines are empty. This seems like strange behaviour. Any idea where it might come from?
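One quick way to confirm the empty-output suspicion is to measure how many decoded hypotheses are blank before scoring; since corpus BLEU collapses toward 0 when hypotheses are empty, even a modest fraction of blank lines can crater the score. A sketch (the function name and sample lines are made up):

```python
def blank_ratio(lines):
    """Fraction of hypothesis lines that are empty or whitespace-only."""
    if not lines:
        return 0.0
    blanks = sum(1 for line in lines if not line.strip())
    return blanks / len(lines)

# In practice, `lines` would be read from the decoded output file.
hyps = ["this is a test", "", "   ", "another output"]
print(f"{blank_ratio(hyps):.0%} of hypotheses are blank")
# → 50% of hypotheses are blank
```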


Hi,

What do the training and validation losses look like? It does not seem like the model is converging.

I can’t say; I deleted all the training data. If the problem is that the model is not converging, what should I modify?

You could try reducing the learning rate, for example.
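To illustrate why the learning rate is the first suspect: even on a toy quadratic loss, too large a step makes gradient descent oscillate and diverge, while a smaller step converges. The numbers below are purely illustrative and not tied to the model in this thread:

```python
def gradient_descent(lr, steps=50, x=5.0):
    # Minimise f(x) = x^2 with plain gradient descent; the gradient is 2x.
    for _ in range(steps):
        x = x - lr * 2 * x
    return abs(x)

print(gradient_descent(lr=1.1))  # step too large: |x| grows every iteration
print(gradient_descent(lr=0.1))  # smaller step: x shrinks toward the minimum
```

The same dynamic in a real training run shows up as a loss curve that jumps around or explodes between checkpoints instead of decreasing smoothly.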