When computing the GOLD perplexity, I get a score closer to the validation perplexity computed during training.
However, they are not the same, and I would actually call them rather different: one is 7.27, the other 10.19. I would expect some variation, but that seems like too much. Is that normal? (The prediction perplexity is 2.14.)
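One common source of such gaps (aside from differences in vocabulary handling or which tokens are counted, e.g. EOS/padding) is how the negative log-likelihood is averaged: a corpus-level perplexity over total tokens is a different statistic from the mean of per-sentence perplexities. A minimal sketch of the two, with hypothetical numbers, just to illustrate the difference:

```python
import math

def corpus_perplexity(nll_per_sentence, tokens_per_sentence):
    """Perplexity from total NLL divided by total token count (corpus-level)."""
    total_nll = sum(nll_per_sentence)
    total_tokens = sum(tokens_per_sentence)
    return math.exp(total_nll / total_tokens)

def mean_of_sentence_perplexities(nll_per_sentence, tokens_per_sentence):
    """Averaging per-sentence perplexities instead -- a different statistic."""
    ppls = [math.exp(nll / n) for nll, n in zip(nll_per_sentence, tokens_per_sentence)]
    return sum(ppls) / len(ppls)

# Toy example: two sentences with different lengths and total NLLs (in nats).
nlls = [10.0, 40.0]
toks = [5, 10]

print(corpus_perplexity(nlls, toks))              # exp(50/15) ≈ 28.0
print(mean_of_sentence_perplexities(nlls, toks))  # (exp(2)+exp(4))/2 ≈ 31.0
```

If the training-time validation score and the GOLD score are averaged differently (or over different token sets), a gap like 7.27 vs 10.19 is plausible without anything being broken.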