OpenNMT Forum

Resuming from a checkpoint. Loss and perplexity increase

Hello everyone,

I am having a problem with OpenNMT-py when trying to resume training from a checkpointed model.
I notice that the perplexity and training loss are much higher than they were at the last step before the checkpoint was saved.

Has anyone encountered the same problem? It may be caused by the sharding and the random shuffle: when resuming, the data order changes. But I am not entirely convinced by this explanation.
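One way to test the shuffle explanation is to make the data order reproducible across a resume by saving the RNG state inside the checkpoint and restoring it afterwards. Below is a minimal pure-Python sketch of that idea; the `make_checkpoint`/`resume` helpers are hypothetical illustrations, not OpenNMT-py's actual API:

```python
import random

def make_checkpoint(step, rng):
    # Hypothetical helper: a real trainer would also store model
    # and optimizer state here, not just the step and RNG state.
    return {"step": step, "rng_state": rng.getstate()}

def resume(checkpoint):
    # Restore the RNG exactly where it left off, so the next
    # shuffle produces the same order as uninterrupted training.
    rng = random.Random()
    rng.setstate(checkpoint["rng_state"])
    return checkpoint["step"], rng

rng = random.Random(42)
epoch1 = list(range(10))
rng.shuffle(epoch1)                    # epoch-1 order
ckpt = make_checkpoint(step=100, rng=rng)

step, resumed_rng = resume(ckpt)
epoch2 = list(range(10))
resumed_rng.shuffle(epoch2)            # same epoch-2 order as if training never stopped
```

If the loss jump disappears once the shuffle order is fixed, the data order was the culprit; if it persists, something else (e.g. optimizer state not being restored) is more likely.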

Cheers

Did you shuffle your data beforehand?