Resuming from a checkpoint. Loss and perplexity increase

Hello everyone,

I am having a problem with OpenNMT-py when trying to resume from a checkpointed model.
I am noticing that the perplexity and the training loss are much higher than they were at the last step before the checkpoint was saved.

Has anyone encountered the same problem? It may be caused by sharding and random shuffling, so that the data order changes when resuming, but I am not convinced by this explanation.
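
To make the hypothesis concrete, here is a small sketch in plain Python. It is not OpenNMT-py code: `shuffled_stream`, the seed, and the toy data are made up for illustration. It only shows how a data iterator that restarts from scratch on resume can serve a different sequence than the uninterrupted run would have seen.

```python
import random


def shuffled_stream(examples, seed, epochs):
    """Yield examples epoch by epoch, reshuffling with one shared RNG.

    Hypothetical helper for illustration; not part of OpenNMT-py.
    """
    rng = random.Random(seed)
    for _ in range(epochs):
        order = list(examples)
        rng.shuffle(order)
        yield from order


data = list(range(8))

# Uninterrupted run: two passes over the data.
full_run = list(shuffled_stream(data, seed=0, epochs=2))

# "Resumed" run: assume the checkpoint was saved after the first pass,
# but the data iterator is rebuilt from scratch instead of continuing.
resumed = list(shuffled_stream(data, seed=0, epochs=1))

# The rebuilt iterator replays the first pass instead of moving on
# to the order the uninterrupted run would have used next.
replays_first_pass = resumed == full_run[:len(data)]
print("replays first pass:", replays_first_pass)
print("uninterrupted second pass:", full_run[len(data):])
print("resumed pass:             ", resumed)
```

If something like this is happening, the model sees a different batch sequence after resuming, although I would expect that alone to shift the loss only a little.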


Did you shuffle your data beforehand?