It seems that this problem is not easy to reproduce.
I tried to reproduce it with the OpenNMT demo data.
Using CPU only, it works fine. Using one GPU, I cancelled and resumed training again and again through all 13 epochs, and nothing abnormal happened.
My guess is that it emerges when using two GPUs, but the two-GPU machine is currently training a model. I will try again when that machine is available.