[2023-05-10 14:49:30,945 INFO] Step 21350/30000; acc: 99.8; ppl: 3.5; xent: 1.3; lr: 0.00002; sents: 5686; bsz: 482/ 356/14; 9455/6986 tok/s; 9210 sec;
[2023-05-10 14:49:51,290 INFO] Step 21400/30000; acc: 100.0; ppl: 3.5; xent: 1.3; lr: 0.00002; sents: 6021; bsz: 481/ 357/15; 9456/7011 tok/s; 9230 sec;
[2023-05-10 14:50:11,744 INFO] Step 21450/30000; acc: 100.0; ppl: 3.5; xent: 1.3; lr: 0.00002; sents: 5844; bsz: 480/ 358/15; 9394/7001 tok/s; 9251 sec;
[2023-05-10 14:50:31,965 INFO] Step 21500/30000; acc: 99.8; ppl: 3.5; xent: 1.3; lr: 0.00002; sents: 5834; bsz: 481/ 357/15; 9522/7055 tok/s; 9271 sec;
[2023-05-10 14:50:37,403 INFO] valid stats calculation and sentences rebuilding
took: 5.437236547470093 s.
[2023-05-10 14:50:37,405 INFO] Train perplexity: 5.66592
[2023-05-10 14:50:37,405 INFO] Train accuracy: 91.9147
[2023-05-10 14:50:37,405 INFO] Sentences processed: 2.55020e+06
[2023-05-10 14:50:37,405 INFO] Average bsz: 482/ 357/15
[2023-05-10 14:50:37,405 INFO] Validation perplexity: 540.74
[2023-05-10 14:50:37,405 INFO] Validation accuracy: 22.2186
[2023-05-10 14:50:37,407 INFO] Saving checkpoint en-gez-bpe-bilingual_step_21500.pt
Why does the train accuracy reported at each checkpoint_steps differ so much from the acc reported at each step? I expected the train accuracy at a checkpoint step to be something like an average of the per-step values.
I also got an acc of 100.0. Is that realistic? I'm training a bilingual translation model on just 5k sentences.
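
One way to test the "average" expectation is to average the per-step acc values straight from the log and compare the result with the `Train accuracy` line. Below is a minimal sketch, assuming the log format shown above; `STEP_RE` and `mean_step_acc` are hypothetical names for this illustration, not part of the training toolkit.

```python
import re

# Matches the per-step report lines in the log format shown above
# (assumption: every report line contains "Step N/M; acc: X;").
STEP_RE = re.compile(r"Step (\d+)/\d+; acc: ([\d.]+);")


def mean_step_acc(log_text):
    """Return (number of step lines found, unweighted mean of their acc values)."""
    accs = [float(m.group(2)) for m in STEP_RE.finditer(log_text)]
    if not accs:
        raise ValueError("no step lines found in log text")
    return len(accs), sum(accs) / len(accs)


# The four step lines from the log excerpt above, truncated after xent.
sample = """\
[2023-05-10 14:49:30,945 INFO] Step 21350/30000; acc: 99.8; ppl: 3.5; xent: 1.3;
[2023-05-10 14:49:51,290 INFO] Step 21400/30000; acc: 100.0; ppl: 3.5; xent: 1.3;
[2023-05-10 14:50:11,744 INFO] Step 21450/30000; acc: 100.0; ppl: 3.5; xent: 1.3;
[2023-05-10 14:50:31,965 INFO] Step 21500/30000; acc: 99.8; ppl: 3.5; xent: 1.3;
"""

count, mean = mean_step_acc(sample)
print(f"{count} step lines, mean acc {mean:.2f}")  # prints: 4 step lines, mean acc 99.90
```

This only covers the four lines pasted here. For a fair comparison with the reported 91.9147, it would have to be run over the full log, and it is an unweighted mean, so it ignores any difference in token counts between reports.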