When I finish a shard in translate, it reports a pred avg score of -0.5, but when I look at each sentence, the PRED SCORE values range from -5 to -14. I assumed pred avg score = sum(pred score) / count(pred score), but that doesn't seem to be the case. Can someone explain how this number is calculated?
Looking it up in the GitHub repository, I found this line:
Still, I believe BLEU and/or COMET would make more sense for translation evaluation.
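For what it's worth, here is a rough sketch of why the two numbers can differ so much. It assumes (this is my assumption, not a quote from the repository) that the average is taken over the total number of predicted tokens rather than over the number of sentences. The scores and token counts below are made up for illustration, and the function names are hypothetical.

```python
import math

# Hypothetical (sentence log-probability, predicted token count) pairs.
# These values are invented for illustration, not taken from a real run.
predictions = [(-5.0, 10), (-14.0, 28), (-9.0, 18)]


def per_sentence_average(preds):
    """Sum of sentence scores divided by the number of sentences."""
    return sum(score for score, _ in preds) / len(preds)


def per_token_average(preds):
    """Sum of sentence scores divided by the total number of predicted tokens.

    Assumption: this is the kind of normalization behind the reported average.
    """
    total_score = sum(score for score, _ in preds)
    total_tokens = sum(n_tokens for _, n_tokens in preds)
    return total_score / total_tokens


def perplexity(preds):
    """Perplexity derived from the per-token average log-probability."""
    return math.exp(-per_token_average(preds))


print(f"per-sentence average: {per_sentence_average(predictions):.4f}")
print(f"per-token average:    {per_token_average(predictions):.4f}")
print(f"perplexity:           {perplexity(predictions):.4f}")
```

With these made-up numbers the per-sentence scores sit between -5 and -14, while the per-token average comes out at -0.5, which matches the kind of gap described in the question.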