Evaluation metric

Hi there,

I am using OpenNMT v1.0.0rc1. Could you tell me what metric it uses to evaluate the final predicted equations? I am curious because in my case the predicted equations are identical to the target equations, yet the reported results are very bad, while the BLEU score I computed is ~83.

Here is an example taken from the final pred.txt file generated after testing:

An equation from the pred.txt file →

[2021-05-30 16:08:59,775 INFO]
SENT 5001: None
PRED 5001: <math> <mi> l </mi> <mo> = </mo> <msub> <mi> j </mi> <mi> k </mi> </msub> </math>
PRED SCORE: -0.1977

Actual equation →

<math> <mi> l </mi> <mo> = </mo> <msub> <mi> j </mi> <mi> k </mi> </msub> </math>

As you can see, the predicted and target equations are the same, yet the results are very bad. Could you please tell me why I am seeing this problem?

Any help will be appreciated. Thank you!
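As a quick sanity check independent of any built-in metric, you can compare pred.txt to the target file line by line and compute an exact-match rate yourself. This is only a sketch with hypothetical inputs; the file names and helper function are not part of OpenNMT:

```python
def exact_match_rate(pred_lines, target_lines):
    """Fraction of predictions that exactly match their target line
    (after stripping surrounding whitespace)."""
    pairs = list(zip(pred_lines, target_lines))
    matches = sum(p.strip() == t.strip() for p, t in pairs)
    return matches / len(pairs)

# Hypothetical example, mirroring the equation shown below:
preds = ["<math> <mi> l </mi> <mo> = </mo> <msub> <mi> j </mi> <mi> k </mi> </msub> </math>"]
targets = ["<math> <mi> l </mi> <mo> = </mo> <msub> <mi> j </mi> <mi> k </mi> </msub> </math>"]
print(exact_match_rate(preds, targets))  # 1.0 when predictions match targets exactly
```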

The PRED SCORE is -0.1977…; the closer to 0, the better. So in your case, your model is extremely "confident" of its translation. The prediction score really depends on each model, but based on my personal results, you can pretty much assume that anything above -3 is somewhere between fair and good.
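To build intuition for why a score near 0 is good: the PRED SCORE is a log-probability, so exponentiating it gives an (approximate) probability for the predicted sequence. A minimal sketch, assuming the printed score can be treated as a single log-probability (depending on your settings it may be length-normalized):

```python
import math

# The score from the log excerpt above.
pred_score = -0.1977
# exp(log p) recovers the probability; a score near 0 means p near 1.
probability = math.exp(pred_score)
print(f"score {pred_score} -> probability ~{probability:.2f}")  # ~0.82

# The rough rule of thumb above: a score of -3 is already a low probability.
print(f"score -3 -> probability ~{math.exp(-3):.2f}")  # ~0.05
```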
