Computation of BLEU Score

I ran my NMT model in TensorFlow and also generated the translations for the test file. Now I am facing an issue when computing the BLEU score for my test file. I ran the following command to compute the BLEU score:

onmt-main score --config config/opennmt-defaults.yml config/data/toy-ende.yml --features_file data/toy-ende/src-test.txt --predictions_file data/toy-ende/predictions.txt

But somehow it's not giving me a single BLEU score for the entire file; it's printing out a score for each line in my predictions file. My configuration is as follows:

eval:
  batch_size: 32
  eval_delay: 3600000
  external_evaluators: BLEU-detok

Am I doing something wrong here?

Scoring is meant to get the negative log likelihood of an existing translation; see the documentation.

It looks like you just want to run inference on data/toy-ende/src-test.txt, which will produce the file data/toy-ende/predictions.txt, and then use any scoring script on that file.
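To illustrate why a scoring command prints one number per line rather than a single BLEU value, here is a minimal sketch of a sentence-level negative log likelihood. The token probabilities are made up for illustration, and the function name sentence_nll is hypothetical; this is not the OpenNMT-tf implementation, and the exact output format of the score command is not reproduced here.

```python
import math


def sentence_nll(token_probs):
    """Negative log likelihood of one sentence.

    token_probs holds the probability the model assigned to each target
    token of an existing translation (illustrative values, not model output).
    """
    return -sum(math.log(p) for p in token_probs)


# One value per sentence: each line of the predictions file gets its own score.
example_probs = [0.5, 0.25, 0.25]  # hypothetical per-token probabilities
print(sentence_nll(example_probs))  # about 3.47, i.e. log(32)
```

A lower value means the model finds the translation more likely. It measures model confidence in a given translation, not overlap with a reference, which is why it cannot stand in for BLEU.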

Do we need to write the scoring script ourselves to compute the BLEU score? Or is there any preexisting code available in OpenNMT-tf?

I simply want to find the BLEU score given the true translation file and the file containing the predicted translations. Which code snippet do I need to use for that?

Run:

onmt-main infer --config config/opennmt-defaults.yml config/data/toy-ende.yml --features_file data/toy-ende/src-test.txt --predictions_file data/toy-ende/predictions.txt

And then run any BLEU script you want on data/toy-ende/predictions.txt, e.g. multi-bleu.perl.
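If you would rather stay in Python, a corpus-level BLEU can be sketched in a few lines of standard-library code. This is a simplified illustration, not multi-bleu.perl or any OpenNMT-tf code: it assumes one reference per prediction, whitespace tokenization, n-grams up to order 4, and no smoothing. The function names and the reference path are hypothetical (the thread does not name the reference file). BLEU depends heavily on tokenization, so the numbers may differ from those of established tools.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_order=4):
    """Corpus-level BLEU on a 0-100 scale, one reference line per hypothesis line."""
    matches = [0] * max_order
    totals = [0] * max_order
    ref_len = hyp_len = 0
    for ref_line, hyp_line in zip(references, hypotheses):
        ref, hyp = ref_line.split(), hyp_line.split()
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_order + 1):
            ref_counts, hyp_counts = ngrams(ref, n), ngrams(hyp, n)
            # Clipped matches: an n-gram is credited at most as many
            # times as it occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, accumulated over the whole corpus.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_order
    # Brevity penalty for output that is shorter than the reference overall.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


def bleu_from_files(reference_path, predictions_path):
    """Read two line-aligned text files and return a single corpus-level score."""
    with open(reference_path, encoding="utf-8") as ref_file:
        references = ref_file.read().splitlines()
    with open(predictions_path, encoding="utf-8") as hyp_file:
        hypotheses = hyp_file.read().splitlines()
    return corpus_bleu(references, hypotheses)
```

Usage would look like bleu_from_files("path/to/reference.txt", "data/toy-ende/predictions.txt"), where the first argument is a placeholder for your true translation file. The counts are accumulated over all lines before the precisions are computed, which is what yields one score for the entire file instead of one per line.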


Thanks a lot for your support.
