I am trying to compute the BLEU score for the 100K dataset provided with the im2text model. I ran the im2text model with OpenNMT-py v1.0.0.rc1 and tried to get the BLEU score by passing flags to “onmt/bin/translate.py”, but that didn’t work. I raised this as an issue and got a suggestion to compute the BLEU score separately.
I have also run the same experiment on the 100K dataset with the Lua model, which has a separate evaluation script for BLEU scores, i.e. “scripts/evaluation/evaluate_bleu.py”. The files produced while calculating the BLEU score are “.tmp.gold.txt” and “.tmp.pred.txt”.
The gold file is the reference file, while the pred file contains the predicted equations. My main question is: why does the pred file have so many blank lines?
For example, after preprocessing, the equations are filtered by token count (150) and image size, and according to the scripts the final file should have one equation per line. But looking at “.tmp.gold.txt”, the tokenized equations appear to be split across extra lines, while those extra lines are missing from “.tmp.pred.txt”. I apologize if this is confusing. Let me show an example:
In gold_screenshot.png, the equations follow the expected format, i.e. one equation per line, except that the very first equation is broken across three separate lines. In pred_screenshot, the same equation appears as the first line plus two blank lines. This is what I don’t understand: why are those blank lines there?
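One way to quantify the mismatch described above is to count lines, blank lines, and tokens in both files. The following is only a minimal sketch: the helper names `read_lines`, `summarize`, and `compare` are hypothetical and not part of OpenNMT or the evaluation scripts, and it assumes both files are whitespace-tokenized UTF-8 text.

```python
def read_lines(path):
    """Read a text file and return its lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def summarize(lines):
    """Count total lines, blank lines, and whitespace-separated tokens."""
    blank = sum(1 for line in lines if not line.strip())
    tokens = sum(len(line.split()) for line in lines)
    return {"lines": len(lines), "blank": blank, "tokens": tokens}


def compare(gold_lines, pred_lines):
    """Summarize both files and report the pred/gold token-length ratio."""
    gold = summarize(gold_lines)
    pred = summarize(pred_lines)
    # A ratio far below 1.0 means the predictions are much shorter overall.
    ratio = pred["tokens"] / gold["tokens"] if gold["tokens"] else float("nan")
    return {"gold": gold, "pred": pred, "ratio": ratio}
```

Calling `compare(read_lines(".tmp.gold.txt"), read_lines(".tmp.pred.txt"))` would then show whether the two files have the same number of lines and how many lines in each are blank.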
I am not sure whether this is the cause of the BLEU score I get for the 100K dataset:
BLEU = 1.29, 94.5/91.4/88.5/85.7 (BP=0.014, ratio=0.191, hyp_len=129634, ref_len=680211)
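To see where the low score comes from, the reported numbers can be recombined by hand. The sketch below assumes the standard BLEU definition (geometric mean of the four n-gram precisions multiplied by the brevity penalty exp(1 - ref_len/hyp_len) when the hypothesis is shorter than the reference); the function names are illustrative only.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty: 1 if the hypothesis is long enough."""
    if hyp_len == 0:
        return 0.0
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def bleu_from_parts(precisions, hyp_len, ref_len):
    """Combine n-gram precisions (in percent) with the brevity penalty."""
    # Geometric mean of the precisions, computed in log space.
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
    return brevity_penalty(hyp_len, ref_len) * geo_mean


# Values taken from the reported output line.
bp = brevity_penalty(129634, 680211)
score = bleu_from_parts([94.5, 91.4, 88.5, 85.7], 129634, 680211)
print(f"BP = {bp:.3f}, BLEU = {score:.2f}")
```

Under that assumption the n-gram precisions alone would give a score of about 90, so the drop to 1.29 comes almost entirely from the brevity penalty: the predictions contain only about 19% as many tokens as the references.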
Sorry for the long and confusing question, but any help would be appreciated. Thank you in advance!