I am getting acquainted with the toy-ende quickstart guide and have a couple of quick questions about model scoring and inference. I hope these questions might also clarify things for others.
-
I have set up BLEU scoring by changing the yml file to include:
eval:
  eval_delay: 300
  save_eval_predictions: True
  external_evaluators: BLEU
It seems that BLEU scores are only calculated after a training run concludes. Is that the intended behaviour?
In a quick training run lasting 5000 steps (18.5 minutes on 4 GPUs), why were BLEU scores calculated only at 2 points (11 minutes in and at the end, at 18.5 minutes) rather than every 300 seconds as I thought I specified with eval_delay: 300?
Which checkpoint is chosen for each evaluation? Say I ask for BLEU scores every 5 minutes, but checkpoints are saved every ~20 minutes, what happens then?
-
Is it possible to have BLEU scores calculated during training, for both the eval and test sets, so I can monitor them in TensorBoard as training proceeds?
-
The BLEU scores are extremely low (~0.2) even though the MT output looks legible (probably related to the point below). Any thoughts on why this might be the case?
-
I ran inference with a trained model using the command:
onmt-main infer --auto_config --config config.yml --features_file src-test.txt > predictions.txt
However, the predictions don’t seem at all related to the sentences in src-test.txt?
head -5 src-test.txt
Orlando Bloom and Miranda Kerr still love each other
Actors Orlando Bloom and Model Miranda Kerr want to go their separate ways .
However , in an interview , Bloom has said that he and Kerr still love each other .
Miranda Kerr and Orlando Bloom are parents to two-year-old Flynn .
Actor Orlando Bloom announced his separation from his wife , supermodel Miranda Kerr .
head -5 predictions.txt
32 ist für Windows Versionen für Wien .
Kroatien Sie bei Bratislava / Pressburg Flughafen / 56 km .
Es hat jedoch jedoch um eine Tag , um eine in der EU in diesem Union .
Sie vom Installation / 30 teil .
Zusammen für drei Hügel aus der eigenen Entschärfung , , private Seen sind eine Rechts .
head -5 tgt-test.txt
Orlando Bloom und Miranda Kerr lieben sich noch immer
Schauspieler Orlando Bloom und Model Miranda Kerr wollen künftig getrennte Wege gehen .
In einem Interview sagte Bloom jedoch , dass er und Kerr sich noch immer lieben .
Miranda Kerr und Orlando Bloom sind Eltern des zweijährigen Flynn .
Schauspieler Orlando Bloom hat sich zur Trennung von seiner Frau , Topmodel Miranda Kerr , geäußert .
Are these predictions jumbled up somehow? The sentences do not seem aligned. Is this why the BLEU scores are so low? How can I connect/unscramble the predictions to the true labels in tgt-test so I can evaluate MT quality?
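For reference, this is roughly the sanity check I have in mind. It is only a sketch, and it assumes that line i of predictions.txt is meant to correspond to line i of tgt-test.txt. The helper names token_overlap and compare_files are my own, not part of OpenNMT-tf, and the score is a crude clipped token overlap rather than BLEU.

```python
from collections import Counter


def token_overlap(hypothesis: str, reference: str) -> float:
    """Fraction of hypothesis tokens that also appear in the reference.

    Counts are clipped, so a token repeated in the hypothesis is only
    credited as many times as it occurs in the reference.
    """
    hyp_counts = Counter(hypothesis.split())
    ref_counts = Counter(reference.split())
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[token]) for token, count in hyp_counts.items())
    return matched / total


def compare_files(pred_path: str, ref_path: str, limit: int = 5):
    """Pair predictions and references line by line (assumed alignment).

    Returns a list of (line_number, overlap) for the first `limit` lines.
    """
    with open(pred_path, encoding="utf-8") as f:
        preds = [line.rstrip("\n") for line in f]
    with open(ref_path, encoding="utf-8") as f:
        refs = [line.rstrip("\n") for line in f]
    if len(preds) != len(refs):
        raise ValueError(
            f"line count mismatch: {len(preds)} predictions vs {len(refs)} references"
        )
    pairs = list(zip(preds, refs))[:limit]
    return [(i + 1, token_overlap(p, r)) for i, (p, r) in enumerate(pairs)]


# Hypothetical usage with the files from this post:
# for line_number, overlap in compare_files("predictions.txt", "tgt-test.txt"):
#     print(line_number, round(overlap, 2))
```

If the line counts match but the overlap is near zero on every line, as it appears to be for the five lines above, that would suggest the output itself is poor rather than merely shuffled.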
Thanks a ton in advance!
Nat