I want to check my model's BLEU score.
I read the tutorial on how to compute it, but I wonder how I can get a common reference set for scoring.
Is there a frequently used English reference set?
If not, how do you build good references efficiently?
Please let me know if you have any tips or advice on this.
The best test set depends on the kind of data you used to train your model.
For instance, if you trained on newswire data, the News Commentary test and dev sets
are a good choice for evaluating your model.
If you trained on biomedical data, there are some biomedical test/dev sets available as well.
Several different corpora exist, depending on the domain you want to work in.
Typically, you can find those train, dev and test sets in the WMT shared translation tasks
(here you can see the latest one: WMT 2017).
You can also find more corpora on the OPUS website.
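
Once you have picked a test set, scoring comes down to comparing your model's output with the reference translations line by line. Here is a minimal sketch of corpus-level BLEU to show the idea. It is only an illustration: whitespace tokenization, a single reference per sentence, and no smoothing are simplifying assumptions, and the function names `ngrams` and `corpus_bleu` are hypothetical, not taken from the tutorial or any toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis.

    Assumptions (for illustration only): whitespace tokenization, no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # n-grams in the hypotheses, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection clips each n-gram count by the reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    # Without smoothing, a zero precision at any order gives a BLEU of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyps = ["the cat sat on the mat"]
    refs = ["the cat sat on a mat"]
    print(f"BLEU = {corpus_bleu(hyps, refs):.2f}")
```

For numbers you want to compare with other systems, it is safer to use the scoring script from the tutorial on a public test set, since tokenization differences alone can change the score.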
Thank you @emartinezVic!
I will check these websites!