EOFError: Source and reference streams have different lengths

panosk · February 21, 2020, 6:32pm

Hello,

Right after evaluation, I get the error in the title from sacrebleu.py. There are no empty lines in my validation files, so I’m stuck. Any hints will be appreciated.

guillaumekln · February 21, 2020, 8:52pm

Hi,

Can you check the number of lines in the generated prediction file (under the eval directory of the model dir)? How does it differ from the source validation file?
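A quick way to run that check is a small script like the one below. It is only a sketch: the function names are made up for illustration, and the two paths are whatever your source validation file and the latest prediction file under the eval directory happen to be.

```python
import sys
from pathlib import Path


def count_lines(path):
    """Return the number of lines in a file (a final line without a newline still counts)."""
    with Path(path).open("rb") as f:
        return sum(1 for _ in f)


def compare_line_counts(source_path, prediction_path):
    """Return (source_lines, prediction_lines, counts_match) for the two files."""
    src = count_lines(source_path)
    pred = count_lines(prediction_path)
    return src, pred, src == pred


# Usage (paths are placeholders): python check_lines.py <source_file> <prediction_file>
if __name__ == "__main__" and len(sys.argv) == 3:
    src, pred, same = compare_line_counts(sys.argv[1], sys.argv[2])
    print(f"source: {src} lines, predictions: {pred} lines, match: {same}")
```

The same comparison against the reference (labels) file is worth running as well, since the error message is about the source and reference streams.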

panosk · February 21, 2020, 10:03pm

The latest prediction file has the same number of lines as the source evaluation file. I have now removed a few suspicious lines with only 1 or 2 characters and am waiting for the next evaluation. Could the problem be that the source eval file has special tokens at the end of each line? For example:

source: ▁III ｟P｠｟L｠
target: ▁III

I’m waiting for the next checkpoint to see how the next evaluation attempt goes, but I’d rather isolate the problem. Is there a way to run only the evaluator and track it down?

Edit: Of course there is a way, sorry…

panosk · February 22, 2020, 8:00am

I should have been resting after a very hectic day instead of squeezing out my last brain cell, so I’m the one to blame… I had set eval_labels_file to the path of my train_labels_file… Sorry for the false alarm.
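For anyone who lands here with the same error, this kind of mix-up can be caught before training with a small check on the data configuration. The sketch below is an assumption-laden illustration: it takes the data section as a plain Python dict, the helper name is made up, and eval_features_file and the file names in the example are assumed here alongside the train_labels_file and eval_labels_file keys mentioned above.

```python
import os


def shared_train_eval_paths(data_config):
    """Return (train_key, eval_key) pairs whose values resolve to the same file path."""
    train = {k: v for k, v in data_config.items()
             if k.startswith("train_") and isinstance(v, str)}
    evals = {k: v for k, v in data_config.items()
             if k.startswith("eval_") and isinstance(v, str)}
    clashes = []
    for train_key, train_path in train.items():
        for eval_key, eval_path in evals.items():
            # Compare absolute paths so "./x" and "x" count as the same file.
            if os.path.abspath(train_path) == os.path.abspath(eval_path):
                clashes.append((train_key, eval_key))
    return clashes


# Hypothetical configuration reproducing the mistake described above.
example_config = {
    "train_features_file": "train.src",
    "train_labels_file": "train.tgt",
    "eval_features_file": "valid.src",
    "eval_labels_file": "train.tgt",  # should point to the validation targets
}

for train_key, eval_key in shared_train_eval_paths(example_config):
    print(f"{eval_key} points to the same file as {train_key}")
```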