I used a SentencePiece model as the tokenizer. Encoding the source text file with the SPM encoder works fine:
spm_encode --model=eng.model --output_format=piece input.txt --output input_tok.txt --extra_options=bos:eos
However, when I run the translation command with the model, a few lines are translated as empty output. These are not one-word inputs or short phrases; they are complete sentences.
onmt_translate -model Eng_Fr_Model.pt -src input_tok.txt -output fr_output_tok.txt -replace_unk -verbose
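To narrow down which inputs are affected, here is a minimal sketch of how the empty lines could be located. It assumes the output file has exactly one line per source line; the helper names find_empty_translations and report_empty_translations are made up for illustration and are not part of OpenNMT or SentencePiece.

```python
from typing import List, Tuple


def find_empty_translations(
    src_lines: List[str], out_lines: List[str]
) -> List[Tuple[int, int, str]]:
    """Return (line_number, source_token_count, source_line) for every
    source line whose translation is empty or whitespace-only.

    Assumes both lists are line-aligned: out_lines[i] translates src_lines[i].
    """
    if len(src_lines) != len(out_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} source vs {len(out_lines)} output"
        )
    empty = []
    for line_no, (src, out) in enumerate(zip(src_lines, out_lines), start=1):
        if not out.strip():
            src = src.strip()
            # Token count is taken on whitespace, i.e. SentencePiece pieces.
            empty.append((line_no, len(src.split()), src))
    return empty


def read_lines(path: str) -> List[str]:
    """Read a UTF-8 text file into a list of lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def report_empty_translations(src_path: str, out_path: str) -> None:
    """Print each source line that produced an empty translation."""
    hits = find_empty_translations(read_lines(src_path), read_lines(out_path))
    for line_no, n_tokens, src in hits:
        print(f"line {line_no} ({n_tokens} tokens): {src}")
    print(f"{len(hits)} empty translation(s) found")
```

With the files above, this would be called as report_empty_translations("input_tok.txt", "fr_output_tok.txt"). Seeing the token counts next to the affected source lines should show whether the empty outputs share something in common, such as length.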
I am working on English-French machine translation with the Europarl dataset.
Any suggestions on how to resolve this?