German -> English pre-trained model issues

For the pre-trained German -> English model, I get a lot of <unk> tokens in the translation output, even for sentences from the training data. I looked at the training pre-processing script,
which applies Moses tokenization and lowercasing followed by BPE encoding, but in my case the results get worse when I apply BPE. Also, even common German words (like Fußball) are copied unchanged into my English output.

  1. I am running python -c <code_file> < <input_file> > output_file. Should I also pass a vocab file as input?
  2. This model was trained using the code at SHA d4ab35a. Is there any reason it should misbehave at inference time with the latest code?
  3. Is a decoding step required on the English output, as was required with sentencepiece? I suppose sed -r 's/(@@ )|(@@ ?$)//g' should be used for decoding?

Thanks for your patience.

I’d be grateful for any help with how to preprocess the input data for the German -> English translation model (PyTorch).
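
For reference, this is the decoding step I have in mind for question 3, written as a minimal Python sketch rather than sed. It uses the same regular expression as the sed command above and assumes the model output marks subword boundaries with "@@ ". The function name remove_bpe is my own (hypothetical) and is not taken from the repository.

```python
import re

# Same pattern as the sed command: drop "@@ " between subword pieces,
# plus a trailing "@@" (optionally followed by a space) at end of line.
BPE_MARKER = re.compile(r"(@@ )|(@@ ?$)")


def remove_bpe(line: str) -> str:
    """Join BPE pieces back into whole words (hypothetical helper name)."""
    return BPE_MARKER.sub("", line)


# Illustrative input only, not real model output.
print(remove_bpe("i like foot@@ ball"))
```

If this is the right decoding step, I would apply it to every line of the English output before detokenizing.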