It would be great if there is some sort of a guide for inference best practices, especially for people like me, who have very little knowledge of NLP.
For example, I have noticed that when I tokenize a paragraph into sentences, then infer those as separate tokens, the translation seems better, and the error in translation become localized, so part of the paragraph is wrong, rather than passing a whole paragraph and get nonsense.
Another thing I’ve noticed. Simple changes in word tokenization can effect inferring easily. ("_He llo" vs “_He llo.”), it is a little strange why adding a period would effect translation.