Using torch.load() to get model output at inference time

What should I do if I want to use torch.load() to get the model output?
My input has the same format as src_train.txt (example: this is a book).
Should I convert the input sentences to token-embedding features first?
Can anyone help me? I don’t want to use the big
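In case it helps, here is a minimal sketch (assuming plain PyTorch, with a tiny `nn.Linear` standing in for your real model) showing what `torch.load()` actually gives you: the saved checkpoint object (usually a state dict of weight tensors), not a callable translator. To get output you still have to rebuild the model class, load the weights into it, and feed it numericalized input, so raw text like "this is a book" must be tokenized and converted to tensors first.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in model (hypothetical); your real checkpoint comes from training.
model = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(model.state_dict(), path)

# torch.load() returns the saved object: here a state dict of tensors,
# not a model you can call and not any output.
checkpoint = torch.load(path)
print(sorted(checkpoint.keys()))  # ['bias', 'weight']

# To produce output: rebuild the architecture, load the weights, then run
# a numericalized input tensor through it (raw text won't work directly).
restored = nn.Linear(4, 2)
restored.load_state_dict(checkpoint)
restored.eval()
with torch.no_grad():
    out = restored(torch.randn(1, 4))
print(out.shape)  # torch.Size([1, 2])
```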

Try CTranslate2. The forum has a lot of posts about it under this topic.
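Roughly, the workflow looks like this (a sketch, not runnable as-is — it assumes you have a trained OpenNMT-py checkpoint `model.pt`, and the directory names are placeholders):

```python
# First convert the OpenNMT-py checkpoint on the command line, e.g.:
#   ct2-opennmt-py-converter --model_path model.pt --output_dir ct2_model
import ctranslate2

# Load the converted model directory and translate pre-tokenized input.
translator = ctranslate2.Translator("ct2_model")
results = translator.translate_batch([["this", "is", "a", "book"]])
print(results[0].hypotheses[0])  # list of output tokens
```

Note the input must already be tokenized with the same tokenization you used for src_train.txt.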