OpenNMT Forum

Japanese training

(liluhao1982) #21

If so, it would be better to add an option to disable tokenization when using the REST server, rather than removing automatic tokenization/detokenization altogether. It is a great feature, and most of the time it works well. The exceptions are languages that need a special morphological analyzer, e.g. Japanese and Chinese. I needed automatic tokenization/detokenization for the REST server myself previously.

(Tnkmsh) #22

Thanks for the useful information. I’m also working on translation involving Asian languages and I’d like to apply this tokenizer to ONMT, but I have no idea how to change the options. Do I need to change the source code?

(Mahendra) #23

Hi Jean,

We need to build models for the Japanese-to-English language pair and vice versa. Would you be able to guide us on the approach, or are there any ready-to-use paid models available for this?
Please let me know.