Japanese training

liluhao1982 · July 19, 2017, 8:04am

Thanks.
If so, it would be better to add an option to disable tokenization when using the REST server, rather than removing automatic tokenization/detokenization altogether. It is a great feature, and most of the time it works well. The exceptions are languages that need a special morphological analyzer, e.g. Japanese and Chinese. I needed automatic tokenization/detokenization for the REST server myself previously.
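
To make the idea concrete, here is a rough Python sketch of what such an option could look like. This is not the actual REST server code: `prepare_source`, `default_tokenize`, and the `tokenize` flag are hypothetical names, and the regex tokenizer is only a naive stand-in for the real automatic tokenization.

```python
import re


def default_tokenize(text):
    """Naive stand-in for automatic tokenization (hypothetical, not ONMT's).

    Splits runs of word characters from individual punctuation marks.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def prepare_source(text, tokenize=True):
    """Return the token list that would be passed to the translator.

    tokenize=True:  apply the automatic tokenizer (fine for most languages).
    tokenize=False: assume the caller already segmented the text with an
                    external morphological analyzer, tokens separated by
                    spaces, and only split on whitespace.
    """
    if tokenize:
        return default_tokenize(text)
    return text.split()


# English: automatic tokenization separates the punctuation.
english_tokens = prepare_source("Hello, world!")

# Japanese has no spaces, so the naive tokenizer keeps the sentence whole.
japanese_auto = prepare_source("私は学生です")

# Pre-segmented Japanese passes through untouched when the option is off.
japanese_pre = prepare_source("私 は 学生 です", tokenize=False)
```

With the option off, the client is responsible for both segmenting the input and joining the output tokens again, which is the part that automatic detokenization would otherwise handle.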

tnkmsh · November 2, 2017, 6:05am

Thanks for the good information. I’m also working on translation involving Asian languages, and I’d like to apply this tokenizer to ONMT, but I have no idea how to change the options. Do I need to change the source code?

Mahi · July 7, 2018, 6:19am

Hi Jean,

We need to build Japanese-to-English and English-to-Japanese models. Would you be able to guide us on the approach, or are there any ready-to-use paid models available for this?
Please let me know.

Thanks
Mahi