Introducing a state-of-the-art open-source Japanese ⇋ English machine translation system built with OpenNMT.

If you are interested in the development, check out the research paper:

If you would like any of the data or development notebooks, just let me know. The source code is on my GitHub.

Many thanks to the OpenNMT community for all the help and especially Yasmin Moslem for providing great tutorials and tooling.


Good job. I have some questions:

  1. Why is the model so small? The model files on GitHub look smaller than 10 MB.
  2. How many bilingual sentences did you train on?
  3. Can this model return alignment information?

Thanks for sharing. Really good job.

Hi, thanks for checking the project out.

1.) Actually, those are the SentencePiece models. The CTranslate2 models are too large to put on GitHub (maybe I could use LFS, but I am not familiar with it).

2.) 15,847,871 parallel sentences.

3.) I don’t think the CTranslate2 models can do that. OpenNMT itself can, but I think inference would be a lot slower.

Congratulations, Matthew! Great work. :tada:

Recently, I started to add big files under “Releases” on the relevant GitHub repository.

Kind regards,

Thanks for the tip. That was easy to do.
