Does there exist any type of easy to follow quick start guide for using one of the docker images for basic translation setup?

magali-joine · September 4, 2019, 12:54am

Is there an easy-to-follow quick start guide for using one of the Docker images for a basic translation setup? Ideally a step-by-step walkthrough from start to finish, using a local file system with sample data and minimal steps.

Background: I have a sentence-aligned corpus in Cherokee and English (mostly biblical) that I was previously experimenting with in Moses, but as Moses is EOL, I have decided to look for an alternative. I am not a native speaker of Cherokee but a student, looking to test out various translation approaches with Cherokee, such as char2char or sub-word. What I lack is a step-by-step starting point for the Docker images for basic translation, like the one I was previously following for Moses.

Data: A combination of sources, much of it being the New Testament, plus some programmatically conjugated entries based on the regular rule structure of Cherokee. A good portion of the corpus is poorly aligned at the sentence level and is closer to paragraph-aligned. The corpus is spread out across multiple files. There is no test data or validation data. https://github.com/CherokeeLanguage/CherokeeEnglishCorpus
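Since the corpus has no validation or test data, one common way to get them is to hold out a random sample of aligned pairs before training. Below is a minimal sketch in Python, not tied to any particular toolkit. The function name `split_parallel`, the hold-out sizes, and the placeholder sentences are all assumptions made for illustration; with the real corpus, the two lists would be read from the aligned source and target files after concatenating them in the same order.

```python
import random


def split_parallel(src_lines, tgt_lines, valid_size, test_size, seed=1234):
    """Shuffle aligned sentence pairs and hold out validation and test sets.

    Returns three lists of (source, target) tuples: train, valid, test.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    pairs = list(zip(src_lines, tgt_lines))
    if valid_size + test_size >= len(pairs):
        raise ValueError("hold-out sizes leave no data for training")

    # A fixed seed makes the split reproducible between runs.
    rng = random.Random(seed)
    rng.shuffle(pairs)

    valid = pairs[:valid_size]
    test = pairs[valid_size:valid_size + test_size]
    train = pairs[valid_size + test_size:]
    return train, valid, test


if __name__ == "__main__":
    # Placeholder data standing in for the real aligned files.
    source = [f"source sentence {i}" for i in range(10)]
    target = [f"target sentence {i}" for i in range(10)]

    train, valid, test = split_parallel(source, target, valid_size=2, test_size=2)
    print(len(train), len(valid), len(test))
```

Because the pairs are zipped before shuffling, each source line stays attached to its translation. For paragraph-aligned portions of the corpus, the same approach works as long as each line on one side corresponds to exactly one line on the other.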

Goals: [1] Attempting translations from non-training text. [2] Build a vocabulary/dictionary phrase “list” for manual examination and review, especially useful when trying to determine and understand idiomatic expressions.

I do have docker installed and am running on Mint Mate.

The command “docker run nmtwizard/opennmt-tf -h” does complete successfully.
My attempts at Googling have been unfruitful. I assume I am not using the correct search terms, or Google is just being evil to me.
I will be using CPU and not GPU for the processing.

yaren · September 4, 2019, 2:33pm

In my experience, OpenNMT is very easy to learn.
You can follow the quickstart here (Python version):
http://opennmt.net/OpenNMT-py/quickstart.html

If you use only a CPU, training will be very, very slow.
So I suggest you buy a GPU; even a GTX 1060 will be good enough for learning NMT.
For more on choosing a GPU for NMT, see this blog post: https://timdettmers.com/2019/04/03/which-gpu-for-deep-learning/

magali-joine · September 5, 2019, 1:22am

So… how do I follow those instructions with a docker image?

All I see in reference to docker is:

Alternatively you can use Docker to install with nvidia-docker. The main Dockerfile is included in the root directory.

guillaumekln · September 10, 2019, 10:16am

You probably want to search for Docker tutorials online.