I’m a web developer looking for a starting point for learning about OpenNMT.
There is a lot of jargon on the website and in the docs, and having access to a more structured or perhaps beginner-friendly resource would be really helpful.
Searching Google I could not find:
- More in-depth YouTube videos
I did find https://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf; however, this presents how the actual neural network works.
I’m more interested in getting it running on my own hardware, with an explanation of the vocabulary and of the kind of performance I can get out of it (for example, benchmarks talk in tokens/second: are those characters or something else?).
For a practical tutorial to understand the process, I would suggest @ymoslem’s OpenNMT-py tutorial. It gives a good explanation of all the necessary steps in creating an NMT model, and provides you with all the tools/scripts to do so.
Most modern NMT uses subword tokenizers like SentencePiece. These take a huge amount of raw text and learn a token vocabulary from it without supervision (this is the tokenizer). Using SentencePiece’s data compression algorithms (similar to lossless compression), the tokenizer then breaks sentences up into tokens (sub-word units), which the model learns to piece together and translate during training. The paper which introduced this provides a more in-depth explanation.
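To make "sub-word units" a bit more concrete, here is a toy sketch in Python. It is not SentencePiece's actual algorithm: it swaps in a simple greedy longest-match split against a tiny hand-written vocabulary (`TOY_VOCAB`, `segment` and `tokenize` are made-up names for illustration), whereas a real tokenizer learns its vocabulary from a large corpus. The only thing borrowed from SentencePiece is the `▁` character marking the start of a word.

```python
def segment(word, vocab):
    """Greedy longest-match split of one word into subword pieces."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until one is in the vocabulary.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces


def tokenize(sentence, vocab):
    """Split a sentence on whitespace, then split each word into subwords."""
    tokens = []
    for word in sentence.split():
        # Prefix each word with the word-start marker before segmenting.
        tokens.extend(segment("▁" + word, vocab))
    return tokens


# Hypothetical vocabulary, hand-written for this example only.
TOY_VOCAB = {"▁trans", "lat", "ion", "▁is", "▁fun"}

sentence = "translation is fun"
tokens = tokenize(sentence, TOY_VOCAB)
print(tokens)
print(len(sentence), "characters ->", len(tokens), "tokens")
```

The point is that a token is usually bigger than a character and often smaller than a word, so a token count sits somewhere between the character count and the word count of the text.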
Tokens/sec means how many tokens (sub-word units, not characters) the model can take in and output within a given amount of time. For example, at a speed of 4000 tok/s, a batch of 4000 tokens would take roughly 1 second to translate. That is only a ballpark, though, since to my understanding some tokens can take the model longer to generate a result for than others.
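As a rough back-of-the-envelope sketch of that arithmetic (the function name and the corpus size below are made up for illustration, and real throughput will vary with hardware, batch size and sentence length):

```python
def estimated_seconds(num_tokens, tokens_per_second):
    """Ballpark translation time: token count divided by throughput."""
    return num_tokens / tokens_per_second


# The example from above: a 4000-token batch at 4000 tok/s.
print(estimated_seconds(4000, 4000))  # 1.0

# Hypothetical corpus: 1,000,000 sentences averaging 25 tokens each.
corpus_tokens = 1_000_000 * 25
seconds = estimated_seconds(corpus_tokens, 4000)
print(f"{seconds:.0f} s, about {seconds / 60:.0f} minutes")
```

Because the unit is tokens, you need to run your text through the tokenizer first to get a token count before a benchmark figure tells you much about your own data.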
Hope this helps.
Thank you for this intro. I’ll look at the OpenNMT-py tutorial and start from there.