Paid job: set up OpenNMT DE->EN + FR->EN


#1

Hello,

I am a non-IT, non-Linux-versed, non-programming freelance translator and would like to commission the setup of OpenNMT on my desktop computer (on which I would install a Linux dual boot for this purpose) primarily for the language combination DE->EN, but also for FR->EN. In addition, I would require setup of a Trados plugin. According to another forum post here, there is a Trados plugin available here:

I’m a complete ignoramus as to what is feasible or easy to implement and what is not, but my ideal setup would include, if possible: the ability for me to train and update different translation engines at will for different clients; subword tokenization, particularly for German; incorporation of phrase lists with technical glossaries; the ability to connect to the client engine of my choice via the Trados plugin; and (this last part is entirely optional, but would be a plus) the ability for my subcontractors to log in to the client engine of their choice via IP address, preferably with password protection.

From my reading, I have understood that there isn’t “an” OpenNMT setup, but rather it’s a question of experimenting with various variables in order to attain the best translation result. And I assume the best combination is also dependent upon the language combination and types of texts. I don’t expect to fine-tune OpenNMT to precisely fit my projects, but it would be nice to get in touch with someone who has experience successfully implementing the language combinations DE->EN and FR->EN.

I am vaguely aware of the technical requirements and, if I can get this project moving, I would look into acquiring a GeForce GTX 1070 GPU for this purpose.

It would be ideal to receive basically a straight-out-of-the-box package that I could just dump on my computer and operate with minimal technical skill…

If you’re interested, feel free to PM me with feedback and perhaps a quote / ballpark figure.

Thanks in advance!


(Bachstelze) #2

Heyho,

I don’t think OpenNMT is as generic as you want it to be. The best approach would be to already have a large, specific corpus for your language pair and domain. Every time you want to update the corpus, you have (as far as I know) to retrain the model weights from scratch, which is very time-consuming and not practical.
Given the complexity of this field, OpenNMT is a “ready to go” approach. If you want an approach that lets you experiment with various variables, have a look at https://github.com/Kyubyong/transformer and at the blog post with BLEU scores and training times: https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html
These heavy systems are proposed as strong baselines; if you want a very dynamic and user-specific system, I would suggest https://www.modernmt.eu/