I am new to OpenNMT and am trying to train a model on English-Arabic data. I managed to create a model using pretrained GloVe embeddings via the ‘-emb_file_both’ option, but the accuracy was very bad, I guess because the embeddings file contained only English vocabulary. My questions are:
- How can I add pretrained Arabic embeddings? Do I have to combine them with the GloVe file and feed in a single file that contains both languages’ word embeddings?
- Do you keep track of model accuracies for different languages trained with OpenNMT by different users? It would be interesting for OpenNMT users to compare their results against baselines.
- Is there a way for an OpenNMT model to learn word embeddings during training instead of using pretrained word embeddings, which may cover fewer of the words in the corpus?
I would appreciate it if you could provide a tutorial on using different word embeddings in OpenNMT.
Thank you