I’ve tried OpenNMT-py and so far I’m delighted with the results, but I believe I can get even better results if I improve my input data a little.
I hope I’m posting this in the right place… sorry if I’m not. It just seemed like the best fit.
Here are the 2 questions that I’m not sure about:
Does it make a difference if my sentences are not always “complete”? Most of them are relatively long, and the extremely long ones have usually been split into 2 or 3 segments.
I have some older data that is good, but not as good as the newer data… What would be the best approach to use it while still giving priority to the newest data? Would providing the sentences in a certain order be sufficient?
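
To make the second question more concrete: one alternative to ordering that I can imagine is simply repeating (oversampling) the newer sentence pairs before training. Below is a minimal sketch in plain Python, assuming each corpus is already loaded as a list of (source, target) pairs. The function name `mix_corpora` and the `new_weight` parameter are hypothetical names of my own, not part of OpenNMT-py, and I don't know whether this is the recommended approach.

```python
import random


def mix_corpora(new_pairs, old_pairs, new_weight=3, seed=0):
    """Build one shuffled training list in which every new pair
    appears `new_weight` times and every old pair appears once.

    `new_weight` is an illustrative knob (an assumption), not a tuned value.
    """
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    mixed = list(new_pairs) * new_weight + list(old_pairs)
    rng.shuffle(mixed)  # shuffle so old and new data are interleaved
    return mixed


if __name__ == "__main__":
    # Toy data standing in for real parallel corpora.
    new_data = [("new source 1", "new target 1"), ("new source 2", "new target 2")]
    old_data = [("old source 1", "old target 1")]
    for src, tgt in mix_corpora(new_data, old_data, new_weight=3):
        print(src, "|||", tgt)
```

With `new_weight=3`, the model would see each newer pair three times as often as each older pair in an epoch, without the older data being dropped entirely.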