Hi everyone,
I live in French Guiana, where a number of indigenous languages are spoken. I'd like to empower these cultures by helping them translate their languages. I made a first attempt with the Palikur language, following this basic tutorial: https://hackernoon.com/neural-machine-translation-using-open-nmt-for-training-a-translation-model-1129a3a2a2d3 and running it on a Google Cloud machine.
I have some texts translated Palikur-French, Palikur-Portuguese, and also some Palikur-English. I have more than 12,000 lines. I can easily double that, but I guess it's still not enough. Do you know how many lines I would need for a BLEU score of 40-50? (At the moment I get 3.5 for Palikur-French.) I'm asking so I can estimate how much money I may need to raise to collect sentences.
Because the tutorial was bilingual, I used Google Translate on my Portuguese sentences (that's bad, I know), but I guess I could instead use a model that is already trained for French-Portuguese-English and try to add Palikur to it?
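To make the multilingual idea more concrete, here is a minimal sketch of what I have in mind for the data side. It assumes the common multilingual NMT trick of prepending a target-language token to each source sentence, so that a single model can be trained on all three pairs instead of one pair at a time. The token format (`<2fr>`, `<2pt>`, `<2en>`), the function names, and the example sentences are my own placeholders, not something taken from the tutorial or from OpenNMT.

```python
def tag_pairs(pairs, tgt_lang):
    """Prepend a target-language token to every source sentence.

    The "<2xx>" token format is an assumption made for illustration.
    """
    token = f"<2{tgt_lang}>"
    return [(f"{token} {src.strip()}", tgt.strip()) for src, tgt in pairs]


def build_corpus(corpora):
    """Merge several bilingual corpora into one tagged training set.

    `corpora` maps a target-language code to a list of (source, target) pairs.
    """
    merged = []
    for tgt_lang, pairs in corpora.items():
        merged.extend(tag_pairs(pairs, tgt_lang))
    return merged


# Placeholder strings only; these are not real Palikur sentences.
corpora = {
    "fr": [("palikur sentence 1", "phrase française 1")],
    "pt": [("palikur sentence 2", "frase portuguesa 2")],
    "en": [("palikur sentence 3", "english sentence 3")],
}

corpus = build_corpus(corpora)
for src, tgt in corpus:
    print(src, "|||", tgt)
```

The merged source and target sides could then be written to the two parallel text files that the tutorial expects. Whether this helps with so little data is exactly what I'm unsure about.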
Can you recommend some lectures, texts, or tutorials that would help me build a first prototype? It's a non-profit, open-source project. The main goal is to build bridges between lesser-known languages and well-known ones.