Trying to add a new language (Romany) and translate to English or vice versa

Hi guys,

I'm trying to help some communities in my country that are an underrepresented minority and whose language is not well known.

I can get a lot of text translated from Romany to English and vice versa, and I hope to end up with tens of thousands of translated sentences.

I was looking at some of the quickstart tutorials and came across this Google Colab script:

What steps do I need to take to train a model to translate from English to Romany, for example? I understand it won't be accurate without a lot of data, but even some small progress can mean a lot.

I should mention that I am not a machine learning expert. I am a developer, but I have very limited knowledge of machine learning algorithms.

From what I understand, I need to put the English sentences into one text file and their Romany counterparts into another file. I will probably also need two validation files containing different translated sentences.
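As a rough sketch of that file layout (assuming plain UTF-8 text with one sentence per line, where line N of the English file matches line N of the Romany file), something like the following Python could split the sentence pairs into training and validation files. The function names, file names, and validation size are hypothetical, and the example pairs are placeholders, not real translations.

```python
import random
import tempfile
from pathlib import Path


def split_parallel(pairs, valid_size, seed=0):
    """Shuffle (english, romany) pairs and hold out valid_size of them."""
    if not 0 < valid_size < len(pairs):
        raise ValueError("valid_size must be between 1 and len(pairs) - 1")
    shuffled = list(pairs)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    return shuffled[valid_size:], shuffled[:valid_size]


def write_parallel(pairs, src_path, tgt_path):
    """Write one sentence per line; line N of both files is the same pair."""
    # Collapse internal whitespace so a stray newline cannot misalign the files.
    src_lines = [" ".join(src.split()) for src, _ in pairs]
    tgt_lines = [" ".join(tgt.split()) for _, tgt in pairs]
    Path(src_path).write_text("\n".join(src_lines) + "\n", encoding="utf-8")
    Path(tgt_path).write_text("\n".join(tgt_lines) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Placeholder data; real pairs would come from the translated sentences.
    pairs = [(f"English sentence {i}", f"<Romany sentence {i}>") for i in range(10)]
    train, valid = split_parallel(pairs, valid_size=2)

    out_dir = Path(tempfile.mkdtemp())
    write_parallel(train, out_dir / "train.en", out_dir / "train.rom")
    write_parallel(valid, out_dir / "valid.en", out_dir / "valid.rom")
    print(f"Wrote {len(train)} training and {len(valid)} validation pairs to {out_dir}")
```

The validation pairs are held out from training, so they give an honest check of how the model handles sentences it has not seen.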

Could I use this Google Colab:

and replace the files it uses? Basically, replacing the German ones with the Romany translations?

If someone could help me with a step-by-step guide, or would be willing to jump on a call to guide me a bit, I would really appreciate it.

Thank you so much,



I believe you would have more success asking specific questions as you come across them while figuring out how machine learning works, rather than asking someone to do a walkthrough with you. If a walkthrough is really what you want, you should consider taking an online course.

There is lots of documentation online. I would really recommend having a fairly basic understanding of machine learning concepts (if you don't have it already) before trying the Colab.

Best regards,