Arabic machine translation

(Fodil B) #1

Hello sir,

I am building a translation model between Arabizi (an Arabic dialect written in Latin script) and Arabic. The problem is that I am getting a BLEU score of 0.0. I don't know whether OpenNMT supports the Arabic language or not. Could someone please help?

(Guillaume Klein) #2


OpenNMT works with any unicode languages. If you describe your data preparation and training process, people may be able to assist you.

(Fodil B) #3

Hello, I have a dataset of 3000 sentences and I am following the quickstart. Here is the configuration I am using.
For preprocessing:
th preprocess.lua -train_src data/src-train.tok -train_tgt data/tgt-train.tok -valid_src data/src-val.tok -valid_tgt data/tgt-val.tok -save_data data/demo

For training:
th train.lua -encoder_type brnn -global_attention dot -max_batch_size 64 -data data/demo-train.t7 -save_model model_trans/demo-model

(Guillaume Klein) #4

Sorry, 3000 sentences is usually not enough to train an NMT system.

If possible, you should try with a couple hundred thousand sentences to start getting good results.
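
As a quick sanity check before training, something like the following Python sketch can count the sentence pairs in the tokenized training files and confirm that source and target line up. It is only an illustration and not part of OpenNMT: the function name `count_parallel_pairs` and the `MIN_PAIRS` threshold are hypothetical, the threshold is just a rough reading of "a couple hundred thousand sentences", and the script assumes one sentence per line in UTF-8.

```python
from pathlib import Path

# Hypothetical threshold, a rough reading of "a couple hundred thousand sentences".
MIN_PAIRS = 200_000


def count_lines(path):
    """Count lines in a UTF-8 text file, assuming one sentence per line."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def count_parallel_pairs(src_path, tgt_path):
    """Return the number of sentence pairs, or raise if the files differ in length."""
    n_src = count_lines(src_path)
    n_tgt = count_lines(tgt_path)
    if n_src != n_tgt:
        raise ValueError(
            f"source has {n_src} lines but target has {n_tgt}; "
            "the files are not aligned line by line"
        )
    return n_src


if __name__ == "__main__":
    # Paths taken from the preprocessing command earlier in the thread.
    src = Path("data/src-train.tok")
    tgt = Path("data/tgt-train.tok")
    if src.exists() and tgt.exists():
        n = count_parallel_pairs(src, tgt)
        print(f"{n} sentence pairs")
        if n < MIN_PAIRS:
            print("This is probably too small to train an NMT model from scratch.")
```

A line-count mismatch usually means the source and target files are not aligned sentence by sentence, which is worth ruling out along with the small corpus size.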