OpenNMT Forum

Non-source text should be kept as it is in the Target

Hi,

I am currently using the built-in OpenNMT model for English-to-German translation. But when the input contains
non-English characters, such as Japanese or German, the output gets garbled. How could I enable an option that keeps non-English characters as they are in the translated text?

Please find the output below:
INPUT ==> {‘text’: ‘00003 AUTHOR. (05.04.07) スズキ カズヒコ. LV049\n’},

OUTPUT ==> {‘Text’: '00003 AUTHOR (05.04.07) Streife Herr Gescheigene Geschößerscheinhöfe fallen zufällig.

Please let me know if there is any option that could be used to achieve this.

Thank You,
Kishor

Hi,

There is no such option in OpenNMT-py. This should probably be handled in a preprocessing step.

Hi,

How do we handle this in a preprocessing step?

Please share a link for this if there is one.

Thank You,
Kishor.

There is no resource in OpenNMT on this subject.

Maybe your application could verify the language of the input and, if it does not match the model, not send the request?
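
For what it's worth, here is a minimal sketch of what such a check could look like as a preprocessing step. It is not an OpenNMT feature: `is_foreign_script`, `translate_preserving_foreign`, and the `translate` callback are hypothetical names, and `translate` stands in for however your application actually calls the model. The idea is to split the input on whitespace, pass any token containing non-Latin letters (such as the Japanese name above) through untouched, and send only the remaining runs of text to the model. Note the assumption that "foreign" means "not Latin script": this cannot tell German from English, since both use Latin letters, so that case would need a real language-identification step instead.

```python
import re
import unicodedata
from typing import Callable, List


def is_foreign_script(ch: str) -> bool:
    """True for letters whose Unicode name is not Latin (e.g. kana, kanji)."""
    if not ch.isalpha():
        return False  # digits, punctuation and spaces are script-neutral
    return not unicodedata.name(ch, "").startswith("LATIN")


def translate_preserving_foreign(text: str, translate: Callable[[str], str]) -> str:
    """Translate Latin-script runs only; copy non-Latin tokens verbatim.

    `translate` is a placeholder for the application's own call to the model.
    """
    tokens = re.split(r"(\s+)", text)  # keep whitespace as separate tokens
    out: List[str] = []
    buffer: List[str] = []  # pending run of translatable tokens

    def flush() -> None:
        chunk = "".join(buffer)
        buffer.clear()
        core = chunk.strip()
        if not core:
            out.append(chunk)  # whitespace only (or empty): nothing to translate
            return
        # Preserve the surrounding whitespace exactly as it was in the input.
        lead = chunk[: len(chunk) - len(chunk.lstrip())]
        trail = chunk[len(chunk.rstrip()):]
        out.append(lead + translate(core) + trail)

    for tok in tokens:
        if any(is_foreign_script(c) for c in tok):
            flush()          # translate what came before
            out.append(tok)  # keep the non-Latin token as it is
        else:
            buffer.append(tok)
    flush()
    return "".join(out)


if __name__ == "__main__":
    # Stub translator that just marks which spans would be sent to the model.
    def mark(s: str) -> str:
        return "<" + s + ">"

    print(translate_preserving_foreign("00003 AUTHOR. (05.04.07) スズキ カズヒコ. LV049\n", mark))
```

The trade-off is that each Latin-script run is translated on its own, so the model loses context across the protected tokens. For short, record-like inputs such as the one above that is probably acceptable; for running prose it may hurt quality.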