OpenNMT Forum

Dataset for creating a German-English model

Hi,

Could someone share a link to a dataset that could be used to create a German-English model?

Thank you,
Kishor

Dear Kishor,

Most (public) bilingual datasets are available at: http://opus.nlpl.eu/

Otherwise, depending on the domain and purpose you need, you might find software localization datasets from: Launchpad, GitHub, Microsoft (Terminology & Translations), Apple, WordPress, Mozilla, LibreOffice, Apache.

Just make sure you read and comply with the license of each dataset.
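
In case it helps once you have downloaded something: parallel corpora are commonly distributed as two line-aligned plain-text files, with one sentence per line and the same line number in each language. Here is a minimal Python sketch, assuming that layout. The function name `pair_parallel_lines` and the sample sentences are made up for illustration and are not part of any toolkit.

```python
def pair_parallel_lines(src_lines, tgt_lines):
    """Pair line-aligned source/target sentences.

    Assumes line i of the source corresponds to line i of the target.
    Pairs where either side is empty after stripping are dropped.
    """
    src_lines = list(src_lines)
    tgt_lines = list(tgt_lines)
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")

    pairs = []
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two downloaded files.
    german = ["Guten Morgen.\n", "\n", "Vielen Dank.\n"]
    english = ["Good morning.\n", "(no source)\n", "Thank you very much.\n"]
    for de, en in pair_parallel_lines(german, english):
        print(de, "|||", en)
```

With real files you would pass the two open file objects instead of the lists. Whether you need extra filtering (length ratio, duplicates, language detection) depends on the corpus.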

Kind regards,
Yasmin

http://www.statmt.org/wmt19/translation-task.html