OpenNMT Forum

What is the best approach to deal with numbers in sentences?

Hello,

I am wondering what the best approach is for dealing with numeric tokens in training sentences. That is, if a source-language sentence contains a numeric token (an integer or a float, such as “1” or “1.5”) and the target-language sentence contains the same numeric token, what would you suggest I do? Keep them as numeric tokens, delete them, substitute them with some “unk” token…
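
For illustration, the substitution option could look roughly like the sketch below. It is plain Python, not an OpenNMT feature: the `__NUM0__` placeholder format and the function names `mask_numbers` and `unmask_numbers` are hypothetical, and the regex assumes only plain integers and decimals (no signs, thousands separators, or exponents).

```python
import re

# Matches integers and simple decimals such as "1" or "1.5".
# Assumption: no signs, thousands separators, or exponents.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def mask_numbers(sentence):
    """Replace each number with an indexed placeholder.

    Returns the masked sentence and the list of original number strings,
    in order of appearance.
    """
    values = []

    def repl(match):
        values.append(match.group(0))
        return f"__NUM{len(values) - 1}__"

    return NUMBER.sub(repl, sentence), values


def unmask_numbers(sentence, values):
    """Put the original numbers back in place of the placeholders."""
    return re.sub(r"__NUM(\d+)__", lambda m: values[int(m.group(1))], sentence)


masked, values = mask_numbers("The price is 1.5 euros for 1 item.")
restored = unmask_numbers(masked, values)
```

The idea would be to mask both sides of the corpus before training and to restore the numbers in the translated output afterwards. Whether the model keeps the placeholders in the right order is something that would still need to be checked.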

Thanks in advance