I am building a machine translator. My dataset has 90,000 sentence pairs. Should I use a Transformer to train the model on such a small dataset? Will it perform better than an LSTM?
Thank you.
If this is research work, I assume you can try both. Also consider fine-tuning a pretrained multilingual model, especially if your language pair is related to some high-resource languages.
Also check some posts on the forum about low-resource languages, including this one.
Kind regards,
Yasmin
Thank you so much for your help.
I would also recommend fine-tuning from a model trained on a larger, related language if possible. I trained an Azerbaijani model by fine-tuning from Turkish with good results.
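As a rough sketch, this kind of transfer could be expressed as an OpenNMT-py training config like the one below. All file paths, vocab files, and the parent checkpoint name are illustrative, not from this thread, and `update_vocab` assumes a reasonably recent OpenNMT-py version:

```yaml
# Fine-tune an Azerbaijani->English child model from a Turkish->English
# parent checkpoint (all paths and names here are hypothetical).

# Child (Azerbaijani-English) training data
data:
    corpus_1:
        path_src: data/az-en/train.az
        path_tgt: data/az-en/train.en
    valid:
        path_src: data/az-en/valid.az
        path_tgt: data/az-en/valid.en

# Vocabulary built on the child data
src_vocab: data/az-en/vocab.src
tgt_vocab: data/az-en/vocab.tgt

# Start from the Turkish parent model
train_from: models/tr-en_parent_step_100000.pt
update_vocab: true   # keep embeddings of tokens shared with the parent vocab
reset_optim: all     # do not reuse the parent's optimizer state

save_model: models/az-en_finetuned
train_steps: 20000
valid_steps: 1000
```

Sharing a subword vocabulary (or at least overlapping subwords) between the parent and child languages tends to matter a lot here, since that is what lets the child reuse the parent's embeddings.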