For one particular engine (Eng2Dutch) I've been investigating the relationship between translation quality and the length of the input sentence. For sentences with up to 30 words I've been getting perfectly constructed Dutch sentences which express the meaning of the input sentence. With sentences longer than 30 words I see grammatical or semantic wobbles. I have used out-of-the-box settings for these tests. Has anyone else experienced this?

Thanks,
Terence
You may need to try a deeper network. Try first with 3 layers of 500 (vs. the default 2 x 500), and then 2 layers of 800.
Let us know how it goes.
Did you also measure with a BLEU score? With both in-domain and external test sets?
Well, trying with 4 x 600 brought BLEU up from 29.30 to 32.95 on in-domain material. I have not systematically done anything with external test sets yet (just throwing stuff into my client GUI). Clearly a promising avenue of investigation.
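For anyone following along who hasn't computed BLEU before, here is a minimal pure-Python sketch of the standard corpus-level BLEU (uniform n-gram weights up to 4, with brevity penalty). This is only an illustration of the metric being discussed, not the particular toolkit used for the scores above; real evaluations should use an established implementation so scores are comparable.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU with uniform weights and brevity penalty.

    references, hypotheses: parallel lists of token lists
    (one reference per hypothesis, for simplicity).
    """
    clipped = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # total hypothesis n-grams, per order
    hyp_len = ref_len = 0
    for ref, hyp in zip(references, hypotheses):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = Counter(ngrams(hyp, n))
            ref_counts = Counter(ngrams(ref, n))
            # Clip each hypothesis n-gram count by its count in the reference.
            clipped[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += max(len(hyp) - n + 1, 0)
    if min(clipped) == 0:
        return 0.0  # some n-gram order has zero matches
    log_prec = sum(math.log(c / t) for c, t in zip(clipped, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_prec)
```

Scores are usually reported multiplied by 100, which is how figures like 29.30 and 32.95 arise.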
How many sentences are you measuring your BLEU on? What BLEU toolkit are you using?
Also, are you using the default 50k vocab size and 50 sequence length?
Bear in mind that with Dutch you may have to use BPE to avoid a huge OOV rate at inference time.
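For context, BPE (byte-pair encoding) learns subword units by repeatedly merging the most frequent adjacent symbol pair in the training vocabulary, so rare compounds (common in Dutch) decompose into known pieces instead of becoming OOVs. Here is a minimal sketch of the learning step, purely for illustration; in practice one would use a dedicated BPE tool rather than this toy code:

```python
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merge operations from a {word: count} dict.

    Each word starts as a sequence of characters plus an end-of-word marker.
    Returns the ordered list of merges and the final segmented vocabulary.
    """
    vocab = {tuple(w) + ('</w>',): c for w, c in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        # Rewrite every word, replacing occurrences of the best pair.
        new_vocab = {}
        for symbols, count in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = count
        vocab = new_vocab
    return merges, vocab
```

At inference time the learned merges are applied to new text in the same order, so an unseen word still segments into subwords the model has embeddings for.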
This was a small test on 2,398 sentences. I used defaults for vocab & sequence length. I am largely avoiding huge numbers of OOVs by using the -phrase_table option, as I have a 300K-entry dictionary from an old rule-based system.
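The idea behind that kind of unknown-word replacement is roughly: when the decoder emits an `<unk>` token, look at which source token it was most attending to, and substitute that token's dictionary translation (or copy the source token verbatim if it isn't in the dictionary). The sketch below is a simplified illustration of this mechanism, not OpenNMT's actual implementation; the function name and the precomputed alignments are assumptions for the example:

```python
def replace_unknowns(src_tokens, hyp_tokens, alignments, phrase_table):
    """Replace each '<unk>' in the hypothesis using an external dictionary.

    alignments: for each hypothesis position, the index of the source token
    it is most strongly aligned to (e.g. the attention argmax).
    phrase_table: {source_word: target_word} dictionary; unknown source
    words are copied through unchanged.
    """
    out = []
    for pos, tok in enumerate(hyp_tokens):
        if tok == '<unk>':
            src = src_tokens[alignments[pos]]
            out.append(phrase_table.get(src, src))
        else:
            out.append(tok)
    return out
```

Copy-through as a fallback is useful for names and numbers, which are often best left untranslated anyway.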
To follow up, I have now used the 4x600 model in a live production setting to translate a complex legal document comprising some very long sentences from Dutch into English. For most translated sentences little or no post-editing was required; in only one sentence did the engine lose its way. This particular configuration seems to work well for this (in-domain) material.