I found a related question here, but OP decided to use multi-feature instead.
I just read the paper on guided alignment and I think it could benefit our set-up. However, I am not sure about the YAML file here. Should I just split `train_alignments` up into two? E.g.:

    train_alignments:
      - alignments_1.txt
      - alignments_2.txt

Three additional questions:
- from the paper, it would seem that the best results were obtained by using CELoss and a 2:1 weight. If I read the paper correctly, that means `2` for the decoder. However, I can’t find the decoder weight in the config file. I then assume that the decoder weight is always `1` and that I should use `guided_alignment_weight`. Is that correct?
- is null alignment allowed? In other words, can a significant amount of the data be unaligned (i.e. token indices that do not appear in the GIZA alignment string)?
- is it possible to only use guided alignment on one source?
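
To make the null-alignment question concrete, here is a rough sketch of how I would check which tokens are left unaligned. It assumes the alignments are in Pharaoh format (`i-j` pairs with 0-based source-target indices, one sentence pair per line), which may not match every GIZA export. The function name `unaligned_indices` is just something I made up for illustration; it is not part of any library.

```python
def unaligned_indices(alignment, src_len, tgt_len):
    """Return (unaligned source indices, unaligned target indices).

    Assumes one Pharaoh-format line, e.g. "0-0 1-2 3-3", with 0-based indices.
    """
    src_seen, tgt_seen = set(), set()
    for pair in alignment.split():
        src_idx, tgt_idx = pair.split("-")
        src_seen.add(int(src_idx))
        tgt_seen.add(int(tgt_idx))
    # Any index that never shows up in a pair counts as a "null alignment" here.
    src_missing = [i for i in range(src_len) if i not in src_seen]
    tgt_missing = [j for j in range(tgt_len) if j not in tgt_seen]
    return src_missing, tgt_missing


# Hypothetical 4-token source and 4-token target sentence.
src_missing, tgt_missing = unaligned_indices("0-0 1-2 3-3", src_len=4, tgt_len=4)
print(src_missing, tgt_missing)
```

Running this over the whole alignment file would show how much of the data is actually unaligned, which is the quantity I am unsure about.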