Extracting word alignment from translation models

Hi! I am trying to extract word alignments from translation models. I have gone through this paper on extracting word alignments.
Is this implemented in CTranslate2, or are there any plans to implement it?
Can you also suggest some alternatives, if any?
Your help is greatly appreciated, thanks.

Hi,

Many training frameworks support learning word alignments using one or more attention heads. I know that OpenNMT-py, OpenNMT-tf, Fairseq, and Marian have such options. Search for “guided alignment” or “alignment layer”.

These options are supported in CTranslate2. After converting the model, you can enable the translation parameter return_attention=True, which returns the attention vectors that were optimized during training. The word alignments can then be inferred from these attention vectors.
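For reference, once you have an attention matrix (e.g. from result.attention when translating with return_attention=True on a model trained with guided alignment), a common way to infer alignments is to take the argmax over each target step. Here is a minimal sketch using dummy attention weights in place of real model output:

```python
# Infer word alignments from an attention matrix: one row per target token,
# one column per source token (dummy data standing in for a real
# result.attention entry from CTranslate2).

def attention_to_alignment(attention):
    """Align each target token to the source token with the highest attention weight."""
    return [
        (t, max(range(len(row)), key=row.__getitem__))
        for t, row in enumerate(attention)
    ]

# Dummy attention for 3 target tokens over 4 source tokens.
attention = [
    [0.70, 0.10, 0.10, 0.10],
    [0.05, 0.80, 0.10, 0.05],
    [0.10, 0.10, 0.20, 0.60],
]

alignment = attention_to_alignment(attention)  # [(0, 0), (1, 1), (2, 3)]
```

Note this hard argmax is only meaningful when the attention was trained to behave like an alignment (the "guided alignment" options mentioned above); raw attention heads from an ordinary model are often much noisier.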


Hello, I use score_batch from CTranslate2 to get word alignments, following this research paper: https://aclanthology.org/W18-6314.pdf. Here is an example:

results = translator.translate_batch(
    [input_tokens],
    return_scores=True,
)

output_tokens = results[0].hypotheses[0]
scores = translator.score_batch([input_tokens], [output_tokens])  # here is your word alignments

I need help extracting word alignment information from CTranslate2.
scores = translator.score_batch([input_tokens], [output_tokens])
How does this line generate word alignments? This is the output I get:
[ScoringResult(tokens=['▁Quel', '▁est', '▁le', '▁bénéfice', '▁prévu', '▁pour', '▁le', '▁troisième', '▁trimestre', '▁de', '▁2024', '▁?', '</s>'], log_probs=[-0.5299587249755859, -1.1223669052124023, -0.23002052307128906, -0.7606239318847656, -1.411423683166504, -0.7082910537719727, -0.10277175903320312, -1.1579875946044922, -0.10407352447509766, -0.10088253021240234, -0.041037559509277344, -0.0836639404296875, -0.09623146057128906])] (1, 12, 13)