I have some aligned passages, and I want to automatically extract sentences from them. Are there any tools that I could use?
I’m a little confused by the question. “Aligned passages” suggests the alignment is done; I’m going to assume this means parallel data that you’d like to align.
If you’re looking for a way to automatically align sentences in parallel corpora, hunalign has proven useful.
If you already have sentences aligned and are looking for word alignment, consider fast_align or mgiza++.
Yes, I’m looking for some tools to align sentences in parallel corpora. Sorry for my poor English and thank you for your help.
Some alignment tools:
Champollion Tool Kit: http://champollion.sourceforge.net/
Microsoft Bilingual Sentence Aligner: https://www.microsoft.com/en-us/download/details.aspx?id=52608
But I think it’s best to write an alignment script yourself.
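For what it's worth, a hand-written aligner could start from something like this. It is a minimal sketch, not how any of the tools named above work internally: a simplified length-based dynamic program, loosely in the spirit of Gale-Church. The names `align`, `length_cost`, `BEADS` and the `PENALTY` values are assumptions invented for illustration, and it assumes both texts are already split into sentences.

```python
from typing import List, Tuple

# Allowed bead shapes: (source sentences consumed, target sentences consumed).
BEADS = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]
# Hypothetical fixed penalties per bead shape; these would need tuning on real data.
PENALTY = {(1, 1): 0.0, (1, 0): 3.0, (0, 1): 3.0, (2, 1): 1.0, (1, 2): 1.0}

Bead = Tuple[List[str], List[str]]


def length_cost(src_len: int, tgt_len: int) -> float:
    """Relative difference in character length, between 0.0 and 1.0."""
    if src_len == 0 and tgt_len == 0:
        return 0.0
    return abs(src_len - tgt_len) / max(src_len, tgt_len)


def align(src: List[str], tgt: List[str]) -> List[Bead]:
    """Return the cheapest sequence of beads covering both sentence lists."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    # Forward pass: extend every reachable prefix pair by each bead shape.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + PENALTY[(di, dj)] + length_cost(s_len, t_len)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)

    # Backtrace from the end to recover the chosen beads.
    beads: List[Bead] = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        beads.append((src[i - di:i], tgt[j - dj:j]))
        i, j = i - di, j - dj
    beads.reverse()
    return beads
```

Calling `align(source_sentences, target_sentences)` gives a list of (source group, target group) pairs. Length alone is a weak signal, so on noisy data a dictionary-based tool like the ones above will probably do better.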
Thank you for your help and advice. I tried writing a script myself, but it doesn’t work well, which is why I’m here looking for another solution.