Build a Glossary


Has anyone found a way to generate a glossary with OpenNMT?

My goal is to generate a glossary, fix it manually, and then use it for training. I'm open to suggestions if there are better ways to do this!

Did you try fast_align? It's on GitHub as clab/fast_align, a simple, fast unsupervised word aligner.
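
In case it helps, here is a minimal sketch of how word alignments could be turned into glossary candidates for manual review. It assumes the parallel text is in fast_align's input format (one `source ||| target` pair per line) and that the alignments are in the `i-j` index-pair format fast_align writes, one line per sentence pair. The function names, the `min_count` threshold, and the toy data are my own assumptions for illustration, not part of fast_align or OpenNMT.

```python
from collections import Counter, defaultdict


def count_aligned_pairs(parallel_lines, alignment_lines):
    """Count how often each source word is aligned to each target word."""
    counts = defaultdict(Counter)
    for text, links in zip(parallel_lines, alignment_lines):
        if " ||| " not in text:
            continue  # skip malformed or empty lines
        src, tgt = text.split(" ||| ", 1)
        src_tokens, tgt_tokens = src.split(), tgt.split()
        for link in links.split():
            i, j = (int(x) for x in link.split("-"))
            # Guard against indices that do not match the tokenization
            if i < len(src_tokens) and j < len(tgt_tokens):
                counts[src_tokens[i]][tgt_tokens[j]] += 1
    return counts


def build_glossary(counts, min_count=2):
    """Keep the most frequent target per source word, if seen often enough."""
    glossary = {}
    for src_word, targets in counts.items():
        tgt_word, n = targets.most_common(1)[0]
        if n >= min_count:  # hypothetical threshold, tune on your own data
            glossary[src_word] = tgt_word
    return glossary


# Toy data standing in for your corpus and the aligner output
parallel = [
    "das haus ||| the house",
    "das buch ||| the book",
    "ein haus ||| a house",
]
alignments = ["0-0 1-1", "0-0 1-1", "0-0 1-1"]

glossary = build_glossary(count_aligned_pairs(parallel, alignments))
print(glossary)
```

On real data you would read both files line by line instead of using the toy lists, and the resulting candidates would still need the manual cleanup you mentioned, since frequent function words and misalignments will show up too.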


I have not, but I will give it a shot :slight_smile: thanks!

I have a Wiktionary scraping script for dictionary data.

Thank you, but I need to build a glossary or custom dictionary based on my own data :stuck_out_tongue:

But I'll keep your script in mind. It could be handy in some situations!
