ONMT net structure?

Is there a document that precisely describes the whole structure of the ONMT net and each of its parts? In particular, right now I would like to learn more about the attention part of the net. How exactly is it built? Are there parameters for tuning its structure and behavior?

We do have some documentation in each of the modules; for instance, attention is documented here: http://opennmt.net/OpenNMT/code/modules/onmt+modules+GlobalAttention/
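For a rough idea of what a global attention layer computes, here is a minimal sketch, assuming plain dot-product scoring between the current decoder state and every encoder state. The names `softmax` and `global_attention` are made up for illustration, and this is not the OpenNMT code; the real module may add learned projections that are left out here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(query, memory):
    """Dot-product global attention (illustrative, not the OpenNMT API).

    query:  current decoder state, a list of d floats.
    memory: encoder states, a list of n vectors with d floats each.
    Returns (weights, context): n attention weights that sum to 1, and
    the weighted average of the encoder states.
    """
    # One score per source position: dot product with the query.
    scores = [sum(q * h for q, h in zip(query, state)) for state in memory]
    weights = softmax(scores)
    # Context vector: convex combination of the encoder states.
    dim = len(query)
    context = [
        sum(w * state[i] for w, state in zip(weights, memory))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three source positions, hidden size 2.
weights, context = global_attention(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
)
```

In this toy example the first and third source positions get the same, larger weight, because they score equally against the query.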

Maybe you would like something more high-level?

Perhaps a bit more explanation would be a good first step…
:grimacing:

Sure, I was thinking of writing a series of tutorials about the internal structure of these models over spring break. I may ask you for feedback.

This would be super interesting!

Oh, this would be so helpful for people wanting to implement new ideas! I’m currently struggling with the same thing.