ONMT net structure?

(Etienne Monneret) #1

Is there a document that precisely describes the whole structure of the ONMT net and each of its parts? In particular, right now I would like to learn more about the attention part of the net. How exactly is it built? Are there parameters to tune its structure and behavior?

(srush) #2

We do have some documentation in each of the modules; for instance, attention is here

Maybe you would like something more high-level?

(Etienne Monneret) #3

Perhaps a bit more explanation would be a first step…

(srush) #4

Sure, I was thinking of writing a series of tutorials about the internal structure of these models over spring break. I may ask you for feedback.

(Pltrdy) #5

This would be super interesting!

(Nikhil Verma) #6

Oh, this would be so helpful for people wanting to implement new ideas! I’m currently struggling with the same thing.