As a disclaimer, I am relatively new to ML, NLP and OpenNMT, but I of course have a strong desire to learn. I’ve learned about LSTMs, and now want to peer into OpenNMT’s implementation of them.
I am interested in understanding, from an implementation perspective, how the underlying source uses the different models during training. In particular, I am interested in the specifics of the LSTM model being used (the documentation doesn't give much detail) and in how it is implemented and then invoked.
I have found the LSTM.lua file, but it seems this file only builds the units, stacks them into blocks, and builds layers out of those blocks. In other words, it appears to contain only the code for constructing the model, not for actually running training, inference, etc.
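For context on what a single "unit" computes, here is the standard LSTM cell step in plain NumPy. This is only a sketch of the textbook equations, not OpenNMT's actual code — LSTM.lua builds the equivalent gates as an nngraph module, and the exact gate ordering and weight layout there may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One LSTM time step (textbook formulation, for illustration only).

    x: input vector (d_in,); h_prev, c_prev: previous hidden/cell state (d,)
    W: (4d, d_in) input weights; U: (4d, d) recurrent weights; b: (4d,) bias
    """
    d = h_prev.shape[0]
    # Compute all four gate pre-activations in one matrix multiply,
    # as stacked implementations typically do.
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:d])          # input gate
    f = sigmoid(z[d:2*d])        # forget gate
    o = sigmoid(z[2*d:3*d])      # output gate
    g = np.tanh(z[3*d:4*d])      # candidate cell update
    c = f * c_prev + i * g       # new cell state
    h = o * np.tanh(c)           # new hidden state
    return h, c
```

Stacking these cells over time steps (and over layers, feeding each layer's `h` to the next) is what I understand the "blocks" and "layers" in LSTM.lua to be doing.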
What I can't seem to find is the code where this model is actually used for training and/or prediction.
Please correct me if anything I've said is wrong, and any insight into how the code works would be great! Resources or tips for wrapping my head around the codebase in general would also be much appreciated. And of course, if this is a duplicate, feel free to redirect me!